Improve trainer function
Split the trainer's run function into separate steps so that individual parts of the code are easier to reuse for other types of training (for example, multi-task learning, where we will need to work with multiple losses). The function is now split into:
- check_gpu(device): Checks the device type and the availability of a GPU.
- save_model_summary(output_folder, model): Saves a summary of the model in a txt file.
- static_information_to_csv(static_logfile_name, device, n): Saves the static information in a csv file.
- check_exist_logfile(logfile_name, arguments): Checks whether the logfile (trainlog.csv) exists. If it exists and the epoch number is still 0, the logfile is replaced.
- create_logfile_fields(valid_loader, device): Creates the fields that will appear in the logfile.
- train_sample_process(samples, model, optimizer, losses, device, criterion): Processes the training inputs (images, ground truth, masks), applies backpropagation and updates the training losses.
- valid_sample_process(samples, model, valid_losses, device, criterion): Processes the validation inputs (images, ground truth, masks) and updates the validation losses.
- checkpointer_process(checkpointer, checkpoint_period, valid_losses, lowest_validation_loss, arguments, epoch, max_epoch): Runs the checkpointer, saves the final model and keeps track of the best model.
- write_log_info(epoch, current_time, eta_seconds, losses, valid_losses, optimizer, logwriter, logfile, device): Writes the log information to trainlog.csv.
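To illustrate how the bookkeeping steps above fit together, here is a minimal sketch using only the Python standard library. It is not the code from this merge request: the signatures are simplified stand-ins (for example, `check_exist_logfile` takes a hypothetical `start_epoch` instead of the `arguments` mapping), `track_best` and `run` are hypothetical helper names, and the model, optimizer and checkpointer are left out entirely.

```python
import csv
import os
import tempfile


def check_exist_logfile(logfile_name, start_epoch):
    """Remove a stale logfile when training starts from epoch 0.

    Assumption: the starting epoch is passed directly instead of being
    read from an ``arguments`` mapping as in the real function.
    """
    if start_epoch == 0 and os.path.exists(logfile_name):
        os.unlink(logfile_name)


def write_log_info(logfile_name, fields, row):
    """Append one row to the CSV log, writing the header for a new file."""
    is_new = not os.path.exists(logfile_name)
    with open(logfile_name, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def track_best(valid_loss, lowest_validation_loss):
    """Hypothetical helper: return (is_best, new_lowest_validation_loss)."""
    if valid_loss < lowest_validation_loss:
        return True, valid_loss
    return False, lowest_validation_loss


def run(logfile_name, valid_losses_per_epoch, start_epoch=0):
    """Toy driver: only the bookkeeping around the epoch loop, no model."""
    fields = ("epoch", "validation_loss", "is_best")
    check_exist_logfile(logfile_name, start_epoch)
    lowest = float("inf")
    for epoch, valid_loss in enumerate(
        valid_losses_per_epoch, start=start_epoch + 1
    ):
        is_best, lowest = track_best(valid_loss, lowest)
        write_log_info(
            logfile_name,
            fields,
            {"epoch": epoch, "validation_loss": valid_loss, "is_best": is_best},
        )
    return lowest
```

Because each step is its own function, a different training scheme (such as one with several losses) could swap the per-sample processing while keeping the logging and best-model tracking unchanged.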
Activity
@dkhalil: can you use the merge request description to describe which changes you are putting forward? "Improve trainer function" is a bit too short. Thanks!
assigned to @andre.anjos
unassigned @andre.anjos
assigned to @andre.anjos
mentioned in commit f8c0bb96
Thanks, @dkhalil - it would have sufficed that the MR description is non-empty and summarizes the contents. You don't have to repeat the comments from every individual function.
mentioned in issue #14 (closed)