
Making use of LightningDataModule and simplification of data loading

Daniel CARRON requested to merge add-datamodule into main

This merge request introduces a LightningDataModule to better organize the code and take advantage of Lightning's features. Common tasks such as DataLoader creation and the application of transforms are centralized in a base class that dataset-specific data modules inherit from.
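The centralization idea can be sketched as follows. This is a minimal illustration, not the code from this merge request. It deliberately avoids importing Lightning so that it runs with the standard library only; in the real code the base class would presumably derive from LightningDataModule and the loader methods would return DataLoader objects. All names here (`BaseDataModule`, `ToyDataModule`, `splits`, `_make_loader`, `batch_size`) are hypothetical.

```python
from typing import Callable, Iterator, Sequence


class BaseDataModule:
    """Hypothetical base class: owns transform application and batching."""

    def __init__(self, transforms: Sequence[Callable] = (), batch_size: int = 2):
        self.transforms = list(transforms)
        self.batch_size = batch_size

    def splits(self) -> dict:
        """Subclasses return raw samples per split, e.g. {'train': [...]}."""
        raise NotImplementedError

    def _apply(self, sample):
        # Apply every transform in order, in one single place.
        for transform in self.transforms:
            sample = transform(sample)
        return sample

    def _make_loader(self, split: str) -> Iterator[list]:
        # Stand-in for DataLoader creation: yields transformed batches.
        samples = [self._apply(s) for s in self.splits()[split]]
        for start in range(0, len(samples), self.batch_size):
            yield samples[start:start + self.batch_size]

    # Method names mirror the LightningDataModule hooks.
    def train_dataloader(self):
        return self._make_loader("train")

    def val_dataloader(self):
        return self._make_loader("validation")


class ToyDataModule(BaseDataModule):
    """A dataset-specific subclass only has to describe its splits."""

    def splits(self):
        return {"train": [1, 2, 3], "validation": [4]}


dm = ToyDataModule(transforms=[lambda x: x * 10], batch_size=2)
train_batches = list(dm.train_dataloader())
val_batches = list(dm.val_dataloader())
```

With this layout, supporting a new dataset or protocol means writing a small subclass rather than repeating the loader and transform logic.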

Data loading was also simplified by removing the custom Sample classes and maker functions and by adding RuntimeDataset and CachedDataset.
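The description does not show how these two classes are implemented. The sketch below is an assumption based on their names only: a runtime dataset loads a sample every time it is indexed, while a cached dataset loads everything once and then serves it from memory. The class names carry a `Sketch` suffix to mark them as hypothetical, and `fake_loader` stands in for whatever actually reads a sample from disk.

```python
from typing import Any, Callable, Sequence


class RuntimeDatasetSketch:
    """Loads a sample each time it is indexed (low memory, repeated I/O)."""

    def __init__(self, entries: Sequence[Any], loader: Callable[[Any], Any]):
        self.entries = entries
        self.loader = loader

    def __len__(self) -> int:
        return len(self.entries)

    def __getitem__(self, index: int) -> Any:
        return self.loader(self.entries[index])


class CachedDatasetSketch(RuntimeDatasetSketch):
    """Loads every sample once at construction and serves it from memory."""

    def __init__(self, entries: Sequence[Any], loader: Callable[[Any], Any]):
        super().__init__(entries, loader)
        self._cache = [loader(entry) for entry in entries]

    def __getitem__(self, index: int) -> Any:
        return self._cache[index]


load_calls = []


def fake_loader(entry: str) -> str:
    # Hypothetical loader: records the call instead of reading an image file.
    load_calls.append(entry)
    return entry.upper()


entries = ["a.png", "b.png"]

runtime = RuntimeDatasetSketch(entries, fake_loader)
first_read = runtime[0]
runtime[0]
runtime[0]
runtime_calls = len(load_calls)  # one loader call per access

load_calls.clear()
cached = CachedDatasetSketch(entries, fake_loader)
cached_first = cached[0]
cached[0]
cached[0]
cached_calls = len(load_calls)  # one loader call per entry, at construction
```

Both classes expose `__len__` and `__getitem__`, which is the map-style interface that `torch.utils.data.DataLoader` expects from a dataset.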

Remaining tasks:

  • (@dcarron) Create a common/default DataModule for the Shenzhen dataset that takes protocols and transforms as parameters, to avoid copying code for each protocol configuration
  • (@dcarron) Add type hints to ShenzenDataModule
  • (@dcarron) Investigate why training a new model with the ElasticDeformation transform converges more slowly when the data is not cached
  • (@andre.anjos) Update documentation on ShenzenDataModule
  • (@biosignal) Update all datasets, using Shenzhen as a reference
    • (@mdelitroz) Montgomery
    • (@mdelitroz) Hivtb
    • (@andre.anjos) Indian
    • (@andre.anjos) Padchest
    • (@mdelitroz) Tbpoc
    • (@andre.anjos) tbx11_simplified -> renamed as tbx11k, protocol v1 (uses original dataset organisation)
    • (@andre.anjos) tbx11_simplified_v2 -> renamed as tbx11k, protocol v2 (uses original dataset organisation)
    • (@andre.anjos) mc_ch -> renamed as montgomery-shenzhen
    • (@andre.anjos) mc_ch_in -> renamed as montgomery-shenzhen-indian
    • (@andre.anjos) mc_ch_in_11k -> renamed as montgomery-shenzhen-indian-tbx11k-v1
    • (@andre.anjos) mc_ch_in_11k_v2 -> renamed as montgomery-shenzhen-indian-tbx11k-v2
    • (@andre.anjos) mc_ch_in_pc -> renamed as montgomery-shenzhen-indian-padchest
    • (@andre.anjos) nih_cxr14_re -> renamed as nih-cxr14 (n.b.: multi-class dataset, radiological findings)
    • (@andre.anjos) nih_cxr14_re_pc -> renamed as nih-cxr14-padchest
  • (@dcarron) Update models
  • (@dcarron) Update evaluation scripts
  • (@dcarron) Update support for extra_validation datasets
  • (@biosignal) Update full documentation
  • (@biosignal) Update unit tests

Addresses the following issues:

Edited by André Anjos
