medai / software / mednet
Commit d3d65b93, authored 1 year ago by André Anjos
[data.datamodule] Only reset datasets if model_transforms change
Parent: f0f7784b
1 merge request: !6 Making use of LightningDataModule and simplification of data loading
Pipeline #77168 passed 1 year ago (stages: qa, test, doc, dist)
Showing 1 changed file
src/ptbench/data/datamodule.py: 31 additions, 18 deletions
@@ -498,20 +498,6 @@ class ConcatDataModule(lightning.LightningDataModule):
     DatasetDictionary: typing.TypeAlias = dict[str, Dataset]
     """A dictionary of datasets mapping names to actual datasets."""
 
-    model_transforms: list[Transform] | None
-    """Transforms required to fit data into the model.
-
-    A list of transforms (torch modules) that will be applied after raw-
-    data-loading. and just before data is fed into the model or eventual
-    data-augmentation transformations for all data loaders produced by
-    this data module.  This part of the pipeline receives data as output
-    by the raw-data-loader, or model-related transforms (e.g. resize
-    adaptions), if any is specified.  If data is cached, it is cached
-    **after** model-transforms are applied, as that is a potential
-    memory saver (e.g., if it contains a resizing operation to smaller
-    images).
-    """
-
     def __init__(
         self,
         splits: ConcatDatabaseSplit,
@@ -535,7 +521,8 @@ class ConcatDataModule(lightning.LightningDataModule):
         self.cache_samples = cache_samples
         self._train_sampler = None
         self.balance_sampler_by_class = balance_sampler_by_class
-        self.model_transforms: list[Transform] | None = None
+
+        self._model_transforms: list[Transform] | None = None
 
         self.drop_incomplete_batch = drop_incomplete_batch
         self.parallel = parallel  # immutable, otherwise would need to call
@@ -602,8 +589,35 @@ class ConcatDataModule(lightning.LightningDataModule):
                 "multiprocessing_context"
             ] = multiprocessing.get_context("spawn")
 
+    @property
+    def model_transforms(self) -> list[Transform] | None:
+        """Transforms required to fit data into the model.
+
+        A list of transforms (torch modules) that will be applied after
+        raw- data-loading. and just before data is fed into the model or
+        eventual data-augmentation transformations for all data loaders
+        produced by this data module.  This part of the pipeline
+        receives data as output by the raw-data-loader, or model-related
+        transforms (e.g. resize adaptions), if any is specified.  If
+        data is cached, it is cached **after** model-transforms are
+        applied, as that is a potential memory saver (e.g., if it
+        contains a resizing operation to smaller images).
+        """
+        return self._model_transforms
+
+    @model_transforms.setter
+    def model_transforms(self, value: list[Transform] | None):
+        old_value = self._model_transforms
+        self._model_transforms = value
+
         # datasets that have been setup() for the current stage are reset
-        self._datasets = {}
+        if value != old_value and len(self._datasets):
+            logger.warning(
+                f"Reseting {len(self._datasets)} loaded datasets due "
+                "to changes in model-transform properties.  If you were caching "
+                "data loading, this will (eventually) trigger a reload."
+            )
+            self._datasets = {}
 
     @property
     def balance_sampler_by_class(self):
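The setter introduced by this change can be illustrated with a minimal, self-contained sketch. It is not the project's code: `DataModuleSketch`, `Resize` and the simplified `setup()` are hypothetical stand-ins, and the only part taken from the diff is the `value != old_value and len(self._datasets)` guard. The sketch assumes transforms compare by identity, which is Python's default for objects that do not define `__eq__` (to my knowledge `torch.nn.Module` does not override it).

```python
import logging

logger = logging.getLogger(__name__)


class Resize:
    """Hypothetical transform; has no __eq__, so it compares by identity."""

    def __init__(self, size: int):
        self.size = size


class DataModuleSketch:
    """Hypothetical, stripped-down stand-in for ConcatDataModule."""

    def __init__(self):
        self._model_transforms = None
        self._datasets = {}

    def setup(self, stage: str) -> None:
        # Pretend to load (and possibly cache) the datasets for one stage.
        self._datasets.setdefault(stage, f"<dataset for {stage}>")

    @property
    def model_transforms(self):
        return self._model_transforms

    @model_transforms.setter
    def model_transforms(self, value):
        old_value = self._model_transforms
        self._model_transforms = value
        # Same guard as in the diff: reset only if the value changed
        # and there is something loaded to throw away.
        if value != old_value and len(self._datasets):
            logger.warning("Resetting %d loaded datasets", len(self._datasets))
            self._datasets = {}


dm = DataModuleSketch()
transforms = [Resize(512)]
dm.model_transforms = transforms  # nothing loaded yet: no reset, no warning
dm.setup("fit")

dm.model_transforms = transforms  # same list object: datasets are kept
kept = len(dm._datasets)

dm.model_transforms = [Resize(512)]  # new objects, same settings: reset
after_equivalent = len(dm._datasets)
```

Under that identity assumption, re-assigning the same list keeps the loaded datasets, while assigning a freshly built list with equivalent settings still counts as a change and triggers the reset. Callers who want to keep cached data would therefore need to reuse the same transform objects.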
@@ -801,8 +815,7 @@ class ConcatDataModule(lightning.LightningDataModule):
         * ``test``: uses only the test dataset
         * ``predict``: uses only the test dataset
         """
-
-        pass
+        self._datasets = {}
 
     def train_dataloader(self) -> DataLoader:
         """Returns the train data loader."""