Lightning acc
1 unresolved thread
Closes #25 (closed)
Edited by Daniel CARRON
Merge request reports
Activity
assigned to @gokhan.ozbulak
87 "batch-size needs to be divisible by batch-chunk-count, otherwise an " 88 "error will be raised. This parameter is used to reduce the number of " 89 "samples loaded in each iteration, in order to reduce the memory usage " 90 "in exchange for processing time (more iterations). This is especially " 91 "interesting when one is training on GPUs with limited RAM. The " 92 "default of 1 forces the whole batch to be processed at once. Otherwise " 93 "the batch is broken into batch-chunk-count pieces, and gradients are " 94 "accumulated to complete each batch.", 86 "loaded for every iteration will be batch-size*batch-chunk-count. " 87 "This parameter is used to reduce the number of samples loaded in each " 88 "iteration, in order to reduce the memory usage in exchange for " 89 "processing time (more iterations). This is especially interesting " 90 "when one is training on GPUs with limited RAM. The default of 1 forces " 91 "the whole batch to be processed at once. Otherwise the batch is " 92 "multiplied by batch-chunk-count pieces, and gradients are accumulated " 93 "to complete each batch.", @gokhan.ozbulak: I fear that keeping the old option name is going to be misleading as the meaning has substantially changed. I suggest we use the lightning nomenclature instead for the option name:
--accumulate-grad-batches
. This shall make it clear. We should also follow, as much as possible, the documentation for lightning here:Accumulates gradients over k batches before stepping the optimizer
.changed this line in version 2 of the diff
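For reference, a minimal sketch of what the renamed option maps to on the Lightning side; the import path assumes Lightning ≥ 2.0, and the commented-out model/datamodule names are placeholders, not this project's code:

```python
# Minimal sketch: how Lightning interprets accumulate_grad_batches.
# Import path assumes Lightning >= 2.0; adjust for pytorch_lightning otherwise.
import lightning.pytorch as pl

batch_size = 8
accumulate_grad_batches = 4  # value passed via --accumulate-grad-batches

trainer = pl.Trainer(
    max_epochs=1,
    # Gradients are accumulated over 4 consecutive batches of 8 samples each,
    # so the optimizer steps with an effective batch size of 8 * 4 = 32 while
    # only 8 samples need to be resident in memory at any time.
    accumulate_grad_batches=accumulate_grad_batches,
)
# trainer.fit(model, datamodule=datamodule)  # placeholders, for illustration only
```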
You can complement the documentation of the option with bits from the currently available docs:
This parameter, used in conjunction with the batch-size, may be used to reduce the number of samples loaded in each iteration, to affect memory usage in exchange for processing time (more iterations). This is especially interesting when one is training on GPUs with a limited amount of onboard RAM.
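A hedged sketch of how the renamed option could read once the Lightning wording and the doc bits above are merged, assuming a click-based CLI; the command name, short flags, and command body are illustrative, not the project's actual code:

```python
# Hypothetical sketch of the renamed option; assumes a click-based CLI.
import click


@click.command()
@click.option(
    "--batch-size",
    "-b",
    type=click.IntRange(min=1),
    default=1,
    show_default=True,
    help="Number of samples in every training batch.",
)
@click.option(
    "--accumulate-grad-batches",
    "-a",
    type=click.IntRange(min=1),
    default=1,
    show_default=True,
    help="Accumulates gradients over k batches before stepping the optimizer. "
    "This parameter, used in conjunction with the batch-size, may be used to "
    "reduce the number of samples loaded in each iteration, to affect memory "
    "usage in exchange for processing time (more iterations). This is "
    "especially interesting when one is training on GPUs with a limited "
    "amount of onboard RAM.",
)
def train(batch_size, accumulate_grad_batches):
    """Trains a model (body omitted; placeholder)."""
    click.echo(f"effective batch size: {batch_size * accumulate_grad_batches}")
```

Invoked as, e.g., train --batch-size=8 --accumulate-grad-batches=4, this would step the optimizer with an effective batch of 32 samples.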
Edited by André Anjos
added 1 commit
- 2e79aae0 - Change flag for batch accumulation. #25 (closed)
mentioned in commit f3a00b6a