Lightning acc

Merged Gokhan OZBULAK requested to merge lightning-acc into main
1 unresolved thread

Closes #25 (closed)

Edited by Daniel CARRON

- "batch-size needs to be divisible by batch-chunk-count, otherwise an "
- "error will be raised. This parameter is used to reduce the number of "
- "samples loaded in each iteration, in order to reduce the memory usage "
- "in exchange for processing time (more iterations). This is especially "
- "interesting when one is training on GPUs with limited RAM. The "
- "default of 1 forces the whole batch to be processed at once. Otherwise "
- "the batch is broken into batch-chunk-count pieces, and gradients are "
- "accumulated to complete each batch.",
+ "loaded for every iteration will be batch-size*batch-chunk-count. "
+ "This parameter is used to reduce the number of samples loaded in each "
+ "iteration, in order to reduce the memory usage in exchange for "
+ "processing time (more iterations). This is especially interesting "
+ "when one is training on GPUs with limited RAM. The default of 1 forces "
+ "the whole batch to be processed at once. Otherwise the batch is "
+ "multiplied by batch-chunk-count pieces, and gradients are accumulated "
+ "to complete each batch.",
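For readers following the thread, here is a minimal sketch of the gradient accumulation that the help text describes. It is plain Python, not the project's code: the functions `mse_gradient` and `accumulated_gradient` are hypothetical names chosen for illustration, and the sketch assumes that every chunk holds exactly `batch_size` samples, so the effective batch is `batch_size * batch_chunk_count`.

```python
def mse_gradient(w, xs, ys):
    """Gradient of mean((w*x - y)**2) with respect to the scalar weight w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_gradient(w, xs, ys, batch_size, batch_chunk_count):
    """Accumulate gradients over batch_chunk_count loads of batch_size samples.

    Hypothetical helper: only one chunk of batch_size samples is "in memory"
    at a time, which is the memory-for-time trade-off discussed above.
    """
    total = 0.0
    for k in range(batch_chunk_count):
        start = k * batch_size
        chunk_x = xs[start:start + batch_size]
        chunk_y = ys[start:start + batch_size]
        # Scale each chunk gradient so the sum equals the full-batch mean.
        total += mse_gradient(w, chunk_x, chunk_y) / batch_chunk_count
    return total


# Illustrative values only (not taken from the merge request).
batch_size, batch_chunk_count = 2, 3
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
w = 0.5

full = mse_gradient(w, xs, ys)  # all 6 samples processed at once
accumulated = accumulated_gradient(w, xs, ys, batch_size, batch_chunk_count)
```

Under these assumptions, `full` and `accumulated` agree up to floating-point rounding, which is why accumulating over smaller loads can stand in for one large batch when GPU memory is limited.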
  • You can complement the documentation of the option with bits from the currently available docs:

    This parameter, used in conjunction with the batch-size, may be used to
    reduce the number of samples loaded in each iteration, to affect
    memory usage in exchange for processing time (more iterations). This is
    especially interesting when one is training on GPUs with a limited amount
    of onboard RAM.
    Edited by André Anjos
  • Also, please check the documentation wherever this option is passed around. In some places, like engine/trainer.py, the old documentation for the parameter is still in place, which may mislead an eager user.

  • I had grepped for 'batch-chunk-count' but apparently missed some related content, as there are also statements using 'batch_chunk_count'. I will check in more detail and update. Thanks for pointing this out.

  • Daniel CARRON changed the description

  • added 1 commit

  • André Anjos added 1 commit

    • 96b8c28e - [scripts.train] Fix description

  • merged

  • André Anjos mentioned in commit f3a00b6a
