Multiproc data loading

Merged: André Anjos requested to merge multiproc_data_loading into master

Activity

  • André Anjos added 2 commits

    • 0410e714 - [script] Use dashed options and constrain option much more
    • 96ae0825 - [test.test_cli] Do not invoke the same program twice on the test

  • André Anjos resolved all threads

  • @arenzom: everything should be fine now. Thank you. Set to auto-merge if build succeeds.

  • André Anjos enabled an automatic merge when the pipeline for 96ae0825 succeeds

  • Please note that the pipeline is now failing because of an unrelated issue. It needs to be fixed elsewhere in the system.

  • André Anjos aborted the automatic merge because source branch was updated

  • André Anjos added 1 commit

    • 6908db95 - [test.test_cli] Test mp with a single process (does not work on mac cis)

  • PyTorch with Python 3.7 fails when the multiprocessing option is used on macOS. My understanding is that this is a problem specific to this Python/PyTorch combination, which we can safely ignore for now.

  • It might be related to this Python issue. The default start method for multiprocessing has changed from fork to spawn, which may require some changes in the way resources are shared between parent and child processes.

    If memory serves, the default changed with Python 3.8, but something related might still be happening here.
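
To check the hypothesis above on a given machine, one can ask the interpreter which start method it uses by default. This is a minimal sketch using only the standard `multiprocessing` module; the helper name `report_start_methods` is made up for illustration and is not part of this package.

```python
import multiprocessing
import sys


def report_start_methods():
    """Return the platform, Python version, and multiprocessing start methods."""
    return {
        "platform": sys.platform,
        "python": "%d.%d" % sys.version_info[:2],
        # Start method used for child processes unless a context overrides it.
        "default": multiprocessing.get_start_method(),
        # Start methods supported on this platform.
        "available": multiprocessing.get_all_start_methods(),
    }


if __name__ == "__main__":
    print(report_start_methods())
```

Running this both locally and on the CI worker would show whether the two environments actually differ in their default start method.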

  • Thanks for this feedback, @samuel.gaist. I reset this locally, and I can't reproduce it on my machine.

  • More material: this SO thread indicates the problem is related to pytorch==1.7.0, which is the version we use. However, the thread also indicates the problem is reproducible on Python 3.6 and 3.8, whereas in our case it appears only with Python 3.7 on the CI.

    Edited by André Anjos
  • I think I'll just add a warning and then open an issue on this package so we revisit this when we upgrade pytorch.

  • Problem is reproducible if the whole test suite is executed. Just running a single test does not trigger the problem, corroborating what @samuel.gaist just shared.

  • The fix here seems to work around this problem.

  • André Anjos added 1 commit

    • 643ae030 - [script] Fix predict/train to use spawn context in case of multiprocess data loading

  • André Anjos added 1 commit

    • 9c933497 - [script] Enable mt fix only on darwin
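
The diffs of the two commits above are not shown in this thread, so the following is only a sketch of the idea their messages describe: request a spawn context for data-loading workers, and do so only on darwin. It assumes the context is handed to `torch.utils.data.DataLoader` through its `multiprocessing_context` argument; the helper name `loader_mp_kwargs` and its `platform` parameter are hypothetical and not taken from this package.

```python
import multiprocessing
import sys


def loader_mp_kwargs(num_workers, platform=sys.platform):
    """Build extra DataLoader keyword arguments (hypothetical helper).

    A spawn context is requested only when worker processes are used
    (num_workers > 0) and the platform is macOS ("darwin").
    """
    kwargs = {"num_workers": num_workers}
    # DataLoader rejects a multiprocessing context when num_workers == 0,
    # so the context is only added for multi-process loading.
    if num_workers > 0 and platform == "darwin":
        kwargs["multiprocessing_context"] = multiprocessing.get_context("spawn")
    return kwargs


if __name__ == "__main__":
    print(loader_mp_kwargs(2))
```

The result would then be passed along as `DataLoader(dataset, **loader_mp_kwargs(n))`. torch is deliberately not imported here so the sketch runs without it.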

  • OK, finally green.

  • merged
