Issue created Dec 22, 2020 by Laurent COLBOIS (@lcolbois, Maintainer)

[LGBPHS] wrong tempfiles path when running on the grid

Hi, I have an issue when running the LGBPHS baseline, e.g.

bob bio vanilla-biometrics pipeline mobio-male lgbphs -vv -l sge

where it fails with the following traceback:

Traceback (most recent call last):
  File "./bin/bob", line 47, in <module>
    sys.exit(bob.extension.scripts.main_cli())
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/script/vanilla_biometrics.py", line 215, in vanilla_biometrics
    **kwargs,
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/vanilla_biometrics.py", line 143, in execute_vanilla_biometrics
    _ = compute_scores(post_processed_scores, dask_client)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/vanilla_biometrics.py", line 23, in compute_scores
    result = result.compute(scheduler=dask_client)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/dask-2.30.0-py3.7.egg/dask/base.py", line 167, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/dask-2.30.0-py3.7.egg/dask/base.py", line 452, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/client.py", line 2725, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/client.py", line 1992, in gather
    asynchronous=asynchronous,
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/client.py", line 833, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/utils.py", line 340, in sync
    raise exc.with_traceback(tb)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/utils.py", line 324, in f
    result[0] = yield future
  File "/idiap/temp/lcolbois/miniconda3/envs/bob_tf2/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/distributed-2.30.1-py3.7.egg/distributed/client.py", line 1851, in _gather
    raise exception.with_traceback(traceback)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/pipelines.py", line 175, in write_scores
    return self.score_writer.write(scores)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/score_writers.py", line 56, in write
    return _write(probe_sampleset)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/score_writers.py", line 35, in _write
    if isinstance(probe[0], DelayedSample):
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.pipelines/bob/pipelines/sample.py", line 165, in __getitem__
    return self.samples.__getitem__(item)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.pipelines/bob/pipelines/sample.py", line 187, in samples
    return self._load()
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/src/bob.bio.base/bob/bio/base/pipelines/vanilla_biometrics/legacy.py", line 364, in _load
    return joblib.load("/tmp/" + path)
  File "/remote/idiap.svm/temp.biometric03/lcolbois/bob.bio.face/eggs/joblib-0.17.0-py3.7.egg/joblib/numpy_pickle.py", line 577, in load
    with open(filename, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp//tmp/tmp1dt8ulg0/scores/uman/m103/02_mobile/m103_02_f12_i0_0uman/m103/01_mobile/m103_01_p01_i0_0_uman/m104/01_mobile/m104_01_p01_i0_0_uman/m106/01_mobile/m106_01_p01_i0_0.joblib'

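It looks like the path of some temporary files is computed incorrectly: the path in the FileNotFoundError starts with a doubled /tmp//tmp/ prefix, and its file name appears to concatenate several sample keys. Note that the issue: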

  • Is specific to LGBPHS (Gabor graph for example works flawlessly)
  • Does not happen when running locally
  • Does not happen when using checkpointing (-c)
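
The doubled prefix can be reproduced in isolation. The sketch below mirrors the string concatenation shown in the last bob.bio.base frame of the traceback (joblib.load("/tmp/" + path)) and contrasts it with a join that leaves absolute paths alone. The file name example.joblib and the helper names prefixed and resolved are hypothetical, introduced only for illustration; this is not the actual code in legacy.py, and it does not establish why the stored path is already absolute.

```python
import posixpath

# Hypothetical stored path, modeled on the prefix seen in the FileNotFoundError.
stored = "/tmp/tmp1dt8ulg0/scores/example.joblib"


def prefixed(path):
    # Same kind of concatenation as in the traceback: "/tmp/" + path.
    return "/tmp/" + path


def resolved(path, base="/tmp"):
    # Hypothetical alternative: prepend the base only to relative paths.
    return path if posixpath.isabs(path) else posixpath.join(base, path)


print(prefixed(stored))                   # /tmp//tmp/tmp1dt8ulg0/scores/example.joblib
print(resolved(stored))                   # /tmp/tmp1dt8ulg0/scores/example.joblib
print(resolved("scores/example.joblib"))  # /tmp/scores/example.joblib
```

Since POSIX treats repeated slashes inside a path as a single separator, the concatenated form points to /tmp/tmp/tmp1dt8ulg0/..., which is a different directory from /tmp/tmp1dt8ulg0/....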

I am unsure how to start tracking down the root cause.
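
A possible starting point, sketched under an unverified assumption: if /tmp is local to each grid node, a file written by one worker would not be visible to another, which would be consistent with the failure appearing only with -l sge. The hypothetical helper probe_path below reports, for the host it runs on, whether a given path exists. The directory passed in is taken from the error message; it is an example value, not a path known to exist.

```python
import os
import socket


def probe_path(path):
    """Report which host ran the check and whether `path` exists there."""
    return {
        "host": socket.gethostname(),
        "path": path,
        "normalized": os.path.normpath(path),
        "exists": os.path.exists(path),
    }


# Local check of the temporary directory from the error message.
print(probe_path("/tmp//tmp/tmp1dt8ulg0/scores"))

# With a dask.distributed Client available (here called `client`, an assumed
# name), the same function could be executed on every worker:
# client.run(probe_path, "/tmp//tmp/tmp1dt8ulg0/scores")
```

Comparing the per-host results would show whether the file is missing everywhere or only on the workers that did not write it.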

ping @tiago.pereira @ydayer
