bob / bob.bio.base / Issues / #167

Closed
Open
Created Nov 29, 2021 by Manuel Günther (@mguenther, Maintainer)

Algorithms with training that requires split by class don't seem to work

When running a small baseline algorithm such as lda, it seems that the class labels required for the training samples are not forwarded to the training algorithm:

$ bob bio pipelines vanilla-biometrics -vv atnt  lda

...

File ".../bob.bio.base/bob/bio/base/transformers/algorithm.py", line 62, in fit
    training_data = split_X_by_y(X, y)
  File ".../bob.bio.base/bob/bio/base/transformers/__init__.py", line 6, in split_X_by_y
    for x1, y1 in zip(X, y):
TypeError: 'NoneType' object is not iterable

I have checked what is going on, and it seems that y=None inside split_X_by_y: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/1c3f542ee4d77592146ddc54aa8a51194a853745/bob/bio/base/transformers/__init__.py#L4 which is called from fit: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/1c3f542ee4d77592146ddc54aa8a51194a853745/bob/bio/base/transformers/algorithm.py#L61
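To illustrate the failure mode, here is a minimal sketch of a split-by-label helper. It is not the actual bob.bio.base code: split_by_label is a hypothetical stand-in that only mirrors the zip(X, y) loop visible in the traceback, and the explicit check for y is None is my own addition to show how the error could be reported more clearly.

```python
from collections import defaultdict


def split_by_label(X, y):
    """Hypothetical stand-in for split_X_by_y: group samples by class label."""
    if y is None:
        # Assumption: fail early with an explicit message instead of
        # letting zip() raise a bare TypeError further down.
        raise ValueError("class labels `y` are required to split the training data")
    groups = defaultdict(list)
    for sample, label in zip(X, y):
        groups[label].append(sample)
    return list(groups.values())


# With labels, samples are grouped per class (toy data, not from the atnt database).
samples = ["a1", "a2", "b1"]
labels = ["a", "a", "b"]
grouped = split_by_label(samples, labels)

# Without the guard, passing None as labels fails the same way as in the traceback:
# zip() cannot iterate over None and raises a TypeError.
try:
    list(zip(samples, None))
except TypeError as err:
    zip_error = err
```

The guard does not fix the underlying problem (the labels still have to reach fit), but it would point directly at the missing y instead of at the loop.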

Unfortunately, I cannot trace the issue back any further, since my experience with debugging dask is very limited.

Maybe we should allow running the pipeline without dask. As far as I understand, the dask pipeline is only a wrapper around the whole pipeline. Is it possible to skip the dask wrapper and run everything locally in a single thread? This would make debugging much easier.

Actually, I wanted to try out the above pipeline to debug my dask setup, which does not work.
