Algorithms whose training requires splitting by class don't seem to work
When running a small baseline algorithm, such as lda, it seems that the required classes for the training samples are not forwarded to the training algorithm:
$ bob bio pipelines vanilla-biometrics -vv atnt lda
...
File ".../bob.bio.base/bob/bio/base/transformers/algorithm.py", line 62, in fit
training_data = split_X_by_y(X, y)
File ".../bob.bio.base/bob/bio/base/transformers/__init__.py", line 6, in split_X_by_y
for x1, y1 in zip(X, y):
TypeError: 'NoneType' object is not iterable
I have checked what is going on, and it seems that y=None in:
https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/1c3f542ee4d77592146ddc54aa8a51194a853745/bob/bio/base/transformers/__init__.py#L4
called by:
https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/1c3f542ee4d77592146ddc54aa8a51194a853745/bob/bio/base/transformers/algorithm.py#L61
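To illustrate the failure outside of the pipeline, here is a minimal sketch. The body of split_X_by_y below is only my assumption of what the helper does (grouping the samples by their class label). Only the zip(X, y) loop is taken from the traceback, and the sample values and labels are made up.

```python
from collections import defaultdict


def split_X_by_y(X, y):
    """Group samples by class label.

    Assumed stand-in for the helper in bob/bio/base/transformers/__init__.py;
    only the `zip(X, y)` loop is taken from the traceback.
    """
    by_class = defaultdict(list)
    for x1, y1 in zip(X, y):
        by_class[y1].append(x1)
    # One list of samples per class, ordered by label
    return [by_class[label] for label in sorted(by_class)]


# Made-up training samples, for illustration only
X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# With labels, the samples are grouped per class.
print(split_X_by_y(X, ["a", "b", "a"]))

# With y=None (what fit() appears to receive), zip() raises a TypeError.
try:
    split_X_by_y(X, None)
except TypeError as error:
    print("TypeError:", error)
```

If this sketch is close to the real helper, the helper itself is fine and the labels are lost somewhere before fit is called.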
Unfortunately, I cannot trace the issue back further since my experience in debugging dask is very limited.
Maybe we should allow running the pipeline without dask. As far as I understand, the dask pipeline is only a wrapper around the whole pipeline. Is it possible to skip the dask wrapper and run everything locally in a single thread? This would make debugging much easier.
Actually, I wanted to try out the above pipeline to debug my dask setup, which does not work.