bob issues
https://gitlab.idiap.ch/groups/bob/-/issues
2022-02-21T18:23:19Z

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/34
Does not work with h5py 3 (2022-02-21T18:23:19Z, Amir MOHAMMADI)

Job [#257793](https://gitlab.idiap.ch/bob/bob.learn.em/-/jobs/257793) failed for 462d8bda27b8bef01a11f187fbb7ade153a650a0:

Assignee: Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/33
Nightlies failing because of this one (2022-02-22T13:48:19Z, Tiago de Freitas Pereira)

Nightlies is failing because of this one.
The issue seems related to HDF5
https://gitlab.idiap.ch/bob/nightlies/-/jobs/257644
https://gitlab.idiap.ch/bob/bob.learn.em/-/jobs/257680
Could you please look at that @ydayer ?
Thanks

Assignee: Yannick DAYER

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/175
Script for feature extraction from database (2022-03-19T11:10:26Z, Manuel Günther <siebenkopf@googlemail.com>)

In many cases, we would just want to have a script to extract the features for all samples of our database (using a specifiable ``Transformer``), so that we can use them in a different process. Currently, there is no such script available.
I would propose to add a script as follows:
```
import argparse
import os

parser = argparse.ArgumentParser(
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    description="Extract features from the given dataset",
)
parser.add_argument("--transformer", "-e", required=True, help="Select the transformer to be used")
parser.add_argument("--dataset", "-d", required=True, help="Select the dataset from which to extract features")
parser.add_argument("--output-directory", "-o", required=True, help="Select the directory where to write the data to")
args = parser.parse_args()

import bob.bio.base
import bob.core
import bob.io.base

# set up verbose logging
logger = bob.core.log.setup("bob.paper.osijbc")
bob.core.log.set_verbosity_level(logger, 2)

# load the database and transformer from the registered resources
database = bob.bio.base.load_resource(args.dataset, "database")
transformer = bob.bio.base.load_resource(args.transformer, "transformer")

for idx, samples in enumerate(database.all_samples()):
    logger.info("Extracting features for sample %d", idx)
    features = transformer.transform(samples)
    for feature in features:
        output = os.path.join(args.output_directory, feature.key + ".hdf5")
        logger.debug("Writing file %s", output)
        # the last argument requests that missing output directories are created
        bob.io.base.save(feature.data, output, True)
```
To be consistent with our other scripts, I would recommend using `click` instead of `argparse`. Unfortunately, I am not familiar with click and I have no time to learn how to implement click commands right now. Would anyone else -- with more experience with `click` -- take this over?
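For reference, a rough sketch of what a `click`-based version could look like (untested; the `extract_features` command name is made up here, and the resource loading is simply carried over from the `argparse` proposal above, without any logging/verbosity setup):

```python
import os

import click


@click.command()
@click.option("--transformer", "-e", required=True, help="Select the transformer to be used")
@click.option("--dataset", "-d", required=True, help="Select the dataset from which to extract features")
@click.option("--output-directory", "-o", required=True, help="Select the directory where to write the data to")
def extract_features(transformer, dataset, output_directory):
    """Extracts features for all samples of a dataset with a given Transformer."""
    import bob.bio.base
    import bob.io.base

    # resolve the registered resources, exactly as in the argparse version above
    database = bob.bio.base.load_resource(dataset, "database")
    transformer = bob.bio.base.load_resource(transformer, "transformer")

    for samples in database.all_samples():
        features = transformer.transform(samples)
        for feature in features:
            output = os.path.join(output_directory, feature.key + ".hdf5")
            bob.io.base.save(feature.data, output, True)


if __name__ == "__main__":
    extract_features()
```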
Assignee: Tiago de Freitas Pereira

https://gitlab.idiap.ch/bob/bob.io.base/-/issues/22
Does not compile with hdf5 1.12 (2022-02-21T11:37:29Z, Tiago de Freitas Pereira)

Compilation issue with HDF5
https://gitlab.idiap.ch/bob/nightlies/-/jobs/257523/raw
Probably related to these bumps
https://gitlab.idiap.ch/bob/bob.devtools/-/merge_requests/273
```
/scratch/builds/bob/nightlies/src/bob/bob.io.base/bob/io/base/cpp/HDF5Group.cpp: In member function 'herr_t bob::io::base::detail::hdf5::Group::iterate_callback(hid_t, const char*, const H5L_info2_t*)':
/scratch/builds/bob/nightlies/src/bob/bob.io.base/bob/io/base/cpp/HDF5Group.cpp:88:73: error: too few arguments to function 'herr_t H5Oget_info_by_name3(hid_t, const char*, H5O_info2_t*, unsigned int, hid_t)'
88 | herr_t status = H5Oget_info_by_name(self, name, &obj_info, H5P_DEFAULT);
| ^
In file included from /scratch/builds/bob/nightlies/miniconda/conda-bld/bob.io.base_1645283626826/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pl/include/H5Apublic.h:22,
from /scratch/builds/bob/nightlies/miniconda/conda-bld/bob.io.base_1645283626826/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pl/include/hdf5.h:23,
from /scratch/builds/bob/nightlies/src/bob/bob.io.base/bob/io/base/include/bob.io.base/HDF5Group.h:15,
from /scratch/builds/bob/nightlies/src/bob/bob.io.base/bob/io/base/cpp/HDF5Group.cpp:18:
/scratch/builds/bob/nightlies/miniconda/conda-bld/bob.io.base_1645283626826/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_pl/include/H5Opublic.h:557:15: note: declared here
557 | H5_DLL herr_t H5Oget_info_by_name3(hid_t loc_id, const char *name, H5O_info2_t *oinfo, unsigned fields,
| ^~~~~~~~~~~~~~~~~~~~
make[2]: *** [CMakeFiles/bob_io_base.dir/build.make:160: CMakeFiles/bob_io_base.dir/scratch/builds/bob/nightlies/src/bob/bob.io.base/bob/io/base/cpp/HDF5Group.cpp.o] Error 1
```

Assignee: Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.pad.face/-/issues/43
Nightlies failing (2022-02-22T17:55:07Z, Tiago de Freitas Pereira)

Nightlies failing because of this one
https://gitlab.idiap.ch/bob/nightlies/-/jobs/257429
Can someone have a look at that?
Thanks
ping @amohammadi @ageorge

Assignee: Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/174
Nightlies failing here (2022-02-16T09:30:59Z, Tiago de Freitas Pereira)

PLDA was temporarily removed from `bob.learn.em` with https://gitlab.idiap.ch/bob/bob.learn.em/-/merge_requests/42
we need to remove it from here too.

https://gitlab.idiap.ch/bob/bob.ip.facedetect/-/issues/12
Nightlies failing here (2022-02-15T15:40:38Z, Tiago de Freitas Pereira)

Something with boosting.
I don't understand why this is failing now.
Needs investigation
https://gitlab.idiap.ch/bob/nightlies/-/pipelines/58308

https://gitlab.idiap.ch/bob/bob.paper.8years/-/issues/3
Move plots to bob.bio.face (2022-08-08T10:59:50Z, Tiago de Freitas Pereira)

Hi @mguenther,
We've made several customized plots in this package.
Can I move those customized plots to `bob.bio.face`?
Thanks

https://gitlab.idiap.ch/bob/bob.io.video/-/issues/20
Nightlies failing ON ARM because of this one. (2022-01-19T13:08:57Z, Tiago de Freitas Pereira)

I think we need to mark this one with "!" like the other ones until we decide to fix issues on this platform.

https://gitlab.idiap.ch/bob/bob.bio.face/-/issues/75
Resources for arface dataset are mixed up (2022-01-14T16:07:35Z, Manuel Günther <siebenkopf@googlemail.com>)

In the `setup.py`, the entries of two ARface protocols are exchanged: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/bc421ecae8908299ef5b879e965741a25dca6567/setup.py#L242

Assignee: Manuel Günther <siebenkopf@googlemail.com>

https://gitlab.idiap.ch/bob/bob/-/issues/272
Nightlies failing because of this one (2022-01-17T17:21:57Z, Tiago de Freitas Pereira)

https://gitlab.idiap.ch/bob/nightlies/-/jobs/253709/
```sh
with channels:
- http://www.idiap.ch/software/bob/conda/label/beta
- conda-forge
The reported errors are:
Encountered problems while solving:
- cannot install both psutil-5.9.0-py38h497a2fe_0 and psutil-5.8.0-py310h6acc77f_2
Traceback (most recent call last):
File "/scratch/builds/bob/nightlies/miniconda/bin/bdt", line 11, in <module>
sys.exit(main())
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/bob/devtools/scripts/bdt.py", line 43, in _decorator
value = view_func(*args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/bob/devtools/scripts/ci.py", line 739, in nightlies
ctx.invoke(
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/bob/devtools/scripts/bdt.py", line 43, in _decorator
value = view_func(*args, **kwargs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/bob/devtools/scripts/build.py", line 305, in build
paths = conda_build.api.build(
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/conda_build/api.py", line 186, in build
return build_tree(
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/conda_build/build.py", line 3083, in build_tree
packages_from_this = build(metadata, stats,
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/conda_build/build.py", line 2123, in build
create_build_envs(top_level_pkg, notest)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/conda_build/build.py", line 1980, in create_build_envs
environ.get_install_actions(m.config.test_prefix,
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/boa/cli/mambabuild.py", line 70, in mamba_get_install_actions
solution = solver.solve_for_action(_specs, prefix)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/boa/core/solver.py", line 214, in solve_for_action
t = self.solve(specs)
File "/scratch/builds/bob/nightlies/miniconda/lib/python3.9/site-packages/boa/core/solver.py", line 200, in solve
raise RuntimeError("Solver could not find solution.")
RuntimeError: Solver could not find solution
```

Assignee: Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/173
Create an option --force in the VanillaBiometrics CLI command.... (2022-01-14T09:16:56Z, Tiago de Freitas Pereira)

In that way, checkpoints will be regenerated even if they already exist
Related to #152.

https://gitlab.idiap.ch/bob/bob/-/issues/271
Pytest compatibility (2022-03-03T15:47:31Z, Amir MOHAMMADI)

When we use pytest to test our packages, some tests do not run because they are not found by pytest. Pytest ignores `test.py` files and does not consider them tests!
Looking at the checkout of some bob packages that I have, these packages have a `test.py` file:
```
../bob.pad.base/bob/pad/base/test/test.py
../bob.db.mnist/bob/db/mnist/test.py
../bob.io.image/bob/io/image/test.py
../bob.db.atnt/bob/db/atnt/test.py
../bob.bio.vein/bob/bio/vein/tests/test.py
../bob.blitz/bob/blitz/examples/bob.example.extension/bob/example/extension/test.py
../bob.blitz/bob/blitz/examples/bob.example.library/bob/example/library/test.py
../bob.blitz/bob/blitz/examples/bob.example.project/bob/example/project/test.py
../bob.blitz/bob/blitz/test.py
../bob.devtools/bob/devtools/scripts/test.py
../bob.learn.activation/bob/learn/activation/test.py
../bob.io.stream/bob/io/stream/test/test.py
../bob.ip.stereo/bob/ip/stereo/test/test.py
../bob.io.audio/bob/io/audio/test.py
../bob.io.video/bob/io/video/test.py
../bob.ip.color/bob/ip/color/test.py
../bob.ip.facedetect/bob/ip/flandmark/test.py
../bob.ip.gabor/bob/ip/gabor/test.py
../bob.ip.qualitymeasure/bob/ip/qualitymeasure/test.py
../bob.learn.linear/bob/learn/linear/test.py
../bob.learn.pytorch/bob/learn/pytorch/test/test.py
```
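As a possible workaround (not part of the original report), pytest's default discovery patterns (`test_*.py` and `*_test.py`) can be widened so that plain `test.py` files are collected as well; the exact configuration file depends on how each package is set up, for example:

```ini
# pytest.ini (or the [tool:pytest] section of setup.cfg)
[pytest]
python_files = test_*.py *_test.py test.py
```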
Assignee: Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/171
In the Algorithm, the model_fusion_function is ignored (2022-05-19T14:28:15Z, Manuel Günther <siebenkopf@googlemail.com>)

While the constructor of the `Algorithm` class has two parameters dealing with score fusion: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Algorithm.py#L83, one of them is ignored. In `score_for_multiple_models`, the `model_fusion_function` should be used, but instead the `probe_fusion_function` is used: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Algorithm.py#L218

So far, there is no problem because both have the same default value. But in case someone wants to change only one of them, this is currently not possible.
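A hypothetical sketch of the intended behavior (not the actual `bob.bio.base` code; the method signature is assumed here for illustration only):

```python
class Algorithm:
    # the constructor is assumed to store the two parameters as
    # self.model_fusion_function and self.probe_fusion_function

    def score_for_multiple_models(self, models, probe):
        # fuse the per-model scores with the *model* fusion function;
        # the current implementation calls self.probe_fusion_function(scores) instead
        scores = [self.score(model, probe) for model in models]
        return self.model_fusion_function(scores)
```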
Assignee: Manuel Günther <siebenkopf@googlemail.com>

https://gitlab.idiap.ch/bob/bob.bio.face/-/issues/73
Implementation of Distance algorithm for deep feature extractors not optimal (2021-12-14T17:02:54Z, Manuel Günther <siebenkopf@googlemail.com>)

There are two different concepts that have emerged lately in face recognition with deep features, which have been shown to improve performance considerably:
1. The best way to handle several samples for enrollment or probing is to compute the average of the features.
2. When comparing deep features, use the cosine similarity.
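In code, the two concepts boil down to something like the following sketch (plain numpy/scipy for illustration, not an existing bob.bio API):

```python
import numpy
import scipy.spatial.distance


def enroll(enrollment_features):
    # concept 1: average the features of all enrollment samples into one template
    return numpy.mean(enrollment_features, axis=0)


def score(model, probe_features):
    # concept 1, probe side: average the probe features as well
    probe = numpy.mean(probe_features, axis=0)
    # concept 2: return the negative cosine distance as a similarity score
    # (higher means more similar)
    return -scipy.spatial.distance.cosine(model, probe)
```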
Unfortunately, neither of the two concepts is used in our baselines, when we simply use the `Distance` implementation from `bob.bio.base`, where the default behavior is:
1. When having several features for enrollment or probing, compute the pairwise distances and then use the average of the scores. This is tricky to see since this is hidden in the base class constructor: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Algorithm.py#L83
which will then be translated to computing **average scores** (not the score between averaged features): https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/utils/__init__.py#L27
2. The default comparison function in `Distance` is the Euclidean distance: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Distance.py#L34
So, when we simply use the default constructor as in here: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/f494d6cb9ca23d4809e08498d046f2120cb21df3/bob/bio/face/embeddings/pytorch.py#L417
and most probably also in all other implementations, we will get Euclidean instead of cosine distance.
Tasks:
- [ ] Implement the averaging of features both for the enrollment and the probes (in case there are multiple). This can either be done by adapting the existing `Distance` function through adding a different `multiple_model_scoring` or `multiple_probe_scoring` parameter, or by implementing a completely separate Algorithm class for that.
- [ ] Change the default in all of the baselines to use the new behavior, but at least to select the cosine distance instead of Euclidean.

Assignee: Manuel Günther <siebenkopf@googlemail.com>

https://gitlab.idiap.ch/bob/bob.bio.face/-/issues/72
VGG16 preprocessing buggy? (2021-12-14T17:45:35Z, Manuel Günther <siebenkopf@googlemail.com>)

When using the VGG16 network, we need to subtract the RGB mean from the channels. As the images are in bob format (`NxCxHxW`), we would need to subtract the mean from `[:,i,:,:]`. Instead, we subtract it from `[:,:,:,i]`:
https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/3567e990d0e523ceb5d3f9598054d8a27d7f7000/bob/bio/face/embeddings/opencv.py#L140
This is most certainly incorrect, especially since we use the correct dimension later on to convert RGB to BGR:
https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/3567e990d0e523ceb5d3f9598054d8a27d7f7000/bob/bio/face/embeddings/opencv.py#L146
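For bob-format batches of shape `NxCxHxW`, the expected channel-wise subtraction would look roughly like this (illustrative numpy only; the mean values are placeholders, not the ones used in `opencv.py`):

```python
import numpy

RGB_MEANS = (129.0, 104.0, 93.0)  # placeholder values for illustration


def subtract_channel_means(images):
    # images has shape (N, C, H, W), so the channel axis is axis 1, not axis 3
    images = images.astype("float64")
    for i, mean in enumerate(RGB_MEANS):
        images[:, i, :, :] -= mean  # and not images[:, :, :, i] -= mean
    return images
```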
Finally, in the pipeline, we define an MTCNN annotator with particular parameters: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/3567e990d0e523ceb5d3f9598054d8a27d7f7000/bob/bio/face/embeddings/opencv.py#L203
but this is ignored since the pipeline uses `"mtcnn"`.

https://gitlab.idiap.ch/bob/bob.pipelines/-/issues/40
Nightlies are failing because of this package (2021-11-30T14:05:42Z, Tiago de Freitas Pereira)

Check here
https://gitlab.idiap.ch/bob/nightlies/-/jobs/250661
and
https://gitlab.idiap.ch/bob/bob.pipelines/-/jobs/250818
This is blocking the development of the upper stack.
```
=================================== FAILURES ===================================
______________________ test_dataset_pipeline_with_dask_ml ______________________
def test_dataset_pipeline_with_dask_ml():
scaler = dask_ml.preprocessing.StandardScaler()
pca = dask_ml.decomposition.PCA(n_components=3, random_state=0)
clf = SGDClassifier(random_state=0, loss="log", penalty="l2", tol=1e-3)
clf = dask_ml.wrappers.Incremental(clf, scoring="accuracy")
iris_ds = _build_iris_dataset(shuffle=True)
estimator = mario.xr.DatasetPipeline(
[
dict(
estimator=scaler,
output_dims=[("feature", None)],
input_dask_array=True,
),
dict(
estimator=pca,
output_dims=[("pca_features", 3)],
input_dask_array=True,
),
dict(
estimator=clf,
fit_input=["data", "target"],
output_dims=[],
input_dask_array=True,
fit_kwargs=dict(classes=range(3)),
),
]
)
with dask.config.set(scheduler="synchronous"):
> estimator = estimator.fit(iris_ds)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/bob/pipelines/tests/test_xarray.py:260:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/bob/pipelines/xarray.py:551: in fit
self._transform(ds, do_fit=True)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/bob/pipelines/xarray.py:510: in _transform
block.estimator_ = _fit(*args, block=block)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/bob/pipelines/xarray.py:243: in _fit
block.estimator.fit(*args, **block.fit_kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask_ml/wrappers.py:495: in fit
self._fit_for_estimator(estimator, X, y, **fit_kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask_ml/wrappers.py:479: in _fit_for_estimator
result = fit(
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask_ml/_partial.py:139: in fit
return value.compute()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/base.py:288: in compute
(result,) = compute(self, traverse=False, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/base.py:571: in compute
results = schedule(dsk, keys, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:553: in get_sync
return get_async(
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:496: in get_async
for key, res_info, failed in queue_get(queue).result():
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/concurrent/futures/_base.py:437: in result
return self.__get_result()
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/concurrent/futures/_base.py:389: in __get_result
raise self._exception
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:538: in submit
fut.set_result(fn(*args, **kwargs))
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:234: in batch_execute_tasks
return [execute_task(*a) for a in it]
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:234: in <listcomp>
return [execute_task(*a) for a in it]
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:225: in execute_task
result = pack_exception(e, dumps)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/local.py:220: in execute_task
result = _execute_task(task, data)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask/core.py:119: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/dask_ml/_partial.py:17: in _partial_fit
model.partial_fit(x, y, **kwargs)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/sklearn/linear_model/_stochastic_gradient.py:841: in partial_fit
return self._partial_fit(
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/sklearn/linear_model/_stochastic_gradient.py:572: in _partial_fit
X, y = self._validate_data(
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/sklearn/base.py:576: in _validate_data
X, y = check_X_y(X, y, **check_params)
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/sklearn/utils/validation.py:956: in check_X_y
X = check_array(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
array = ('pca.transform-98eb05bfe3c4e482e6896d5f42ca3d48', 1, 0)
accept_sparse = 'csr'
def check_array(
array,
accept_sparse=False,
*,
accept_large_sparse=True,
dtype="numeric",
order=None,
copy=False,
force_all_finite=True,
ensure_2d=True,
allow_nd=False,
ensure_min_samples=1,
ensure_min_features=1,
estimator=None,
):
"""Input validation on an array, list, sparse matrix or similar.
By default, the input is checked to be a non-empty 2D array containing
only finite values. If the dtype of the array is object, attempt
converting to float, raising on failure.
Parameters
----------
array : object
Input object to check / convert.
accept_sparse : str, bool or list/tuple of str, default=False
String[s] representing allowed sparse matrix formats, such as 'csc',
'csr', etc. If the input is sparse but not in the allowed format,
it will be converted to the first listed format. True allows the input
to be any format. False means that a sparse matrix input will
raise an error.
accept_large_sparse : bool, default=True
If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by
accept_sparse, accept_large_sparse=False will cause it to be accepted
only if its indices are stored with a 32-bit dtype.
.. versionadded:: 0.20
dtype : 'numeric', type, list of type or None, default='numeric'
Data type of result. If None, the dtype of the input is preserved.
If "numeric", dtype is preserved unless array.dtype is object.
If dtype is a list of types, conversion on the first type is only
performed if the dtype of the input is not in the list.
order : {'F', 'C'} or None, default=None
Whether an array will be forced to be fortran or c-style.
When order is None (default), then if copy=False, nothing is ensured
about the memory layout of the output array; otherwise (copy=True)
the memory layout of the returned array is kept as close as possible
to the original array.
copy : bool, default=False
Whether a forced copy will be triggered. If copy=False, a copy might
be triggered by a conversion.
force_all_finite : bool or 'allow-nan', default=True
Whether to raise an error on np.inf, np.nan, pd.NA in array. The
possibilities are:
- True: Force all values of array to be finite.
- False: accepts np.inf, np.nan, pd.NA in array.
- 'allow-nan': accepts only np.nan and pd.NA values in array. Values
cannot be infinite.
.. versionadded:: 0.20
``force_all_finite`` accepts the string ``'allow-nan'``.
.. versionchanged:: 0.23
Accepts `pd.NA` and converts it into `np.nan`
ensure_2d : bool, default=True
Whether to raise a value error if array is not 2D.
allow_nd : bool, default=False
Whether to allow array.ndim > 2.
ensure_min_samples : int, default=1
Make sure that the array has a minimum number of samples in its first
axis (rows for a 2D array). Setting to 0 disables this check.
ensure_min_features : int, default=1
Make sure that the 2D array has some minimum number of features
(columns). The default value of 1 rejects empty datasets.
This check is only enforced when the input data has effectively 2
dimensions or is originally 1D and ``ensure_2d`` is True. Setting to 0
disables this check.
estimator : str or estimator instance, default=None
If passed, include the name of the estimator in warning messages.
Returns
-------
array_converted : object
The converted and validated array.
"""
if isinstance(array, np.matrix):
warnings.warn(
"np.matrix usage is deprecated in 1.0 and will raise a TypeError "
"in 1.2. Please convert to a numpy array with np.asarray. For "
"more information see: "
"https://numpy.org/doc/stable/reference/generated/numpy.matrix.html", # noqa
FutureWarning,
)
# store reference to original array to check if copy is needed when
# function returns
array_orig = array
# store whether originally we wanted numeric dtype
dtype_numeric = isinstance(dtype, str) and dtype == "numeric"
dtype_orig = getattr(array, "dtype", None)
if not hasattr(dtype_orig, "kind"):
# not a data type (e.g. a column named dtype in a pandas DataFrame)
dtype_orig = None
# check if the object contains several dtypes (typically a pandas
# DataFrame), and store them. If not, store None.
dtypes_orig = None
has_pd_integer_array = False
if hasattr(array, "dtypes") and hasattr(array.dtypes, "__array__"):
# throw warning if columns are sparse. If all columns are sparse, then
# array.sparse exists and sparsity will be preserved (later).
with suppress(ImportError):
from pandas.api.types import is_sparse
if not hasattr(array, "sparse") and array.dtypes.apply(is_sparse).any():
warnings.warn(
"pandas.DataFrame with sparse columns found."
"It will be converted to a dense numpy array."
)
dtypes_orig = list(array.dtypes)
# pandas boolean dtype __array__ interface coerces bools to objects
for i, dtype_iter in enumerate(dtypes_orig):
if dtype_iter.kind == "b":
dtypes_orig[i] = np.dtype(object)
elif dtype_iter.name.startswith(("Int", "UInt")):
# name looks like an Integer Extension Array, now check for
# the dtype
with suppress(ImportError):
from pandas import (
Int8Dtype,
Int16Dtype,
Int32Dtype,
Int64Dtype,
UInt8Dtype,
UInt16Dtype,
UInt32Dtype,
UInt64Dtype,
)
if isinstance(
dtype_iter,
(
Int8Dtype,
Int16Dtype,
Int32Dtype,
Int64Dtype,
UInt8Dtype,
UInt16Dtype,
UInt32Dtype,
UInt64Dtype,
),
):
has_pd_integer_array = True
if all(isinstance(dtype, np.dtype) for dtype in dtypes_orig):
dtype_orig = np.result_type(*dtypes_orig)
if dtype_numeric:
if dtype_orig is not None and dtype_orig.kind == "O":
# if input is object, convert to float.
dtype = np.float64
else:
dtype = None
if isinstance(dtype, (list, tuple)):
if dtype_orig is not None and dtype_orig in dtype:
# no dtype conversion required
dtype = None
else:
# dtype conversion required. Let's select the first element of the
# list of accepted types.
dtype = dtype[0]
if has_pd_integer_array:
# If there are any pandas integer extension arrays,
array = array.astype(dtype)
if force_all_finite not in (True, False, "allow-nan"):
raise ValueError(
'force_all_finite should be a bool or "allow-nan". Got {!r} instead'.format(
force_all_finite
)
)
if estimator is not None:
if isinstance(estimator, str):
estimator_name = estimator
else:
estimator_name = estimator.__class__.__name__
else:
estimator_name = "Estimator"
context = " by %s" % estimator_name if estimator is not None else ""
# When all dataframe columns are sparse, convert to a sparse array
if hasattr(array, "sparse") and array.ndim > 1:
# DataFrame.sparse only supports `to_coo`
array = array.sparse.to_coo()
if array.dtype == np.dtype("object"):
unique_dtypes = set([dt.subtype.name for dt in array_orig.dtypes])
if len(unique_dtypes) > 1:
raise ValueError(
"Pandas DataFrame with mixed sparse extension arrays "
"generated a sparse matrix with object dtype which "
"can not be converted to a scipy sparse matrix."
"Sparse extension arrays should all have the same "
"numeric type."
)
if sp.issparse(array):
_ensure_no_complex_data(array)
array = _ensure_sparse_format(
array,
accept_sparse=accept_sparse,
dtype=dtype,
copy=copy,
force_all_finite=force_all_finite,
accept_large_sparse=accept_large_sparse,
)
else:
# If np.array(..) gives ComplexWarning, then we convert the warning
# to an error. This is needed because specifying a non complex
# dtype to the function converts complex to real dtype,
# thereby passing the test made in the lines following the scope
# of warnings context manager.
with warnings.catch_warnings():
try:
warnings.simplefilter("error", ComplexWarning)
if dtype is not None and np.dtype(dtype).kind in "iu":
# Conversion float -> int should not contain NaN or
# inf (numpy#14412). We cannot use casting='safe' because
# then conversion float -> int would be disallowed.
array = np.asarray(array, order=order)
if array.dtype.kind == "f":
_assert_all_finite(array, allow_nan=False, msg_dtype=dtype)
array = array.astype(dtype, casting="unsafe", copy=False)
else:
> array = np.asarray(array, order=order, dtype=dtype)
E ValueError: could not convert string to float: 'pca.transform-98eb05bfe3c4e482e6896d5f42ca3d48'
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.8/site-packages/sklearn/utils/validation.py:738: ValueError
```

https://gitlab.idiap.ch/bob/bob.pipelines/-/issues/39
Passing "resources" to dask_jobqueue.core.Job raises an exception (2021-11-29T17:13:12Z, Manuel Günther <siebenkopf@googlemail.com>)

When loading the resource `sge`, the following error is thrown:
```
File ".../bob/pipelines/distributed/sge.py", line 57, in __init__
super().__init__(
TypeError: __init__() got an unexpected keyword argument 'resources'
```
Tracing down the error, it seems that you are passing the `resources`: https://gitlab.idiap.ch/bob/bob.pipelines/-/blob/d8162ffc4fa072a14a8a4d7ac3b558de464a56ef/bob/pipelines/distributed/sge.py#L347
as a `kwargs` to `__init__`, which are simply passed on to the base class constructor:
https://gitlab.idiap.ch/bob/bob.pipelines/-/blob/d8162ffc4fa072a14a8a4d7ac3b558de464a56ef/bob/pipelines/distributed/sge.py#L58
I would recommend having `resources` as a regular parameter in `__init__` so that it is not passed on to the base class constructor.
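A sketch of that suggestion (the class and attribute names are assumptions for illustration, not the actual `sge.py` code):

```python
from dask_jobqueue.core import Job


class CustomSGEJob(Job):  # placeholder name
    def __init__(self, *args, resources=None, **kwargs):
        # "resources" is consumed here, so it is no longer part of **kwargs
        # when the dask_jobqueue Job constructor is called
        self.custom_resources = resources
        super().__init__(*args, **kwargs)
```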
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/168
resources.py does not list dask clients (2021-11-30T15:17:25Z, Manuel Günther <siebenkopf@googlemail.com>)

While all other parts of the pipeline can be listed through `resources.py`, this is not the case for registered `dask` clients. When running `bob bio pipelines vanilla-biometrics -h` we can see the option `-l, --dask-client`, but currently there is no simple way of listing which clients are registered.

Assignee: Manuel Günther <siebenkopf@googlemail.com>

https://gitlab.idiap.ch/bob/bob.pipelines/-/issues/38
local-parallel queue is not setup well (2021-12-06T11:09:29Z, Manuel Günther <siebenkopf@googlemail.com>)

The setup of the current `local-parallel` configuration does not work as expected, for several reasons:
https://gitlab.idiap.ch/bob/bob.pipelines/-/blob/d8162ffc4fa072a14a8a4d7ac3b558de464a56ef/bob/pipelines/config/distributed/local_parallel.py#L10
1. When we set `processes=False`, we will only use the python threading module, which will effectively limit the CPU usage to around 100% (i.e., one core), no matter how many cores we use. Only with `processes=True`, we will get real parallelization.
2. Selecting all possible CPUs via `cpu_count()` by default does not work well. I have a machine with 128 CPU cores, so setting up all 128 cores takes longer than an experiment -- especially when using `processes=False` above, I commonly get a timeout error.
Before, we had something like `local-p4` with 4 parallel cores, and the like. I think it would be a good idea to incorporate several of these here. Are there any objections?
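For illustration, a `local-p4`-style resource could be as simple as the following sketch (plain dask.distributed; the name and the defaults are assumptions, not an existing bob.pipelines configuration):

```python
from dask.distributed import Client, LocalCluster

# four real worker processes (not threads), one thread each: CPU usage is not
# capped at a single core and startup time does not grow with cpu_count()
cluster = LocalCluster(processes=True, n_workers=4, threads_per_worker=1)
dask_client = Client(cluster)
```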