bob issues
https://gitlab.idiap.ch/groups/bob/-/issues
2022-04-01T14:15:17Z

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/178
It would be great if we could provide documentation here on how to enable/disable the fit function from being called
2022-04-01T14:15:17Z Tiago de Freitas Pereira
The following discussion from !280 should be addressed:
- [ ] @mguenther started a [discussion](https://gitlab.idiap.ch/bob/bob.bio.base/-/merge_requests/280#note_72711): (+1 comment)
> It would be great if we could provide documentation here on how to enable/disable the fit function from being called.

Tiago de Freitas Pereira

https://gitlab.idiap.ch/bob/bob.extension/-/issues/87
Nightlies failing because of this one
2022-04-04T08:15:39Z Tiago de Freitas Pereira
There is one test breaking
`FAIL: bob.extension.test_click_helper.test_config_dump2`
https://gitlab.idiap.ch/bob/nightlies/-/jobs/263066
Tiago de Freitas Pereira

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/43
Saving the state in HDF5
2022-03-31T12:02:24Z Tiago de Freitas Pereira
Hi @amohammadi and @ydayer,
Does it make sense to have functions to load and save the state of the objects (GMMMachine, KMeansMachine...) in HDF5?
Everything is picklable now, so why bother with HDF5?
Thanks for any clarification.
Cheers

https://gitlab.idiap.ch/bob/bob.extension/-/issues/86
Moving to Github and de-branding as a Bob package
2022-11-23T08:31:30Z André Anjos
There is a general will to move software that can be used by a larger audience (that is not necessarily somebody at the @biometric group) to GitHub/conda-forge. This move would also de-brand this package as belonging to Bob.
To do this, I propose we start by identifying the various bits in here that would be useful as standalone components. I find there are mainly 5 categories of functions:
- Build tools for C++: `cmake`, `boost`, `pkgconfig`, `utils`, `__init__`
- Helpers for Sphinx building: `utils` (`link_documentation`)
- Helpers to build CLIs: `scripts.click_helper`, `scripts.main_cli`
- Helpers for configuration: `__init__`, `config`, `rc_config.py`
- Helpers for logging: `log` (however some bits of it concern logging for C++)
I'm guessing that everything related to building other Bob packages (mostly the C++ code) can be considered deprecated once all C++ code has finally been ported to pure-Python alternatives. This then leaves us with the 4 other categories of helpers, which we have to somehow group (or not) to make packages.
Then, I propose we simply leave this package be (or archive it), and move the pieces of interest to a dedicated Python-only-builds GitHub project. We then ask each package going forward to make use of those specialised packages instead of bob.extension.

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/177
bob bio pipelines vanilla-biometrics is a bad name
2022-04-03T11:24:27Z Amir MOHAMMADI
This command line has a terrible name and there are several issues with it:
- It's too long!
- It has both `bio` and `biometrics` in its name, which point to the same thing
- The `biometrics` part matches `metrics`. So every time I try to re-run the last `bob bio metrics` command in bash, I type `metrics` in my search and get this command instead.
- I cannot use the shorthand name of this command, i.e. typing `bob bio pip vanil atnt arcface`, because it matches both this command and the `vanilla-biometrics-score-normalization` command.
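
The shorthand problem in the last point can be reproduced with a small sketch. It assumes that shorthand resolution simply selects the subcommands whose names start with the typed prefix; the actual resolver used by `bob` is not shown in this issue, and `resolve_prefix` is a hypothetical helper written only for illustration. The two short names are the ones suggested in this issue.

```python
def resolve_prefix(prefix, commands):
    """Return the single command starting with `prefix`, or raise if ambiguous.

    Hypothetical stand-in for shorthand resolution; not the actual `bob` code.
    """
    matches = [name for name in commands if name.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"{prefix!r} matches {matches}")
    return matches[0]


# Command names as they appear in this issue.
current = ["vanilla-biometrics", "vanilla-biometrics-score-normalization"]
# Shorter names suggested in this issue.
proposed = ["vanilla", "score-norm"]

try:
    resolve_prefix("vanil", current)
except LookupError as error:
    # Both current names share the prefix, so the shorthand is ambiguous.
    print("ambiguous:", error)

print(resolve_prefix("vanil", proposed))
print(resolve_prefix("score", proposed))
```

Under this assumption, any prefix of `vanilla-biometrics` is ambiguous with the current names, while the two suggested names share no prefix at all.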
I suggest renaming both this command and the score normalization one to:
```
$ bob bio pipeline vanilla
$ bob bio pipeline score-norm
```
Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.devtools/-/issues/94
Recursive upload cannot hash files
2022-03-29T11:49:36Z André Anjos
For some reason, when using `bdt dav upload <dir>`, the end filenames cannot be hashed the way they are when files are uploaded individually. Trace:
```
Traceback (most recent call last):
  File "/remote/idiap.svm/user.active/aanjos/mamba/bin/bdt", line 11, in <module>
    sys.exit(main())
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/scripts/bdt.py", line 43, in _decorator
    value = view_func(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/scripts/dav.py", line 280, in upload
    path_with_hash = augment_path_with_hash(k)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/dav.py", line 72, in augment_path_with_hash
    raise ValueError(
ValueError: Can only augment path to files with a hash. Got: optic-cup
```
André Anjos

https://gitlab.idiap.ch/bob/bob.bio.spear/-/issues/37
Drop dependency on bob.ap
2022-04-27T20:15:31Z Amir MOHAMMADI
Use torchaudio instead.
Let's track this with low priority.

https://gitlab.idiap.ch/bob/bob.pad.base/-/issues/41
Drop gridtk dependency
2022-03-24T14:45:45Z Amir MOHAMMADI
The Great Deprecation
Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.extension/-/issues/85
Recursive upload cannot hash files
2022-03-25T06:21:17Z André Anjos
For some reason, when using `bdt dav upload <dir>`, the end filenames cannot be hashed the way they are when files are uploaded individually. Trace:
```
Traceback (most recent call last):
  File "/remote/idiap.svm/user.active/aanjos/mamba/bin/bdt", line 11, in <module>
    sys.exit(main())
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/scripts/bdt.py", line 43, in _decorator
    value = view_func(*args, **kwargs)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/scripts/dav.py", line 280, in upload
    path_with_hash = augment_path_with_hash(k)
  File "/remote/idiap.svm/user.active/aanjos/mamba/lib/python3.9/site-packages/bob/devtools/dav.py", line 72, in augment_path_with_hash
    raise ValueError(
ValueError: Can only augment path to files with a hash. Got: optic-cup
```

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/42
Tests hang in the CI, Job Failed #261362
2022-04-04T13:02:57Z Amir MOHAMMADI
Job [#261362](https://gitlab.idiap.ch/bob/bob.learn.em/-/jobs/261362) failed for a2a4890844ba8c96e414fdde7ecb7f1f5ea54ace:

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/41
ISV on python + Dask
2022-04-27T19:12:42Z Tiago de Freitas Pereira
I'm sketching something. Hope to have a version by next week.

https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/176
Vanilla-biometrics: defining partitions size or number of partitions
2022-05-10T13:26:38Z Yannick DAYER
I have some issues with the way data is partitioned in vanilla-biometrics with Dask.
##### Actual behavior:
- "automatic": takes `max(len(background_model_samples), len(reference_samples), len(probes_samples))`, then computes a partition size from that value and the number of workers available.
- user-set partition size (`-s` option): the size of partitions is fixed by the user.
##### My issue:
I have a big training set and a small enrollment set.
With the automatic mode, the partition size is derived from the size of `background_model_samples` (let's say 3000 elements, giving a partition size of 300, with 100 workers). But when processing my small set of `reference_samples` (10 elements), the whole set fits in one partition (of size 300) and is thus computed on a single Dask worker. Enrollment is slow, and it ends up being done one reference at a time.
Setting the partition size manually (with `-s`) is no good either: I would set it to 1 to spread my 10 enrollment tasks as much as possible, but this would create 3000 tasks when training on the `background_model_samples` (too many tasks for Dask, lots of transfer time).
##### A solution:
Split the data by number of partitions rather than by partition size. `ToDaskBag` supports setting `npartition` instead of `partition_size`, and could easily be used that way. The number of partitions could be the number of available workers.
In that case, the 3000 `background_model_samples` would be split into 100 partitions (because of the 100 available workers) and the `reference_samples` would be split into 10 partitions of size 1.
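
The numbers in this proposal can be checked with a few lines of plain Python. This is only an arithmetic sketch of the two splitting strategies, using the figures from the example above (3000 and 10 samples, 100 workers, partition size 300); `split_by_size` and `split_by_count` are hypothetical helpers and do not reproduce how Dask or `ToDaskBag` actually distribute the items.

```python
def split_by_size(n_samples, partition_size):
    """Partition sizes when each partition holds at most `partition_size` items."""
    full, rest = divmod(n_samples, partition_size)
    return [partition_size] * full + ([rest] if rest else [])


def split_by_count(n_samples, n_partitions):
    """Partition sizes when the items are spread over at most `n_partitions`."""
    # Never create more partitions than there are items.
    n_partitions = max(1, min(n_samples, n_partitions))
    base, extra = divmod(n_samples, n_partitions)
    return [base + 1] * extra + [base] * (n_partitions - extra)


n_workers = 100

# Fixed partition size of 300: the 10 references end up in one partition.
print(len(split_by_size(10, 300)))
# Fixed partition size of 1: training is cut into 3000 tiny partitions.
print(len(split_by_size(3000, 1)))
# Number of partitions tied to the number of workers.
print(len(split_by_count(3000, n_workers)), set(split_by_count(3000, n_workers)))
print(len(split_by_count(10, n_workers)), set(split_by_count(10, n_workers)))
```

Splitting by count gives 100 partitions of 30 samples for training and 10 partitions of 1 sample for enrollment, so both steps can use the available workers.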
Am I missing a reason why it was not done this way, @tiago.pereira?
Another solution would be to allow the user to set the number of partitions manually (similar to the setting of partition size).
Yannick DAYER

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/40
rename k_means.py to kmean.py
2022-03-22T17:34:30Z Amir MOHAMMADI
and fix imports accordingly.
It's just a better module name IMO.
Yannick DAYER

https://gitlab.idiap.ch/bob/bob.extension/-/issues/84
`search_file` does not match exactly `options`
2022-03-18T07:51:48Z Yannick DAYER
Assuming an archive `archive.tar.gz` with the following file structure:
```
.
+-female
|  |
|  +- my_file.txt
|
+-male
   |
   +- my_file.txt
```
Using `search_file("archive.tar.gz", ["male/my_file.txt"])` will return `"archive.tar.gz/female/my_file.txt"` instead of `"archive.tar.gz/male/my_file.txt"`
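
The issue does not show how `search_file` compares paths. One way such a mismatch can arise is plain string suffix matching, since `"female/my_file.txt"` ends with `"male/my_file.txt"`. The sketch below illustrates that assumption only; `find_by_suffix` and `find_by_components` are hypothetical helpers, not the `bob.extension` implementation.

```python
from pathlib import PurePosixPath

# Member paths of the archive described above.
members = ["female/my_file.txt", "male/my_file.txt"]


def find_by_suffix(members, option):
    """Loose match: the first member whose path string ends with `option`."""
    for member in members:
        if member.endswith(option):
            return member
    return None


def find_by_components(members, option):
    """Strict match: the trailing path components must equal those of `option`."""
    wanted = PurePosixPath(option).parts
    for member in members:
        if PurePosixPath(member).parts[-len(wanted):] == wanted:
            return member
    return None


print(find_by_suffix(members, "male/my_file.txt"))
print(find_by_components(members, "male/my_file.txt"))
```

Comparing whole path components instead of raw strings returns `male/my_file.txt`, whereas the loose match stops at `female/my_file.txt`.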
(Used in that way by vanilla-biometrics in `bob.bio.base.database.csv_dataset`, where `female` and `male` are protocols.)
Yannick DAYER

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/39
gmm methods and e_step/m_step functions do not share code but implement the same thing
2022-03-24T15:13:17Z Amir MOHAMMADI
Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/38
More dask tests
2022-03-24T18:21:54Z Amir MOHAMMADI
- [ ] There are no tests with dask arrays as input for GMMs.
- [ ] Dask tests should run under multiprocessing of the distributed package to make sure real-world conditions are simulated.

Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/37
Excess memory usage in kmeans training
2022-03-22T17:34:30Z Amir MOHAMMADI
When training on voxforge and 256 GMMs with dask partition size of 200, I get this error:
```
bob.learn.em/bob/learn/em/k_means.py", line 78, in e_step
np.eye(n_clusters)[closest_k_indices][:, :, None] * data[:, None],
numpy.core._exceptions.MemoryError: Unable to allocate 7.39 GiB for an array with shape (64546, 256, 60) and data type float64
```
Yannick DAYER

https://gitlab.idiap.ch/bob/bob.bio.face/-/issues/77
Output of `dataset.all_samples` is inconsistent
2022-03-10T18:55:22Z Manuel Günther siebenkopf@googlemail.com
The `all_samples` method returns different things for different databases. While the default `CSVDataset` https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/997e6d6dda44c928c1792518a2b625726efde0e1/bob/bio/base/database/csv_dataset.py#L744 returns a list of `Sample` (more precisely a list of `DelayedSample`), some other datasets implemented in here return a list of `SampleSet`. Examples are:
* https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/38a910ac1df0ba14e8262f957ae0e666a3e2f616/bob/bio/face/database/ijbc.py#L296
* https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/38a910ac1df0ba14e8262f957ae0e666a3e2f616/bob/bio/face/database/rfw.py#L424
* https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/38a910ac1df0ba14e8262f957ae0e666a3e2f616/bob/bio/face/database/gbu.py#L238
But I am sure that I am missing some.
Is there any plan to fix this inconsistency? The name of the function suggests that it extracts a list of `Sample`, so we would likely want to adapt the implementations of the datasets listed here...

https://gitlab.idiap.ch/bob/bob.learn.em/-/issues/36
get_centroids_distance gets called twice during e_step in kmeans
2022-03-21T10:48:27Z Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.db.base/-/issues/28
Deprecate this package
2022-04-27T20:13:01Z Amir MOHAMMADI
There is only `bob.db.atnt` left; after that, we can deprecate this package.
The Great Deprecation