bob.bio.base issues (https://gitlab.idiap.ch/bob/bob.bio.base/-/issues)

Issue #106: Using Bob as a library: Don't force HDF5 serialization
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/106
Opened by Jaden DIEFENBAUGH; last updated 2020-10-16

There are many, many places in `bob.bio.base` & the associated ecosystem where it is assumed the user wants to serialize information to an HDF5 file (for example, [`bob.bio.base.PCA`'s `train_projector()`](https://www.idiap.ch/software/bob/docs/bob/bob.bio.base/master/_modules/bob/bio/base/algorithm/PCA.html#PCA.train_projector) always writes to an HDF5 file). This is an issue when using Bob tools in different use-cases & environments, as there's no guarantee that a user wants to write to an HDF5 file. Sometimes the user _can't_ write to files, such as in BEAT, which is the specific use-case that concerns me.
(Disk) serialization should at least be opt-in, and the data that was previously saved to disk by default should be returned by the function instead. For the above PCA example, this would change `train_projector()` to return the variances by default, and optionally write them to disk. Changes like this are the bare minimum needed to use these Bob tools in BEAT.
Honestly, though, serialization endpoints (disk, network, whatever) in general should be separated from individual Bob tools. A preprocessor/extractor/algorithm/whatever should have a method for general serialization as well as a method for rehydrating the instance using this data (this is already present in many places, but is just hard-coded to write to an HDF5 file). Some `bob.serialization` package could handle writing this data to disks/caches/networks/whatever.
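A minimal sketch of what that separation could look like (all names here, `SerializablePCA`, `to_dict`, `from_dict`, are hypothetical and not the current `bob.bio.base` API):

```
import numpy

# Hypothetical sketch only: these class/method names are not part of bob.bio.base.
class SerializablePCA:
    """PCA-like tool whose training returns its state instead of writing HDF5."""

    def __init__(self, n_components=5):
        self.n_components = n_components
        self.components = None

    def train(self, data):
        # Train and *return* the state; no file I/O is forced on the caller.
        data = numpy.asarray(data, dtype=float)
        centered = data - data.mean(axis=0)
        _, _, vt = numpy.linalg.svd(centered, full_matrices=False)
        self.components = vt[: self.n_components]
        return self.to_dict()

    def to_dict(self):
        """General serialization endpoint; a separate package could decide
        whether this goes to disk, a cache, or over the network."""
        return {"n_components": self.n_components, "components": self.components}

    @classmethod
    def from_dict(cls, state):
        """Rehydrate an instance from previously serialized state."""
        obj = cls(n_components=state["n_components"])
        obj.components = state["components"]
        return obj
```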
What does everyone think?

(Milestone: Bob 9.0.0)

Issue #99: Splitting the data one by one instead of chunk by chunk
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/99
Opened by Amir MOHAMMADI; last updated 2020-04-23
Currently bob.bio.base (when submitting a job to gridtk) splits the data like this:
```
mylist = range(10)
parallel_jobs = 2
list1 = mylist[:5]    # first chunk: 0, 1, 2, 3, 4
list2 = mylist[5:]    # second chunk: 5, 6, 7, 8, 9
```
This is really cumbersome when you have an unbalanced database.
For example, right now I have a video database where the beginning samples have only 1 frame and finish processing quickly, but the rest of the data has 20 frames in each sample, which takes 20 times longer to process.
I was wondering if it is possible to split the data like the following when running in parallel:
```
mylist = range(10)
parallel_jobs = 2
list1 = mylist[0::2]  # interleaved: 0, 2, 4, 6, 8
list2 = mylist[1::2]  # interleaved: 1, 3, 5, 7, 9
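# A hedged sketch (plain Python, not existing bob.bio.base code) of how an
# interleaved split could be produced for any number of parallel jobs:
splits = [mylist[i::parallel_jobs] for i in range(parallel_jobs)]
# For mylist = range(10) and parallel_jobs = 2 this gives the two lists above.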
```

Issue #93: The current implementation of the toolchain is not flexible
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/93
Opened by Manuel Günther (siebenkopf@googlemail.com); last updated 2020-04-23

I woke up this morning at 5, and I couldn't sleep anymore. Instead, my mind led me to the conclusion that the current implementation of the toolchain is anything but flexible. Other biometrics, for example fingerprints, might require a different set of tools. Modifying the toolchain in `bob.bio.gmm` was a complete hack, and it will not be possible for other researchers (I am not even sure about the Idiapers) to build a new toolchain.
I am thinking that it should be possible to write a generic `Toolchain` class to handle the submission and execution of jobs. This issue is just a reminder for myself to think about a better solution than the current one.
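Just to make the idea concrete, a minimal sketch of what such a generic class could look like (the `Toolchain` name is taken from the comment above; everything else, `steps`, `submit`, `run`, is hypothetical and not existing Bob code):

```
from abc import ABC, abstractmethod

class Toolchain(ABC):
    """Hypothetical generic toolchain: an ordered list of named steps whose
    submission and execution is handled in one place, independent of the
    biometric modality."""

    def __init__(self, steps):
        # steps: list of (name, callable) pairs, e.g. [("preprocess", fn), ...]
        self.steps = steps

    @abstractmethod
    def submit(self, name, function, dependencies):
        """Submit one step (locally or to a grid) and return a job handle."""

    def run(self):
        handles = []
        for name, function in self.steps:
            handles.append(self.submit(name, function, dependencies=list(handles)))
        return handles

class LocalToolchain(Toolchain):
    """Trivial local execution, mainly to show the intended extension point."""

    def submit(self, name, function, dependencies):
        return function()
```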
I will come back to you when I have a more detailed plan. In the meantime, any comments are welcome.

(Assignee: Manuel Günther, siebenkopf@googlemail.com)

Issue #70: Create/Improve a high-level interface creation guide
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/70
Opened by Amir MOHAMMADI; last updated 2020-04-23
We have a guide here: https://gitlab.idiap.ch/biometric/software/wikis/database_creation_guide but it is outdated and full of grammar and spelling errors.
The docs here should be revised and, if necessary, integrated with that guide.
@heusch and @onikisins might be interested in doing this in the hackathon.

(Milestone: May 2017 Hackathon; assignee: Olegs NIKISINS)

Issue #101: A programmatic way to use bob.bio.base
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/101
Opened by Tiago de Freitas Pereira; last updated 2020-04-23
The verify.py script is very handy to trigger experiments.
Not long ago, @andre.anjos pushed a mechanism that allows us to provide configuration files as input, and this made the work of preparing and triggering experiments even simpler and cleaner.
Imagine that your work consists of more than one hundred experiments with different databases (splits), preprocessors, feature extractors, etc.
You can handle this complexity by splitting your configuration file into several parts and running something like this:
```
$ verify.py config1.py config2.py config3.py
```
Although this makes the work cleaner, it doesn't save you from calling verify.py several times.
A way to handle that is to have some sort of shell script that orchestrates these calls (or Python scripts that generate the verify.py command lines).
That doesn't look very clean.
The verify.py script is not just a command-line View for experiment triggering; it handles the argument parsing, the input validation and the experiment triggering itself.
Today, it's not possible to detach the argument parsing from the input validation and the experiment triggering, which makes it impossible to trigger experiments programmatically.
We need some View-Controller design for this task.
For instance, in the View, we could wrap the content of the argument parsing in a dictionary (or use docopt, which does this for free) and pass it to the Controller, which would handle the input validation and experiment triggering (it's just one way to handle this).
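A rough sketch of what that split could look like (the function names `parse_view` and `run_experiments` are hypothetical, not the actual verify.py API):

```
import argparse

def parse_view(argv=None):
    """View: only turn command-line arguments into a plain dictionary."""
    parser = argparse.ArgumentParser()
    parser.add_argument("configurations", nargs="+")
    parser.add_argument("--verbose", "-v", action="count", default=0)
    return vars(parser.parse_args(argv))

def run_experiments(params):
    """Controller: validate inputs and trigger the experiments.
    Callable both from the command line and from another Python program."""
    if not params["configurations"]:
        raise ValueError("at least one configuration file is required")
    for config in params["configurations"]:
        print("would run the experiment described by", config)

if __name__ == "__main__":
    run_experiments(parse_view())
```

Another script (or another package) could then build `params` itself and call `run_experiments` directly, without going through argument parsing at all.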
What do you think?

Thanks.
I'm willing to give it a shot in this direction.

Issue #60: Reporting the failed samples in the score files
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/60
Opened by Amir MOHAMMADI; last updated 2019-11-11
According to the [ISO/IEC 19795-1](https://www.iec-normen.de/dokumente/preview-pdf/info_isoiec19795-1%7Bed1.0%7Den.pdf), several performance measures exist:
```
4.6.1 failure-to-enrol rate FTE
proportion of the population for whom the system fails to complete the enrolment process
NOTE The observed failure-to-enrol rate is measured on test crew enrolments. The predicted/expected failure-to-enrol rate will apply to the entire target population.

4.6.2 failure-to-acquire rate FTA
proportion of verification or identification attempts for which the system fails to capture or locate an image or signal of sufficient quality
NOTE The observed failure-to-acquire rate is distinct from the predicted/expected failure-to-acquire rate (the former may be used to estimate the latter).

4.6.3 false non-match rate FNMR
proportion of genuine attempt samples falsely declared not to match the template of the same characteristic from the same user supplying the sample
NOTE The measured/observed false non-match rate is distinct from the predicted/expected false non-match rate (the former may be used to estimate the latter).

4.6.4 false match rate FMR
proportion of zero-effort impostor attempt samples falsely declared to match the compared non-self template
NOTE The measured/observed false match rate is distinct from the predicted/expected false match rate (the former may be used to estimate the latter).

4.6.5 false reject rate FRR
proportion of verification transactions with truthful claims of identity that are incorrectly denied

4.6.6 false accept rate FAR
proportion of verification transactions with wrongful claims of identity that are incorrectly confirmed
```
And this is how to calculate them:
```
FRR = FTA + FNMR * (1 - FTA)
FAR = FMR * (1 - FTA)
```
However: "*Comparison of systems having different failure-to-enrol rates may require use of generalized false reject (GFRR) and false accept rates (GFAR) which combine enrolment, sample acquisition and matching errors. The method of generalization should be appropriate to the evaluation.*"

Where one possible solution is: "*A typical generalization is to treat a failure-to-enrol as if the enrolment completed, but all subsequent verification or identification transactions by that enrolee, or against their template, fail. The method of generalization shall be reported.*"
which I think is good enough for us.
`bob.bio.base` handles failed samples with the `--allow-missing-files` option, but the problem is that you normally want to see traces of these failed samples in the score files too, so that you can calculate the **FTA**. (Edit: these samples now have NaN scores, so you can calculate the FTA.)
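As an illustration of why the NaN traces matter, here is a hedged sketch (plain NumPy, not an existing bob.bio.base function) of how the FTA and the generalized rates could be computed from score arrays that use NaN for failed samples:

```
import numpy

def generalized_rates(genuine_scores, impostor_scores, threshold):
    """Compute FTA and the generalized FRR/FAR from score arrays where failed
    acquisitions are reported as NaN (illustrative sketch only)."""
    genuine = numpy.asarray(genuine_scores, dtype=float)
    impostor = numpy.asarray(impostor_scores, dtype=float)

    all_scores = numpy.concatenate([genuine, impostor])
    fta = numpy.isnan(all_scores).mean()  # fraction of attempts that failed

    # Error rates on the attempts that did produce a score:
    fnmr = (genuine[~numpy.isnan(genuine)] < threshold).mean()
    fmr = (impostor[~numpy.isnan(impostor)] >= threshold).mean()

    # Generalization following the ISO/IEC 19795-1 formulas quoted above:
    frr = fta + fnmr * (1 - fta)
    far = fmr * (1 - fta)
    return fta, frr, far

# Example: one failed probe (NaN) among the genuine attempts
print(generalized_rates([0.9, 0.8, float("nan")], [0.1, 0.2, 0.7], threshold=0.5))
```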
I think the best way to do this would be to report `numpy.nan` in score files when something goes wrong.

Issue #112: Severe Issue `bob bio metric`
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/112
Opened by Tiago de Freitas Pereira; last updated 2018-07-10
Guys,
The majority (if not all) of the metrics implemented in `bob bio` rely on a pair of `dev` and `eval` sets.
Excluding the `EPC` curve, for which both the `dev` and `eval` sets are necessary to compute the curve, this is not a requirement for the rest of the metrics in biometrics (and doesn't make sense).
Furthermore, most of the databases we have don't have the `dev` and `eval` pair.
I guess only the ones for which we created the protocol ourselves have these sets.
How can I use `bob bio metric` if my dataset has only a dev set?
For instance, if I try to plot a ROC curve I get:
```
bob bio roc ./scores-dev -o my-roc.pdf
Error: Invalid value for "scores": The number of provided scores must be > 0 and a multiple of 2 because the following files are required:
- 1 development file(s)
- 1 evaluation file(s)
```
This is a severe issue.
For instance, I can't use these scripts for my work (only through a hack).
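In the meantime, a hedged sketch of the kind of single-set ROC I would like to be able to produce (plain NumPy/Matplotlib reading a standard 4-column score file; this is not the `bob bio roc` implementation):

```
import numpy
import matplotlib.pyplot as plt

def load_four_column(path):
    """Read a 4-column score file: claimed-id real-id probe-label score."""
    negatives, positives = [], []
    with open(path) as f:
        for line in f:
            claimed, real, _, score = line.split()
            (positives if claimed == real else negatives).append(float(score))
    return numpy.array(negatives), numpy.array(positives)

def roc(negatives, positives, n_points=100):
    thresholds = numpy.linspace(min(negatives.min(), positives.min()),
                                max(negatives.max(), positives.max()), n_points)
    far = numpy.array([(negatives >= t).mean() for t in thresholds])  # false accepts
    frr = numpy.array([(positives < t).mean() for t in thresholds])   # false rejects
    return far, frr

negatives, positives = load_four_column("./scores-dev")
far, frr = roc(negatives, positives)
plt.plot(far, 1 - frr)
plt.xlabel("FAR")
plt.ylabel("1 - FRR")
plt.savefig("my-roc.pdf")
```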
ping @theophile.gentilhomme and @amohammadi

Issue #14: The script verify.py needs an argument to pass environment variables to the job
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/14
Opened by André Anjos; last updated 2018-06-03
*Created by: tiagofrepereira2012*
gridtk jman has this feature.
It is possible to set environment variables for the grid submission using the `--environment` argument.
It would be useful to have the same feature in the `verify.py` script.
A patch is on the way.

Issue #13: score fusion script lacks documentation and testing
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/13
Opened by André Anjos; last updated 2018-06-03
*Created by: siebenkopf*
There is a script ``./bin/fusion_llr.py`` that is not documented anywhere, and it seems to be untested. Fix that!

Issue #8: Evaluation.py script has no EPC curve
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/8
Opened by André Anjos; last updated 2018-06-03
*Created by: tiagofrepereira2012*
I just noticed that the `./evaluate.py` script has no option to plot the EPC.
Is there any special reason for that?
I know that this script works fine if we provide only `--dev-files` as input, and that to plot the EPC we need both (dev and eval).
If this is the reason for not having it, I can just output a warning saying so (if `--epc` is set with no `--eval-files`).
Anyway, I can implement it, no problem.
Issue #19: Add the IDIAP SGE GPU queue
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/19
Opened by André Anjos; last updated 2018-06-03
*Created by: tiagofrepereira2012*
It is necessary to add specific queue configurations for the IDIAP SGE GPU queues.
I already have them in my fork, I will just sync.

Issue #39: It is time for a new release
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/39
Opened by Tiago de Freitas Pereira; last updated 2018-06-03
Hi people,
I'm just finishing publishing some packages from `bob`.
So, it is time for a new release of `bob.bio.*` (`bob.io.base`, `bob.bio.face`, `bob.bio.video`, `bob.bio.vein`, `bob.bio.spear`, `bob.db.bio_filelist` and `bob.bio.csu`).
I've seen some people working on master branches, and this is not healthy at all.
Can I do it tomorrow?? We need to end this cycle of development.
Cheers
@bob
(Milestone: Refactoring 2016 and gitlab migration; assignee: Tiago de Freitas Pereira)

Issue #37: Some database tests are missing
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/37
Opened by Tiago de Freitas Pereira; last updated 2018-06-03

It is necessary to move the `bob.bio.db` tests to `bob.bio.base`.

(Milestone: Refactoring 2016 and gitlab migration; assignee: Tiago de Freitas Pereira; 2016-09-16)

Issue #35: Option to invert the sign of the scores in ./bin/evaluate.py script
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/35
Opened by Tiago de Freitas Pereira; last updated 2018-06-03
This issue was raised in the bob-devel group.
The script evaluate.py expects that the genuine scores are higher than the impostor scores.
However, if you work with similarity measures where the expectation is to have genuine scores lower than the impostor scores, this script will not work properly.
It would be nice to have an option in this script that just multiplies the scores by -1 when activated.

(Assignee: Olegs NIKISINS)

Issue #50: Why is the BioDatabase.model_ids_with_protocol function required?
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/50
Opened by Manuel Günther (siebenkopf@googlemail.com); last updated 2018-06-03
I am trying to implement a new database interface for one of our local databases, which uses the `BioFileSet` interface. I have created a class derived from `BioDatabase`, and I am implementing the functions.
While implementing the database, I stumbled upon a *pure virtual* function that I don't understand: `model_ids_with_protocol`. Why do I have to implement this function, and why can't I simply implement the `model_ids` function?
If I see it correctly, the `model_ids_with_protocol` function is called from within `model_ids`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L378
passing the member variable `self.protocol` as the protocol.
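To illustrate the pattern I am questioning, here is a simplified, hypothetical rendering (not the actual `database.py` source):

```
class BioDatabaseSketch:
    """Simplified, hypothetical rendering of the pattern in question."""

    def __init__(self, protocol):
        self.protocol = protocol

    def model_ids(self, groups=None):
        # The base class forwards self.protocol explicitly ...
        return self.model_ids_with_protocol(groups=groups, protocol=self.protocol)

    def model_ids_with_protocol(self, groups=None, protocol=None):
        # ... even though a derived class could simply read self.protocol here.
        raise NotImplementedError("pure virtual: must be implemented by the derived class")
```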
This is IMHO completely useless. The `self.protocol` is available in the derived class as well, and it can be used there directly, without needing to pass it as a parameter. The exact same is true for the functions `objects`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L381, `object_sets`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L565 and `tmodel_ids_with_protocol`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L732
So, is there any reasoning behind having the `protocol` parameters in these functions, other than 'It has been that way in the bob.db.verification.database interface' (where it was required)?
@amohammadi

Issue #48: BioDatabase.model_ids should accept **kwargs
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/48
Opened by Amir MOHAMMADI; last updated 2018-06-03
for potential filtering of model_ids specific to a database.
Just like the way that BioDatabase.objects takes **kwargs.

Issue #83: The module `imp` is deprecated
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/83
Opened by André Anjos; last updated 2018-05-16; assignee: Tiago de Freitas Pereira

Since Python 3.3 came out, the module `imp` is being deprecated in favor of `importlib`. In Python 2 one can use an external package to mimic its behaviour (https://pypi.python.org/pypi/importlib2). This package is unfortunately not available in conda yet.
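A small sketch of the usual migration path, using only the standard-library `importlib` (the helper name `load_source` is just illustrative, mirroring the old `imp.load_source`):

```
import importlib.util

def load_source(name, path):
    """Load a Python module from an arbitrary file path, replacing the old
    imp.load_source(name, path) call."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Example: config = load_source("config", "/path/to/configuration.py")
```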
Issue #98: Scoring phase is too non-verbose even when `-vvv` is passed
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/98
Opened by André Anjos; last updated 2017-12-05

While executing `verify.py`, the preprocessing and extraction of each sample is carefully reported when in `-vvv` mode. The same does not happen when scoring. There is absolutely no output.
It would be good if the scoring bit output the number of models to be scored and, as it advances from one model to another, printed the identifier of the model that was just scored.
@mguenther: is that doable?

(Assignee: Manuel Günther, siebenkopf@googlemail.com)

Issue #80: Drop dependency on Latex
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/80
Opened by Amir MOHAMMADI; last updated 2017-09-26
Matplotlib provides its own mini TeX: http://matplotlib.org/users/mathtext.html
Can that be used instead of Latex? We get so many questions about this on the mailing list.
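For what it's worth, a minimal sketch of rendering math labels with matplotlib's built-in mathtext, with no LaTeX installation required (the label text is just an example):

```
import matplotlib
matplotlib.rcParams["text.usetex"] = False  # keep matplotlib's internal mathtext renderer
import matplotlib.pyplot as plt

plt.plot([0.0, 0.5, 1.0], [0.0, 0.7, 1.0])
plt.xlabel(r"False Acceptance Rate $\alpha$ (%)")  # rendered by mathtext, not LaTeX
plt.ylabel(r"$1 - \beta$")
plt.savefig("example.pdf")
```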
Issue #79: `Algorithm.read_probe` should not exist
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/79
Opened by Manuel Günther (siebenkopf@googlemail.com); last updated 2017-08-07
This is something that I had in mind when designing the `Algorithm`, but which I have postponed for far too long by now, and it regularly bites me:
In the `Algorithm`, the `read_probe` function should not exist. Instead, the corresponding `Extractor.read_feature` or `Algorithm.read_feature` function should be called. For model enrollment, this is already done: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/tools/algorithm.py#L247
For probes, I was too lazy to implement this correctly, and added an `Algorithm.read_probe` function instead. In most cases, this works as expected. But when data structures are more complex, the algorithm needs to know how to read the features. For example, in `bob.bio.face.algorithm.GaborJet`, the `read_probe` function: https://gitlab.idiap.ch/bob/bob.bio.face/blob/master/bob/bio/face/algorithm/GaborJet.py#L204
is an exact copy of the `bob.bio.face.extractor.GridGraph.read_feature` function: https://gitlab.idiap.ch/bob/bob.bio.face/blob/master/bob/bio/face/extractor/GridGraph.py#L221
The reason why I was too lazy to implement it correctly is that, during scoring, the probes are read in the innermost function: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/tools/scoring.py#L16 and I didn't want to pass the `reader` through all the function calls. Stupid me. Since so many more packages have now been written which use the bad implementation, fixing this is a major issue (a rough sketch of the intended reader-passing pattern is at the end of this issue):
* [x] Fix the scoring script to use the correct `reader`
* [x] Remove the `read_probe` from the `Algorithm` base class and all derived classes in all packages
* [x] Fix the `verify.py` script and all derived scripts to pass the `extractor` to the scoring function
* [ ] Check all the test cases and ensure that `Algorithm.read_probe` is replaced with the corresponding call to `Extractor.read_feature` or `Algorithm.read_feature`

(Assignee: Manuel Günther, siebenkopf@googlemail.com)
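A rough sketch of the reader-passing pattern described above (simplified, hypothetical function names; this is not the actual `bob.bio.base.tools.scoring` code):

```
def score_probes(algorithm, reader, model, probe_files):
    """Score a model against probe files, reading each probe with the object
    that knows its on-disk format (the extractor or the algorithm itself),
    instead of a separate Algorithm.read_probe method."""
    scores = []
    for probe_file in probe_files:
        probe = reader.read_feature(probe_file)  # Extractor.read_feature or Algorithm.read_feature
        scores.append(algorithm.score(model, probe))
    return scores

# The caller decides which reader applies, e.g. (hypothetical):
# reader = algorithm if algorithm.performs_projection else extractor
# scores = score_probes(algorithm, reader, model, probe_files)
```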