bob.bio.base issues
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues

Issue #73: Incorporate a general overview of biometric verification and illustrate biometric verification experiment flow in bob.bio.base doc
Author: Vedrana KRIVOKUCA (updated 2017-08-07)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/73

A general overview of the structure of a typical biometric verification system should be added as introductory material inside the bob.bio.base documentation on running biometric recognition experiments.
Also, there should be some diagrams (+ accompanying explanations) to facilitate better understanding of the biometric verification experiment flow in bob.bio.base.
After much deliberation, I also believe we should be using the word "verification" instead of the more general word "recognition", simply because our main script is called verify.py (not recognise.py or identify.py).

Milestone: May 2017 Hackathon. Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #70: Create/Improve a high-level interface creation guide
Author: Amir MOHAMMADI (updated 2020-04-23)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/70

We have a guide here: https://gitlab.idiap.ch/biometric/software/wikis/database_creation_guide but it is outdated and full of grammar and spelling errors.
The docs here should be revised and, if necessary, integrated with that guide.
@heusch and @onikisins might be interested in doing this in the hackathon.

Milestone: May 2017 Hackathon. Assignee: Olegs NIKISINS

Issue #106: Using Bob as a library: Don't force HDF5 serialization
Author: Jaden DIEFENBAUGH (updated 2020-10-16)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/106

There are many, many places in `bob.bio.base` & the associated ecosystem where it is assumed the user wants to serialize information to an HDF5 file (for example, [`bob.bio.base.PCA`'s `train_projector()`](https://www.idiap.ch/software/bob/docs/bob/bob.bio.base/master/_modules/bob/bio/base/algorithm/PCA.html#PCA.train_projector) always writes to an HDF5 file). This is an issue when using Bob tools in different use-cases & environments, as there's no guarantee that a user wants to write to an HDF5 file. Sometimes the user _can't_ write to files, such as in BEAT, which is the specific use-case that concerns me.
(Disk) serialization should at least be opt-in, and the data that was previously saved to disk by default should be returned by the function instead. For the above PCA example, this would change `train_projector()` to return the variances by default, and optionally write them to disk. Changes like this are the bare minimum needed to use these Bob tools in BEAT.
Honestly, though, serialization endpoints (disk, network, whatever) in general should be separated from individual Bob tools. A preprocessor/extractor/algorithm/whatever should have a method for general serialization as well as a method for rehydrating the instance using this data (this is already present in many places, but is just hard-coded to write to an HDF5 file). Some `bob.serialization` package could handle writing this data to disks/caches/networks/whatever.
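To make the idea concrete, here is a minimal sketch of what an opt-in variant could look like. All names are hypothetical (`train_projector` merely mirrors the method discussed above; this is not the bob.bio.base API), and the PCA is a toy numpy version:

```python
# Hypothetical sketch, NOT the bob.bio.base API: the trainer returns its state
# by default, and disk/network serialization is an opt-in, pluggable step.
import numpy as np

def train_projector(training_features, writer=None):
    """Train a toy PCA projector and *return* its state.

    If `writer` is given (any callable taking the state dict), it is used to
    serialize the state -- to HDF5, a network socket, a BEAT output, etc.
    """
    X = np.asarray(training_features, dtype=float)
    mean = X.mean(axis=0)
    # principal directions via SVD of the centered data
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = (s ** 2) / max(len(X) - 1, 1)
    state = {"mean": mean, "components": vt, "variances": variances}
    if writer is not None:  # serialization is opt-in
        writer(state)
    return state

# Usage: no file is ever touched unless the caller asks for it.
state = train_projector(np.random.RandomState(0).randn(20, 5))
```

A `bob.serialization`-style package, as suggested below, would then just be a collection of such `writer` callables (HDF5, cache, network), kept entirely separate from the tools themselves.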
What does everyone think?

Milestone: Bob 9.0.0

Issue #112: Severe Issue `bob bio metric`
Author: Tiago de Freitas Pereira (updated 2018-07-10)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/112

Guys,
The majority (if not all) of the metrics implemented in `bob bio` rely on `dev` and `eval` sets.
Excluding the `EPC` curve, for which the `dev` and `eval` sets are necessary for the computation, this is not a requirement for the rest of the metrics in biometrics (and doesn't make sense).
Furthermore, most of the databases we have don't have the `dev`/`eval` pair.
I guess only the ones for which we created the protocol ourselves have these sets.
How can I use `bob bio metric` if my dataset has only a dev set?
For instance, if I try to plot a ROC curve I get:
```
bob bio roc ./scores-dev -o my-roc.pdf
Error: Invalid value for "scores": The number of provided scores must be > 0 and a multiple of 2 because the following files are required:
- 1 development file(s)
- 1 evaluation file(s)
```
This is a severe issue.
For instance, I can't use these scripts for my work (only through a hack).
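For what it's worth, a ROC genuinely needs only one score set. A plain-numpy sketch (a hypothetical helper, not part of the `bob bio` CLI) that sweeps thresholds over a single dev set:

```python
# Illustrative sketch, not bob.bio.base code: ROC points from ONE score set.
import numpy as np

def roc_points(negatives, positives, n_points=100):
    """Return (FAR, FRR) pairs swept over thresholds of a single score set."""
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    thresholds = np.linspace(min(negatives.min(), positives.min()),
                             max(negatives.max(), positives.max()), n_points)
    # FAR: impostor scores at/above the threshold; FRR: genuine scores below it
    far = np.array([(negatives >= t).mean() for t in thresholds])
    frr = np.array([(positives < t).mean() for t in thresholds])
    return far, frr
```

This is exactly the computation that only requires `./scores-dev`, which is why the mandatory dev/eval pairing in the CLI feels artificial.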
ping @theophile.gentilhomme and @amohammadi

Issue #101: A programmatic way to use bob.bio.base
Author: Tiago de Freitas Pereira (updated 2020-04-23)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/101

The verify.py script is very handy to trigger experiments.
Some time ago, @andre.anjos pushed a mechanism that allows us to provide configuration files as input, and this made the work of preparing and triggering experiments even simpler and cleaner.
Imagine that your work consists of more than one hundred experiments with different databases (splits), preprocessors, feature extractors, etc.
You can handle this complexity by splitting your configuration file into several parts and have something like this:
```
$ verify.py config1.py config2.py config3.py
```
Although this makes the work cleaner, it doesn't avoid calling verify.py several times.
A way to handle that is to have some sort of shell script that orchestrates these calls (or a python script that generates the verify.py command lines).
That doesn't look very clean.
The script verify.py is not just a command-line View for the experiment triggering; it handles the argument parsing, the input validation and the experiment triggering.
Today, it's not possible to detach the argument parsing from the input validation and the experiment triggering, which makes it impossible to programmatically trigger experiments.
We need some View-Controller design for this task.
For instance, in the View, we could wrap the content of the argument parsing in a dictionary (or use docopt, which does this for free) and pass it to the Controller that will handle the input validation and experiment triggering (it's just one way to handle this).
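As a sketch of that separation (all names hypothetical; none of this is the current bob.bio.base code): the View only parses argv into a plain dict, and the Controller validates and triggers, so experiments can also be launched directly from Python:

```python
# Hypothetical View/Controller split, not the actual verify.py implementation.
import argparse

def parse_arguments(argv):
    """View: turn command-line arguments into a plain dict, nothing else."""
    parser = argparse.ArgumentParser()
    parser.add_argument("configs", nargs="+")
    parser.add_argument("--result-directory", default="results")
    return vars(parser.parse_args(argv))

def run_experiment(options):
    """Controller: validate the options dict and trigger the experiment."""
    if not options.get("configs"):
        raise ValueError("at least one configuration is required")
    # ... load configurations here, then run the toolchain ...
    return "running %d configs into %s" % (
        len(options["configs"]), options["result_directory"])

# Programmatic use -- no argv round-trip needed:
print(run_experiment({"configs": ["config1.py"], "result_directory": "out"}))
# CLI use would simply be: run_experiment(parse_arguments(sys.argv[1:]))
```

The key point is that `run_experiment` never touches `sys.argv`, so a script driving a hundred experiments can call it in a loop with generated dictionaries.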
What do you think?
thanks
I'm willing to give it a shot in this direction.

Issue #99: Splitting the data one by one instead of chunk by chunk
Author: Amir MOHAMMADI (updated 2020-04-23)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/99

Currently bob.bio.base (when submitting a job to gridtk) splits the data like this:
```
mylist = range(10)
parallel_jobs = 2
list1 == range(5)
list2 == range(5,10)
```
This is really cumbersome when you have an unbalanced database.
For example, right now I have a video database where the beginning samples have only 1 frame and quickly finish processing, but the rest of the data have 20 frames in each sample, which takes 20 times longer to process.
I was wondering if it is possible to split the data like the following when running in parallel:
```
mylist = range(10)
parallel_jobs = 2
list1 == range(0,10,2)
list2 == range(1,10,2)
```

Issue #98: Scoring phase is too non-verbose even when `-vvv` is passed
Author: André Anjos (updated 2017-12-05)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/98

While executing `verify.py`, the preprocessing and extraction of each sample is carefully reported when in `-vvv` mode. The same does not happen when scoring. There is absolutely no output.
It would be good if the scoring bit output the...While executing `verify.py`, the preprocessing and extraction of each sample is carefully reported when in `-vvv` mode. The same does not happen when scoring. There is absolutely no output.
It would be good if the scoring step reported the number of models to be scored and, as it advances from one model to the next, printed the identifier of the model that was just scored.
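As a sketch of the requested behaviour (hypothetical function and variable names, not the actual bob.bio.base scoring code), the scoring loop would only need two logging calls:

```python
# Illustrative only: what verbose scoring could look like.
import logging

logger = logging.getLogger("bob.bio.base")

def score_models(model_ids, score_one):
    """Score each model, reporting progress through the package logger."""
    logger.info("Scoring %d models", len(model_ids))
    results = {}
    for i, model_id in enumerate(model_ids, 1):
        # one line per model, visible at -vvv (DEBUG) level
        logger.debug("Scoring model %s (%d/%d)", model_id, i, len(model_ids))
        results[model_id] = score_one(model_id)
    return results
```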
@mguenther: is that doable?

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #93: The current implementation of the toolchain is not flexible
Author: Manuel Günther (updated 2020-04-23)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/93

I woke up this morning at 5, and I couldn't sleep anymore. Instead, my mind led me to the conclusion that the current implementation of the toolchain is anything but flexible. Other biometrics, for example fingerprints, might require a different set of tools. Modifying the toolchain in `bob.bio.gmm` was a complete hack, and it will not be possible for other researchers (I am not even sure for the Idiapers) to build a new toolchain.
I am thinking that it should be possible to write a generic `Toolchain` class to handle the submission and execution of jobs. This issue is just a reminder for myself to think about a better solution than the current one.
I will come back to you when I have a more detailed plan. In the meantime, any comments are welcome.

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #85: Only 4-column score files can be written
Author: Manuel Günther (updated 2017-07-18)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/85

So far, only 4-column score files can be written. The problem is that sometimes you might need to have the `model_id` for the gallery in your score files. These are currently ignored.

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #83: The module `imp` is deprecated
Author: André Anjos (updated 2018-05-16)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/83

Since Python 3.3 came out, the module `imp` is being deprecated in favor of `importlib`. In Python 2 one can use an external package to mimic its behaviour (https://pypi.python.org/pypi/importlib2). This package is unfortunately not available in conda yet.

Assignee: Tiago de Freitas Pereira

Issue #80: Drop dependency on Latex
Author: Amir MOHAMMADI (updated 2017-09-26)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/80

Matplotlib provides its own mini TeX: http://matplotlib.org/users/mathtext.html
Can that be used instead of Latex? We get so many questions about this on the mailing list.

Issue #79: `Algorithm.read_probe` should not exist
Author: Manuel Günther (updated 2017-08-07)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/79

This is something that I had in mind when designing the `Algorithm`, but which I have postponed for far too long by now, and it regularly bites me:
In the `Algorithm`, the `read_probe` function should not exist. Instead, the corresponding `Extractor.read_feature` or `Algorithm.read_feature` function should be called. For model enrollment, this is already done: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/tools/algorithm.py#L247
For probes, I was too lazy to implement this correctly, and added an `Algorithm.read_probe` function instead. In most cases, this works as expected. But when data structures are more complex, the algorithm needs to know how to read the features. For example, in `bob.bio.face.algorithm.GaborJet`, the `read_probe` function: https://gitlab.idiap.ch/bob/bob.bio.face/blob/master/bob/bio/face/algorithm/GaborJet.py#L204
is an exact copy of the `bob.bio.face.extractor.GridGraph.read_feature` function: https://gitlab.idiap.ch/bob/bob.bio.face/blob/master/bob/bio/face/extractor/GridGraph.py#L221
The reason why I was too lazy to implement it correctly is that during scoring, the probes are read in the innermost function: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/tools/scoring.py#L16 and I didn't want to pass the `reader` through all the function calls. Stupid me. Since so many more packages that use the bad implementation have been written by now, fixing this is a major issue:
* [x] Fix the scoring script to use the correct `reader`
* [x] Remove the `read_probe` from the `Algorithm` base class and all derived classes in all packages
* [x] Fix the `verify.py` script and all derived scripts to pass the `extractor` to the scoring function
* [ ] Check all the test cases and assure that `Algorithm.read_probe` is replaced with the according call to `Extractor.read_feature` or `Algorithm.read_feature`

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #77: For a given biometric sample, only one feature can be extracted, and only one identity can be assigned
Author: Manuel Günther (updated 2017-08-07)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/77

I will describe the issue in terms of face recognition, but similarities can be drawn to other biometrics, for example speaker recognition.
Currently, we assume that there is only one identity in a given image. When applying face detection, we only take the face with the highest detection score. In real open-set datasets, such as the one I have used in my latest open-set recognition challenge (see: http://vast.uccs.edu/Opensetface), we have several faces per image, and we need to **detect** all faces (which might end up in an unknown number of misdetections). Finally, we need to extract a feature vector **for each detected bounding box**, and compare each feature vector to the gallery models.
So far, none of the concepts we have implemented deals with this kind of problem. I am also not so sure if we should implement a solution directly in `bob.bio.base`, or if I should try to implement a solution in a stand-alone package.
@amohammadi @andre.anjos @tiago.pereira What do you think?

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #75: The documentation contains links to files, which only work when the documentation is compiled locally
Author: Manuel Günther (updated 2017-08-07)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/75

As mentioned in https://gitlab.idiap.ch/bob/bob.bio.base/merge_requests/73#note_16158, there are file links in the documentation, e.g., in the installation section: http://pythonhosted.org/bob.bio.base/installation.html#test-your-installation which link to local files (sorry, GitLab does not allow me to get a link to the line in the `.rst` file: gitlab.idiap.ch/bob/bob.bio.base/blob/master/doc/installation.rst).
The note above can simply be removed from the `.rst`. It should be OK not to have too many options to change the `ATNT_DIRECTORY`. People who have installed Bob system-wide will not be able to change the linked file anyway.
There might be more cases with links to local files, though at a quick glance I could not identify one (at least not here in `bob.bio.base`). If we find such file links, please feel free to re-open this bug (after we have closed it) and replace those local file links with links to the corresponding file in the GitLab source.

Issue #68: ROC and CMC curves do not follow the standard of the Handbook of Face Recognition
Author: Manuel Günther (updated 2017-08-07)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/68

In the above-mentioned book, the axes are given between `min_far` and 1, while we plot in percentages.
Also, there is no way to provide a lowest FAR value other than `0.01%` (or `1e-4` without percentages) for plotting ROC curves.
For the DET and EPC curves, we also use percentages. However, these plots are not given in the Handbook, so we can use them freely. I would suggest removing the percentages there, too, just to be consistent.

Assignee: Manuel Günther <siebenkopf@googlemail.com>

Issue #60: Reporting the failed samples in the score files
Author: Amir MOHAMMADI (updated 2019-11-11)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/60

According to the [ISO/IEC 19795-1](https://www.iec-normen.de/dokumente/preview-pdf/info_isoiec19795-1%7Bed1.0%7Den.pdf), several performance measures exist:
```
4.6.1 failure-to-enrol rate FTE
proportion of the population for whom the system fails to complete the enrolment process
NOTE The observed failure-to-enrol rate is measured on test crew enrolments. The predicted/expected failure-to-enrol rate will apply to the entire target population.

4.6.2 failure-to-acquire rate FTA
proportion of verification or identification attempts for which the system fails to capture or locate an image or signal of sufficient quality
NOTE The observed failure-to-acquire rate is distinct from the predicted/expected failure-to-acquire rate (the former may be used to estimate the latter).

4.6.3 false non-match rate FNMR
proportion of genuine attempt samples falsely declared not to match the template of the same characteristic from the same user supplying the sample
NOTE The measured/observed false non-match rate is distinct from the predicted/expected false non-match rate (the former may be used to estimate the latter).

4.6.4 false match rate FMR
proportion of zero-effort impostor attempt samples falsely declared to match the compared non-self template
NOTE The measured/observed false match rate is distinct from the predicted/expected false match rate (the former may be used to estimate the latter).

4.6.5 false reject rate FRR
proportion of verification transactions with truthful claims of identity that are incorrectly denied

4.6.6 false accept rate FAR
proportion of verification transactions with wrongful claims of identity that are incorrectly confirmed
```
And this is how to calculate them:
```
FRR = FTA + FNMR * (1 – FTA)
FAR = FMR * (1 – FTA)
```
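As a worked example of these two formulas, here is a small numpy sketch (illustrative only, not bob.bio.base code) that derives the FTA from NaN scores and then applies the ISO/IEC 19795-1 generalization:

```python
# Illustrative sketch: treat NaN scores as failures-to-acquire and fold the
# FTA into FRR/FAR via FRR = FTA + FNMR*(1-FTA) and FAR = FMR*(1-FTA).
import numpy as np

def generalized_rates(genuine, impostor, threshold):
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # FTA: fraction of all attempts whose score could not be computed (NaN)
    all_scores = np.concatenate([genuine, impostor])
    fta = np.isnan(all_scores).mean()
    # FNMR/FMR are computed over the successfully acquired samples only
    g = genuine[~np.isnan(genuine)]
    i = impostor[~np.isnan(impostor)]
    fnmr = (g < threshold).mean()
    fmr = (i >= threshold).mean()
    # ISO/IEC 19795-1 generalization
    frr = fta + fnmr * (1 - fta)
    far = fmr * (1 - fta)
    return fta, frr, far
```

This is exactly why having NaN traces of failed samples in the score files matters: without them, the FTA term in both formulas is unrecoverable.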
However: "*Comparison of systems having different failure-to-enrol rates may require use of generalized false reject (GFRR) and false accept rates (GFAR) which combine enrolment, sample acquisition and matching errors. The method of generalization should be appropriate to the evaluation.*"
Where one possible solution is: "*A typical generalization is to treat a failure-to-enrol as if the enrolment completed, but all subsequent verification or identification transactions by that enrolee, or against their template, fail. The method of generalization shall be reported.*"
which I think is good enough for us.
`bob.bio.base` handles failed samples with the `--allow-missing-files` option, but the problem is that normally you want to see traces of these failed samples in the score files too, so that you can calculate the **FTA**. (Edit: these samples now have NaN scores, so you can calculate the FTA.)
I think the best way to do this would be to report `numpy.nan` in score files when something goes wrong.

Issue #50: Why is the BioDatabase.model_ids_with_protocol function required?
Author: Manuel Günther (updated 2018-06-03)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/50

I am trying to implement a new database interface for one of our local databases, which uses the `BioFileSet` interface. I have created a class derived from `BioDatabase`, and I am implementing the functions.
While implementing the database, I stumbled upon a *pure virtual* function that I don't understand: `model_ids_with_protocol`. Why do I have to implement this function, and why can't I just simply implement the `model_ids` function?
If I see it correctly, the `model_ids_with_protocol` function is called from within `model_ids`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L378
passing the member variable `self.protocol` as the protocol.
This is IMHO completely useless. The `self.protocol` is available in the derived class as well, and it can be used there directly, without needing to pass it as a parameter. The exact same is true for the functions `objects`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L381, `object_sets`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L565 and `tmodel_ids_with_protocol`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L732
So, is there any reasoning behind having the `protocol` parameters in these functions, other than 'It has been that way in the bob.db.verification.database interface' (where it was required)?
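To illustrate the redundancy with a toy example (hypothetical minimal classes, not the real bob.bio.base interface): the base class forwards `self.protocol` as a parameter, although the derived class could read the member directly.

```python
# Toy sketch of the pattern described above, not the actual BioDatabase code.
class BioDatabase:
    def __init__(self, protocol):
        self.protocol = protocol

    def model_ids(self, group="dev"):
        # the base class forwards the member variable as a parameter ...
        return self.model_ids_with_protocol(group=group, protocol=self.protocol)

class MyDatabase(BioDatabase):
    def model_ids_with_protocol(self, group=None, protocol=None):
        # ... although this class could just as well read self.protocol itself
        return ["%s_%s_model1" % (protocol, group)]

db = MyDatabase(protocol="default")
print(db.model_ids("dev"))  # ['default_dev_model1']
```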
@amohammadi

Issue #48: BioDatabase.model_ids should accept **kwargs
Author: Amir MOHAMMADI (updated 2018-06-03)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/48

for potential filtering of model_ids specific to a database.
Just like the way that BioDatabase.objects takes **kwargs.

Issue #39: It is time for a new release
Author: Tiago de Freitas Pereira (updated 2018-06-03)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/39

Hi people,
I'm just finishing publishing some packages from `bob`.
So, it is time for a new release of `bob.bio.*` (`bob.io.base`, `bob.bio.face`, `bob.bio.video`, `bob.bio.vein`, `bob.bio.spear`, `bob.db.bio_filelist` and `bob.bio.csu`).
I've seen some people working on master branches, and this is not healthy at all.
Can I do it tomorrow?? We need to end this cycle of development.
Cheers
@bob
Milestone: Refactoring 2016 and gitlab migration. Assignee: Tiago de Freitas Pereira

Issue #37: Some database tests are missing
Author: Tiago de Freitas Pereira (updated 2018-06-03)
https://gitlab.idiap.ch/bob/bob.bio.base/-/issues/37

It is necessary to move the `bob.bio.db` tests to `bob.bio.base`.

Milestone: Refactoring 2016 and gitlab migration. Assignee: Tiago de Freitas Pereira
2016-09-16