Commit 347bcbf2 authored by Theophile GENTILHOMME

Fix doc and user guide (mostly removed things)

parent 3fe154b9
Pipeline #17701 failed with stage
in 21 minutes and 29 seconds
@@ -451,13 +451,9 @@ def det_axis(v, **kwargs):
def cmc(cmc_scores, logx=True, **kwargs):
"""Plots the (cumulative) match characteristics and returns the maximum rank.
This function plots a CMC curve using the given CMC scores, which can be read
from our score files using the
:py:func:`bob.measure.load.cmc_four_column` or
:py:func:`bob.measure.load.cmc_five_column` methods. The structure of the
``cmc_scores`` parameter is relatively complex. It contains a list of pairs
of lists. For each probe object, a pair of lists of negative and positive scores
is required.
This function plots a CMC curve using the given CMC scores (:py:class:`list`:
A list of tuples, where each tuple contains the
``negative`` and ``positive`` scores for one probe of the database).
@@ -101,11 +101,6 @@ defined in the first equation.
by the user and normally sits in files that may need some parsing before
these vectors can be extracted.
While it is not possible to provide a parser for every individual file that
may be generated in different experimental frameworks, we do provide a few
parsers for formats we use the most. Please refer to the documentation of
:py:mod:`bob.measure.load` for a list of formats and details.
In the remainder of this section we assume you have successfully parsed and
loaded your scores in two 1D float64 vectors and are ready to evaluate the
performance of the classifier.
@@ -387,11 +382,6 @@ which defines a pair of positive and negative scores **per probe**:
Usually, there is only a single positive score per probe, but this is not a fixed restriction.
.. note::
The complex data structure can be read from our default 4 or 5 column score
files using the :py:func:`bob.measure.load.cmc` function.
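As an illustration of the per-probe structure described above, the following is a minimal sketch that builds such a list of ``(negatives, positives)`` pairs by hand and derives a recognition rate from it. It uses plain NumPy only; the helper name ``recognition_rate_at_rank`` and the toy scores are assumptions made for this example and are not part of |project|.

```python
import numpy as np


def recognition_rate_at_rank(cmc_scores, rank=1):
    """Fraction of probes whose best positive score lies within the top ``rank``.

    ``cmc_scores`` is a list of (negatives, positives) pairs, one pair per probe.
    Hypothetical helper written for illustration only.
    """
    hits = 0
    for negatives, positives in cmc_scores:
        best_positive = np.max(positives)
        # The probe's rank is 1 plus the number of negatives scoring at least
        # as high as its best positive score.
        probe_rank = 1 + int(np.sum(np.asarray(negatives) >= best_positive))
        if probe_rank <= rank:
            hits += 1
    return hits / len(cmc_scores)


# Toy data: three probes, one (negatives, positives) pair each.
cmc_scores = [
    (np.array([0.1, 0.3, 0.2]), np.array([0.9])),       # positive is ranked 1st
    (np.array([0.8, 0.4, 0.1]), np.array([0.5])),       # positive is ranked 2nd
    (np.array([0.7, 0.6, 0.2]), np.array([0.3, 0.1])),  # best positive is ranked 3rd
]

for r in (1, 2, 3):
    print(r, recognition_rate_at_rank(cmc_scores, rank=r))
```

The third probe carries two positive scores, which shows that a single positive per probe is usual but not required.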
Detection & Identification Curve
@@ -440,178 +430,6 @@ look at the implementations at :py:mod:`bob.measure.plot` to understand how to
use the |project| methods to compute the curves and interlace that in the way
that best suits you.
Full applications
We do provide a few scripts that can be used to quickly evaluate a set of
scores. We present these scripts in this section. The scripts take as input
either a 4-column or 5-column data format as specified in the documentation of
:py:func:`bob.measure.load.four_column` or
To calculate the threshold using a certain criterion (EER, min.HTER or weighted
Error Rate) on a set, after setting up |project|, just do:
.. code-block:: sh
$ development-scores-4col.txt
Threshold: -0.004787956164
FAR : 6.731% (35/520)
FRR : 6.667% (26/390)
HTER: 6.699%
The output will present the threshold together with the FAR, FRR and HTER on
the given set, calculated using such a threshold. The relative counts of FAs
and FRs are also displayed in parentheses.
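The relationship between these figures can be sketched in a few lines of NumPy. This is not the script's implementation: the function name ``error_rates`` is hypothetical, and the convention that a score is accepted when it is greater than or equal to the threshold is an assumption of this example. The synthetic scores are constructed only to reproduce the counts shown above (35 of 520 negatives accepted, 26 of 390 positives rejected).

```python
import numpy as np


def error_rates(negatives, positives, threshold):
    """Return (FAR, FRR, HTER, false accepts, false rejects).

    Assumes a score is accepted when it is >= threshold (illustrative convention).
    """
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    false_accepts = int(np.sum(negatives >= threshold))
    false_rejects = int(np.sum(positives < threshold))
    far = false_accepts / negatives.size
    frr = false_rejects / positives.size
    hter = (far + frr) / 2  # half total error rate: mean of FAR and FRR
    return far, frr, hter, false_accepts, false_rejects


# Synthetic scores built to match the counts in the example output.
negatives = np.concatenate([np.full(485, -1.0), np.full(35, 1.0)])
positives = np.concatenate([np.full(364, 1.0), np.full(26, -1.0)])

far, frr, hter, fa, fr = error_rates(negatives, positives, threshold=0.0)
print(f"FAR : {100 * far:.3f}% ({fa}/{negatives.size})")
print(f"FRR : {100 * frr:.3f}% ({fr}/{positives.size})")
print(f"HTER: {100 * hter:.3f}%")
```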
To evaluate the performance of a new score file with a given threshold, use the
application ````:
.. code-block:: sh
$ -0.0047879 test-scores-4col.txt
FAR : 2.115% (11/520)
FRR : 7.179% (28/390)
HTER: 4.647%
In this case, only the error figures are presented. You can conduct the
evaluation and plotting of development (and test set data) using our combined
```` script. You pass both sets and it does the rest:
.. code-block:: sh
$ development-scores-4col.txt test-scores-4col.txt
[Min. criterion: EER] Threshold on Development set: -4.787956e-03
| Development | Test
FAR | 6.731% (35/520) | 2.500% (13/520)
FRR | 6.667% (26/390) | 6.154% (24/390)
HTER | 6.699% | 4.327%
[Min. criterion: Min. HTER] Threshold on Development set: 3.411070e-03
| Development | Test
FAR | 4.231% (22/520) | 1.923% (10/520)
FRR | 7.949% (31/390) | 7.692% (30/390)
HTER | 6.090% | 4.808%
[Plots] Performance curves => 'curves.pdf'
Inside that script we evaluate 2 different thresholds based on the EER and the
minimum HTER on the development set and apply the output to the test set. As
can be seen from the toy example above, the system generalizes reasonably well.
A single PDF file is generated containing an EPC as well as ROC and DET plots
of such a system.
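The idea of choosing a threshold on the development set and applying it unchanged to the test set can be sketched as follows. This is a brute-force illustration of the minimum-HTER criterion under stated assumptions, not the script's code: the helper names, the acceptance convention (score >= threshold) and the randomly generated scores are all made up for this example.

```python
import numpy as np


def hter(negatives, positives, threshold):
    """Half total error rate, assuming scores >= threshold are accepted."""
    far = np.mean(np.asarray(negatives) >= threshold)
    frr = np.mean(np.asarray(positives) < threshold)
    return (far + frr) / 2


def min_hter_threshold(negatives, positives):
    """Try every observed score as a threshold and keep the one minimizing HTER."""
    candidates = np.unique(np.concatenate([negatives, positives]))
    errors = np.array([hter(negatives, positives, t) for t in candidates])
    return float(candidates[np.argmin(errors)])


# Hypothetical development and test scores drawn from two Gaussians.
rng = np.random.default_rng(0)
dev_neg, dev_pos = rng.normal(-1.0, 1.0, 520), rng.normal(1.0, 1.0, 390)
test_neg, test_pos = rng.normal(-1.0, 1.0, 520), rng.normal(1.0, 1.0, 390)

# The threshold is fixed on the development set only ...
threshold = min_hter_threshold(dev_neg, dev_pos)
# ... and then applied as-is to the test set.
print("threshold:", threshold)
print("dev HTER :", hter(dev_neg, dev_pos, threshold))
print("test HTER:", hter(test_neg, test_pos, threshold))
```

Keeping the test scores out of the threshold selection is what makes the test-set HTER an honest estimate of generalization.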
Use the ``--help`` option on the above-cited scripts to find out about more
Score file conversion
Sometimes, it is required to export the score files generated by Bob to a
different format, e.g., to be able to generate a plot comparing Bob's systems
with other systems. In this package, we provide source code to convert between
different types of score files.
Bob to OpenBR
One of the supported formats is the matrix format that the National Institute
of Standards and Technology (NIST) uses, and which is supported by OpenBR_.
The scores are stored in two binary matrices, where the first matrix (usually
with a ``.mtx`` filename extension) contains the raw scores, while a second
mask matrix (extension ``.mask``) indicates which scores are positives and
which are negatives.
To convert from Bob's four column or five column score file to a pair of these
matrices, you can use the :py:func:`bob.measure.openbr.write_matrix` function.
In its simplest form, this function takes a score file
``'five-column-score-file'`` and writes the pair ``'openbr.mtx', 'openbr.mask'``
of OpenBR compatible files:
.. code-block:: py
>>> bob.measure.openbr.write_matrix('five-column-score-file', 'openbr.mtx', 'openbr.mask', score_file_format='5column')
In this way, the score file will be parsed and the matrices will be written in
the same order that is obtained from the score file.
For most applications, this should be sufficient, but as the identity
information is lost in the matrix files, no deeper analysis is possible anymore
when just using the matrices. To enforce an order of the models and probes
inside the matrices, you can use the ``model_names`` and ``probe_names``
parameters of :py:func:`bob.measure.openbr.write_matrix`:
* The ``probe_names`` parameter lists the ``path`` elements stored in the score
files, which are the fourth column in a ``5column`` file, and the third
column in a ``4column`` file, see :py:func:`bob.measure.load.five_column` and
* The ``model_names`` parameter is a bit more complicated. In a ``5column``
format score file, the model names are defined by the second column of that
file, see :py:func:`bob.measure.load.five_column`. In a ``4column`` format
score file, the model information is not included, only the client
information of the model. Hence, for the ``4column`` format, the
``model_names`` actually lists the client ids found in the first column, see
.. warning::
The model information is lost, but required to write the matrix files. In
the ``4column`` format, we use client ids instead of the model
information. Hence, when several models exist per client, this function
will not work as expected.
Additionally, the matrix files contain fields that define the gallery
and probe list files that were used to generate the matrix. These file names
can be selected with the ``gallery_file_name`` and ``probe_file_name`` keyword
parameters of :py:func:`bob.measure.openbr.write_matrix`.
Finally, OpenBR defines a specific ``'search'`` score file format, which is
designed to be used to compute CMC curves. The score matrix contains a list of
scores sorted in descending order and possibly truncated, i.e., for each
probe, a sorted list of all scores for the models is generated. To generate
this special score file format, you can specify the ``search`` parameter. It
specifies the number of highest scores per probe that should be kept. If the
``search`` parameter is set to a negative value, all scores will be kept. If
the ``search`` parameter is higher than the actual number of models, ``NaN``
scores will be appended, and the corresponding mask values will be set to ``0``
(i.e., to be ignored).
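The sorting, truncation and padding rules for one probe can be sketched in NumPy as below. This only mirrors the behaviour described in the text and is not the code of :py:func:`bob.measure.openbr.write_matrix`. The function ``search_row`` is hypothetical, and the mask codes ``NEGATIVE`` and ``POSITIVE`` are placeholders chosen for this example; only the value ``0`` for ignored entries comes from the description above.

```python
import numpy as np

# 0 marks ignored entries, as described in the text. The other two codes are
# placeholders for this sketch and are NOT the byte values used in real files.
IGNORED, NEGATIVE, POSITIVE = 0, 1, 2


def search_row(scores, is_positive, search):
    """Build one probe's row of a 'search'-style score and mask matrix."""
    scores = np.asarray(scores, dtype=float)
    labels = np.where(np.asarray(is_positive, dtype=bool), POSITIVE, NEGATIVE)

    # Sort scores in descending order and reorder the labels accordingly.
    order = np.argsort(-scores, kind="stable")
    scores, labels = scores[order], labels[order]

    if search < 0:
        # Negative value: keep all scores.
        return scores, labels

    kept = min(search, scores.size)
    padding = search - kept  # > 0 only if ``search`` exceeds the model count
    row_scores = np.concatenate([scores[:kept], np.full(padding, np.nan)])
    row_mask = np.concatenate([labels[:kept], np.full(padding, IGNORED)])
    return row_scores, row_mask


# Three models for one probe; ask for the five highest scores.
row_scores, row_mask = search_row([0.2, 0.9, 0.5], [False, True, False], search=5)
print(row_scores)
print(row_mask)
```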
OpenBR to Bob
On the other hand, you might also want to generate a Bob-compatible (four or
five column) score file based on a pair of OpenBR matrix and mask files. This
is possible by using the :py:func:`bob.measure.openbr.write_score_file`
function. In its basic form, it takes the given pair of matrix and mask files, as
well as the desired output score file:
.. code-block:: py
>>> bob.measure.openbr.write_score_file('openbr.mtx', 'openbr.mask', 'four-column-score-file')
This score file is sufficient to compute a CMC curve (see `CMC`_), however it
does not contain relevant client ids or paths for models and probes.
Particularly, it assumes that each client has exactly one associated model.
To add or correct this information, you can use additional parameters to
:py:func:`bob.measure.openbr.write_score_file`. Client ids of models and
probes can be added using the ``models_ids`` and ``probes_ids`` keyword
arguments. The length of these lists must be identical to the number of models
and probes as given in the matrix files, **and they must be in the same order
as used to compute the OpenBR matrix**. The function checks that these ids
produce the same same-client and different-client pairs as indicated by the
OpenBR mask.
To add model and probe path information, use the ``model_names`` and
``probe_names`` parameters, which need to have the same size and order as the
``models_ids`` and ``probes_ids``. This information is simply stored in the
score file, and no further check is applied.
.. note:: The ``model_names`` parameter is used only when writing score files in ``score_file_format='5column'``; in the ``'4column'`` format, this parameter is ignored.
.. include:: links.rst
.. Place your references here:
@@ -64,21 +64,6 @@ Generic
Loading data
.. autosummary::
@@ -98,19 +83,10 @@ Plotting
OpenBR conversions
.. autosummary::
.. automodule:: bob.measure
.. automodule:: bob.measure.load
.. automodule:: bob.measure.calibration
.. automodule:: bob.measure.plot
.. automodule:: bob.measure.openbr