From 347bcbf24eacf9bf5a84b64e98e790cd58c2acaa Mon Sep 17 00:00:00 2001 From: Theophile GENTILHOMME <tgentilhomme@jurasix08.idiap.ch> Date: Fri, 16 Mar 2018 11:45:42 +0100 Subject: [PATCH] Fix doc and user guide (mostly removed things) --- bob/measure/plot.py | 10 +-- doc/guide.rst | 182 -------------------------------------------- doc/py_api.rst | 24 ------ 3 files changed, 3 insertions(+), 213 deletions(-) diff --git a/bob/measure/plot.py b/bob/measure/plot.py index 190bcde..cb3c84a 100644 --- a/bob/measure/plot.py +++ b/bob/measure/plot.py @@ -451,13 +451,9 @@ def det_axis(v, **kwargs): def cmc(cmc_scores, logx=True, **kwargs): """Plots the (cumulative) match characteristics and returns the maximum rank. - This function plots a CMC curve using the given CMC scores, which can be read - from the our score files using the - :py:func:`bob.measure.load.cmc_four_column` or - :py:func:`bob.measure.load.cmc_five_column` methods. The structure of the - ``cmc_scores`` parameter is relatively complex. It contains a list of pairs - of lists. For each probe object, a pair of list negative and positive scores - is required. + This function plots a CMC curve using the given CMC scores (:py:class:`list`: + A list of tuples, where each tuple contains the + ``negative`` and ``positive`` scores for one probe of the database). Parameters: diff --git a/doc/guide.rst b/doc/guide.rst index 63b819d..e6f694b 100644 --- a/doc/guide.rst +++ b/doc/guide.rst @@ -101,11 +101,6 @@ defined in the first equation. by the user and normally sits in files that may need some parsing before these vectors can be extracted. - While it is not possible to provide a parser for every individual file that - may be generated in different experimental frameworks, we do provide a few - parsers for formats we use the most. Please refer to the documentation of - :py:mod:`bob.measure.load` for a list of formats and details. - In the remainder of this section we assume you have successfully parsed and loaded your scores in two 1D float64 vectors and are ready to evaluate the performance of the classifier. @@ -387,11 +382,6 @@ which defines a pair of positive and negative scores **per probe**: Usually, there is only a single positive score per probe, but this is not a fixed restriction. -.. note:: - - The complex data structure can be read from our default 4 or 5 column score - files using the :py:func:`bob.measure.load.cmc` function. - Detection & Identification Curve ================================ @@ -440,178 +430,6 @@ look at the implementations at :py:mod:`bob.measure.plot` to understand how to use the |project| methods to compute the curves and interlace that in the way that best suits you. - -Full applications ------------------ - -We do provide a few scripts that can be used to quickly evaluate a set of -scores. We present these scripts in this section. The scripts take as input -either a 4-column or 5-column data format as specified in the documentation of -:py:func:`bob.measure.load.four_column` or -:py:func:`bob.measure.load.five_column`. - -To calculate the threshold using a certain criterion (EER, min.HTER or weighted -Error Rate) on a set, after setting up |project|, just do: - -.. code-block:: sh - - $ bob_eval_threshold.py development-scores-4col.txt - Threshold: -0.004787956164 - FAR : 6.731% (35/520) - FRR : 6.667% (26/390) - HTER: 6.699% - -The output will present the threshold together with the FAR, FRR and HTER on -the given set, calculated using such a threshold. The relative counts of FAs -and FRs are also displayed between parenthesis. - -To evaluate the performance of a new score file with a given threshold, use the -application ``bob_apply_threshold.py``: - -.. code-block:: sh - - $ bob_apply_threshold.py -0.0047879 test-scores-4col.txt - FAR : 2.115% (11/520) - FRR : 7.179% (28/390) - HTER: 4.647% - -In this case, only the error figures are presented. You can conduct the -evaluation and plotting of development (and test set data) using our combined -``bob_compute_perf.py`` script. You pass both sets and it does the rest: - -.. code-block:: sh - - $ bob_compute_perf.py development-scores-4col.txt test-scores-4col.txt - [Min. criterion: EER] Threshold on Development set: -4.787956e-03 - | Development | Test - -------+-----------------+------------------ - FAR | 6.731% (35/520) | 2.500% (13/520) - FRR | 6.667% (26/390) | 6.154% (24/390) - HTER | 6.699% | 4.327% - [Min. criterion: Min. HTER] Threshold on Development set: 3.411070e-03 - | Development | Test - -------+-----------------+------------------ - FAR | 4.231% (22/520) | 1.923% (10/520) - FRR | 7.949% (31/390) | 7.692% (30/390) - HTER | 6.090% | 4.808% - [Plots] Performance curves => 'curves.pdf' - -Inside that script we evaluate 2 different thresholds based on the EER and the -minimum HTER on the development set and apply the output to the test set. As -can be seen from the toy-example above, the system generalizes reasonably well. -A single PDF file is generated containing an EPC as well as ROC and DET plots -of such a system. - -Use the ``--help`` option on the above-cited scripts to find-out about more -options. - - -Score file conversion ---------------------- - -Sometimes, it is required to export the score files generated by Bob to a -different format, e.g., to be able to generate a plot comparing Bob's systems -with other systems. In this package, we provide source code to convert between -different types of score files. - -Bob to OpenBR -============= - -One of the supported formats is the matrix format that the National Institute -of Standards and Technology (NIST) uses, and which is supported by OpenBR_. -The scores are stored in two binary matrices, where the first matrix (usually -with a ``.mtx`` filename extension) contains the raw scores, while a second -mask matrix (extension ``.mask``) contains information, which scores are -positives, and which are negatives. - -To convert from Bob's four column or five column score file to a pair of these -matrices, you can use the :py:func:`bob.measure.openbr.write_matrix` function. -In the simplest way, this function takes a score file -``'five-column-sore-file'`` and writes the pair ``'openbr.mtx', 'openbr.mask'`` -of OpenBR compatible files: - -.. code-block:: py - - >>> bob.measure.openbr.write_matrix('five-column-sore-file', 'openbr.mtx', 'openbr.mask', score_file_format = '5column') - -In this way, the score file will be parsed and the matrices will be written in -the same order that is obtained from the score file. - -For most of the applications, this should be sufficient, but as the identity -information is lost in the matrix files, no deeper analysis is possible anymore -when just using the matrices. To enforce an order of the models and probes -inside the matrices, you can use the ``model_names`` and ``probe_names`` -parameters of :py:func:`bob.measure.openbr.write_matrix`: - -* The ``probe_names`` parameter lists the ``path`` elements stored in the score - files, which are the fourth column in a ``5column`` file, and the third - column in a ``4column`` file, see :py:func:`bob.measure.load.five_column` and - :py:func:`bob.measure.load.four_column`. -* The ``model_names`` parameter is a bit more complicated. In a ``5column`` - format score file, the model names are defined by the second column of that - file, see :py:func:`bob.measure.load.five_column`. In a ``4column`` format - score file, the model information is not contained, but only the client - information of the model. Hence, for the ``4column`` format, the - ``model_names`` actually lists the client ids found in the first column, see - :py:func:`bob.measure.load.four_column`. - - .. warning:: - The model information is lost, but required to write the matrix files. In - the ``4column`` format, we use client ids instead of the model - information. Hence, when several models exist per client, this function - will not work as expected. - -Additionally, there are fields in the matrix files, which define the gallery -and probe list files that were used to generate the matrix. These file names -can be selected with the ``gallery_file_name`` and ``probe_file_name`` keyword -parameters of :py:func:`bob.measure.openbr.write_matrix`. - -Finally, OpenBR defines a specific ``'search'`` score file format, which is -designed to be used to compute CMC curves. The score matrix contains -descendingly sorted and possibly truncated list of scores, i.e., for each -probe, a sorted list of all scores for the models is generated. To generate -these special score file format, you can specify the ``search`` parameter. It -specifies the number of highest scores per probe that should be kept. If the -``search`` parameter is set to a negative value, all scores will be kept. If -the ``search`` parameter is higher as the actual number of models, ``NaN`` -scores will be appended, and the according mask values will be set to ``0`` -(i.e., to be ignored). - - -OpenBR to Bob -============= - -On the other hand, you might also want to generate a Bob-compatible (four or -five column) score file based on a pair of OpenBR matrix and mask files. This -is possible by using the :py:func:`bob.measure.openbr.write_score_file` -function. At the basic, it takes the given pair of matrix and mask files, as -well as the desired output score file: - -.. code-block:: py - - >>> bob.measure.openbr.write_score_file('openbr.mtx', 'openbr.mask', 'four-column-sore-file') - -This score file is sufficient to compute a CMC curve (see `CMC`_), however it -does not contain relevant client ids or paths for models and probes. -Particularly, it assumes that each client has exactly one associated model. - -To add/correct these information, you can use additional parameters to -:py:func:`bob.measure.openbr.write_score_file`. Client ids of models and -probes can be added using the ``models_ids`` and ``probes_ids`` keyword -arguments. The length of these lists must be identical to the number of models -and probes as given in the matrix files, **and they must be in the same order -as used to compute the OpenBR matrix**. This includes that the same -same-client and different-client pairs as indicated by the OpenBR mask will be -generated, which will be checked inside the function. - -To add model and probe path information, the ``model_names`` and -``probe_names`` parameters, which need to have the same size and order as the -``models_ids`` and ``probes_ids``. These information are simply stored in the -score file, and no further check is applied. - -.. note:: The ``model_names`` parameter is used only when writing score files in ``score_file_format='5column'``, in the ``'4column'`` format, this parameter is ignored. - - .. include:: links.rst .. Place youre references here: diff --git a/doc/py_api.rst b/doc/py_api.rst index 356a55a..cb8bf9d 100644 --- a/doc/py_api.rst +++ b/doc/py_api.rst @@ -64,21 +64,6 @@ Generic bob.measure.rmse bob.measure.get_config -Loading data ------------- - -.. autosummary:: - bob.measure.load.open_file - bob.measure.load.scores - bob.measure.load.split - bob.measure.load.cmc - bob.measure.load.four_column - bob.measure.load.split_four_column - bob.measure.load.cmc_four_column - bob.measure.load.five_column - bob.measure.load.split_five_column - bob.measure.load.cmc_five_column - Calibration ----------- @@ -98,19 +83,10 @@ Plotting bob.measure.plot.cmc bob.measure.plot.detection_identification_curve -OpenBR conversions ------------------- - -.. autosummary:: - bob.measure.openbr.write_matrix - bob.measure.openbr.write_score_file - Details ------- .. automodule:: bob.measure -.. automodule:: bob.measure.load .. automodule:: bob.measure.calibration .. automodule:: bob.measure.plot -.. automodule:: bob.measure.openbr -- GitLab