Commit b6a0a8c3 authored by Yannick DAYER

[doc] Errors correction

parent d9054b0a
Pipeline #47718 failed with stage
in 3 minutes and 10 seconds
...@@ -16,10 +16,10 @@ ...@@ -16,10 +16,10 @@
=========================== ===========================
As noted before, this package is part of the ``bob.pad`` packages, which in As noted before, this package is part of the ``bob.pad`` packages, which in
turn are part of the signal-processing and machine learning toolbox Bob_. To turn are part of the signal processing and machine learning toolbox Bob_. To
install Bob_, please read the `Installation Instructions <bobinstall_>`_. install Bob_, please read the `Installation Instructions <bobinstall_>`_.
Then, to install the ``bob.pad`` packages and in turn maybe the database Then, to install the ``bob.pad`` packages and in turn, maybe the database
packages that you want to use, use conda_ to install them: packages that you want to use, use conda_ to install them:
.. code-block:: sh .. code-block:: sh
...@@ -61,7 +61,7 @@ Databases ...@@ -61,7 +61,7 @@ Databases
With ``bob.pad`` you will run biometric recognition experiments using databases that contain presentation attacks. With ``bob.pad`` you will run biometric recognition experiments using databases that contain presentation attacks.
Though the PAD protocols are implemented in ``bob.pad``, the original data are **not included**. Though the PAD protocols are implemented in ``bob.pad``, the original data are **not included**.
To download the original data of the databases, please refer to the according Web-pages. To download the original data of the databases, please refer to the corresponding Web-pages.
For a list of supported databases including their download URLs, For a list of supported databases including their download URLs,
please refer to the `spoofing_databases <https://gitlab.idiap.ch/bob/bob/wikis/Packages>`_. please refer to the `spoofing_databases <https://gitlab.idiap.ch/bob/bob/wikis/Packages>`_.
......
...@@ -10,7 +10,7 @@ Introduction to presentation attack detection ...@@ -10,7 +10,7 @@ Introduction to presentation attack detection
============================================= =============================================
Presentation Attack Detection, or PAD, is a branch of biometrics aiming at detecting an attempt to dupe a biometric recognition system by modifying the sample presented to the sensor. Presentation Attack Detection, or PAD, is a branch of biometrics aiming at detecting an attempt to dupe a biometric recognition system by modifying the sample presented to the sensor.
The goal of PAD is to develop countermeasures to presentation attacks that are able to detect wether a biometric sample is a `bonafide` sample, or a presentation attack. The goal of PAD is to develop countermeasures to presentation attacks that can detect whether a biometric sample is a `bonafide` sample or a presentation attack.
For an introduction to biometrics, take a look at the :ref:`documentation of bob.bio.base <bob.bio.base.biometrics_intro>`. For an introduction to biometrics, take a look at the :ref:`documentation of bob.bio.base <bob.bio.base.biometrics_intro>`.
...@@ -21,10 +21,10 @@ Presentation attack ...@@ -21,10 +21,10 @@ Presentation attack
=================== ===================
Biometric recognition systems contain different points of attack. Attacks on certain points are either called direct or indirect attacks. Biometric recognition systems contain different points of attack. Attacks on certain points are either called direct or indirect attacks.
An indirect attack would consist of modifying data after the capture, in any of the steps between the capture and the decision stages. To prevent such attacks is relevant of classical cyber security, hardware and data protection. An indirect attack would consist of modifying data after the capture, in any of the steps between the capture and the decision stages. To prevent such attacks is relevant to classical cybersecurity, hardware protection, and data protection.
Presentation Attacks (PA), on the other hand, are the only direct attacks that can be performed on a biometric system, and countering those attacks is relevant to biometrics. Presentation Attacks (PA), on the other hand, are the only direct attacks that can be performed on a biometric system, and countering those attacks is relevant to biometrics.
For a face recognition system, for example, one of the possible presentation attack would be to wear a mask resembling another individual so that the system identifies the attacker as that other person. For a face recognition system, for example, one of the possible presentation attacks would be to wear a mask resembling another individual so that the system identifies the attacker as that other person.
New PAI (Presentation Attack Instrument) can be developed to counteract the countermeasures put in place in the first place, so the field is in constant evolution, to adapt to new threats and try to anticipate them. New PAI (Presentation Attack Instrument) can be developed to counteract the countermeasures put in place in the first place, so the field is in constant evolution, to adapt to new threats and try to anticipate them.
...@@ -39,7 +39,7 @@ This means that multiple cases are possible and should be detected by a biometri ...@@ -39,7 +39,7 @@ This means that multiple cases are possible and should be detected by a biometri
- An Attacker presents itself without trying to pass for another subject, the sample is categorized as **ZEI** (**Zero Effort Impostor**) sample, and should be rejected by the system (negative), - An Attacker presents itself without trying to pass for another subject, the sample is categorized as **ZEI** (**Zero Effort Impostor**) sample, and should be rejected by the system (negative),
- And the special case in PAD versus "standard" biometric systems: an Attacker uses a `Presentation Attack Instrument` (`PAI`) to pass as a genuine subject. This is a **PA** (**Presentation Attack**) sample, and should be rejected (negative). - And the special case in PAD versus "standard" biometric systems: an Attacker uses a `Presentation Attack Instrument` (`PAI`) to pass as a genuine subject. This is a **PA** (**Presentation Attack**) sample, and should be rejected (negative).
The term 'bona fide' is used for biometric samples presented without intention to change their identity (Genuine samples and ZEI samples). The term 'bonafide' is used for biometric samples presented without intention to change their identity (Genuine samples and ZEI samples).
.. figure:: img/pad-classes.png .. figure:: img/pad-classes.png
:figwidth: 75% :figwidth: 75%
...@@ -47,8 +47,8 @@ The term 'bona fide' is used for biometric samples presented without intention t ...@@ -47,8 +47,8 @@ The term 'bona fide' is used for biometric samples presented without intention t
:alt: Four different samples organized to display the different classes of PAD. :alt: Four different samples organized to display the different classes of PAD.
Categorization of samples in terms of biometric recognition and PAD systems. Categorization of samples in terms of biometric recognition and PAD systems.
A PAD system makes the distinction between the left samples (`bona-fide`, positives) and the right samples (presentation attack, negatives). A PAD system makes the distinction between the left samples (`bonafide`, positives) and the right samples (`presentation attack`, negatives).
A biometric recognition system, genuine samples are the positives, and both types of impostors are the negatives. In a biometric recognition system, genuine samples are the positives, and both types of impostors are the negatives.
Typical implementations of PAD Typical implementations of PAD
...@@ -60,19 +60,19 @@ PAD for face recognition is the most advanced in this field, face PAD systems ca ...@@ -60,19 +60,19 @@ PAD for face recognition is the most advanced in this field, face PAD systems ca
- **Type of light**: Some PAD systems work on visible light, using samples captured by a standard camera. A more advanced system would require a specific sensor to capture, for example, infrared light. - **Type of light**: Some PAD systems work on visible light, using samples captured by a standard camera. A more advanced system would require a specific sensor to capture, for example, infrared light.
- **User interaction**: Another way of asserting the authenticity of a sample is to request the presented user to respond to a challenge, like smiling or blinking at a specific moment. - **User interaction**: Another way of asserting the authenticity of a sample is to request the presented user to respond to a challenge, like smiling or blinking at a specific moment.
PAD system using a frame-based approach on visible light with no user interaction are the least robust but are more developed, as they can be easily integrated with existing biometric systems. PAD systems using a frame-based approach on visible light with no user interaction are the least robust but are more developed, as they can be easily integrated with existing biometric systems.
Evaluation of PAD systems Evaluation of PAD systems
========================= =========================
To evaluate a biometric system with PAD, a set of samples is fed to the system. Each samples is scored, and a post processing step is used to analyse those scores. To evaluate a biometric system with PAD, a set of samples is fed to the system. Each sample is scored, and a post-processing step is used to analyze those scores.
Licit scenario Licit scenario
-------------- --------------
When no PA samples are in the input set (only Genuine and ZEI samples), the situation is the same as a simple biometric experiment and is called `licit` scenario. See :ref:`biometric introduction<bob.bio.base.biometrics_intro>`. When no PA samples are in the input set (only Genuine and ZEI samples), the situation is the same as a simple biometric experiment and is called a `licit` scenario. See :ref:`biometric introduction<bob.bio.base.biometrics_intro>`.
Spoof scenario Spoof scenario
...@@ -81,11 +81,11 @@ Spoof scenario ...@@ -81,11 +81,11 @@ Spoof scenario
If no ZEI samples are present in the set (only Genuine and PA samples), the evaluation of a PAD system is seen as a two classes problem, and the same metrics as in a biometric evaluation can be used to assess its performance, where: If no ZEI samples are present in the set (only Genuine and PA samples), the evaluation of a PAD system is seen as a two classes problem, and the same metrics as in a biometric evaluation can be used to assess its performance, where:
- the False Positive Rate is called IAPMR (Impostor Attack Presentation Match Rate), - the False Positive Rate is called IAPMR (Impostor Attack Presentation Match Rate),
- the False Negative Rate is called FNMR (False Non Match Rate), - the False Negative Rate is called FNMR (False Non-Match Rate),
The ROC and DET can be plotted to represent the performance of the system over a range of operation points. The ROC and DET can be plotted to represent the performance of the system over a range of operation points.
This two classes case is referred as the `spoof` scenario. This two-classes case is referred to as the `spoof` scenario.
PAD evaluation PAD evaluation
...@@ -93,18 +93,18 @@ PAD evaluation ...@@ -93,18 +93,18 @@ PAD evaluation
When a mix of Zero Effort Impostor and PA are present in the input set, two possibilities arise. When a mix of Zero Effort Impostor and PA are present in the input set, two possibilities arise.
The bona fide (Genuine and ZEI) samples are treated as `positives` and PA samples are considered `negatives` (This will show the ability of the system to detect PA). The bonafide (Genuine and ZEI) samples are treated as `positives` and PA samples are considered `negatives` (This will show the ability of the system to detect PA).
The problem becomes binary, allowing the use of similar metrics as before, albeit with different denominations: The problem becomes binary, allowing the use of similar metrics as before, albeit with different denominations:
- the False Positive Rate is named APCER (Attack Presentation Classification Error Rate), - the False Positive Rate is named APCER (Attack Presentation Classification Error Rate),
- the False Negative Rate is named BPCER (Bona fide Presentation Classification Error Rate), - the False Negative Rate is named BPCER (Bonafide Presentation Classification Error Rate),
- the Half Total Error Rate is named ACER (Average Classification Error Rate). - the Half Total Error Rate is named ACER (Average Classification Error Rate).
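As an illustration of the three rates listed above, here is a minimal sketch that computes them at a single threshold. It is not the ``bob.pad`` implementation: the function name ``pad_error_rates`` is hypothetical, attack scores are pooled over all attack types, and it assumes that a higher score means "more likely bona fide" and that a sample is accepted when its score is at or above the threshold.

```python
def pad_error_rates(bonafide_scores, attack_scores, threshold):
    """Return (APCER, BPCER, ACER) at a given decision threshold.

    Assumption: a sample is accepted as bona fide when score >= threshold.
    """
    if not bonafide_scores or not attack_scores:
        raise ValueError("both score lists must be non-empty")
    # Attack presentations wrongly accepted as bona fide (false positives).
    apcer = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    # Bona fide presentations wrongly rejected (false negatives).
    bpcer = sum(s < threshold for s in bonafide_scores) / len(bonafide_scores)
    # Average of the two error rates (the Half Total Error Rate).
    acer = (apcer + bpcer) / 2
    return apcer, bpcer, acer


# Made-up scores, for illustration only.
bonafide = [0.9, 0.8, 0.75, 0.4]
attacks = [0.1, 0.3, 0.6, 0.2]
apcer, bpcer, acer = pad_error_rates(bonafide, attacks, threshold=0.5)
print(f"APCER={apcer:.2f} BPCER={bpcer:.2f} ACER={acer:.2f}")
# prints: APCER=0.25 BPCER=0.25 ACER=0.25
```

Sweeping the threshold over the range of observed scores and repeating this computation gives the operating points that a ROC or DET curve would display.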
The ZEI and PA samples can also be considered two separate negative classes, leading to a ternary classification with one positive class (genuine samples) and two distinct negative classes: ZEI and PA. The ZEI and PA samples can also be considered two separate negative classes, leading to a ternary classification with one positive class (genuine samples) and two distinct negative classes: ZEI and PA.
The EPS (Expected Performance and Spoofability) framework was introduced to assess the reliability of a biometric system with PAD by defining two parameters determining how much importance is given to each class of samples: The EPS (Expected Performance and Spoofability) framework was introduced to assess the reliability of a biometric system with PAD by defining two parameters determining how much importance is given to each class of samples:
- ω represents the importance of the PA scores with respect to the ZEI scores. - ω represents the importance of the PA scores against the ZEI scores.
- β represents the importance of the negative classes (PA and ZEI scores) relative to the positive class (Genuine). - β represents the importance of the negative classes (PA and ZEI scores) relative to the positive class (Genuine).
From the scores and those two parameters, the following metrics can be measured: From the scores and those two parameters, the following metrics can be measured:
......
...@@ -19,14 +19,14 @@ The database interface definition follows closely the one in :ref:`bob.bio.base. ...@@ -19,14 +19,14 @@ The database interface definition follows closely the one in :ref:`bob.bio.base.
- :py:meth:`database.fit_samples` returns the samples (or delayed samples) used to train the classifier; - :py:meth:`database.fit_samples` returns the samples (or delayed samples) used to train the classifier;
- :py:meth:`database.predict_samples` returns the samples that will be used for evaluating the system. This is where the group (`dev` or `eval`) is specified. - :py:meth:`database.predict_samples` returns the samples that will be used for evaluating the system. This is where the group (`dev` or `eval`) is specified.
A difference with the bob.bio.base database interface is the presence of an ``attack_type`` annotation. It stores the type of PAI to allow the scoring each different types of attack separately. A difference with the bob.bio.base database interface is the presence of an ``attack_type`` annotation. It stores the type of PAI to allow the scoring of each different type of attack separately.
File list interface File list interface
------------------- -------------------
A class with those methods returning the corresponding data can be implemented for each dataset, but an easier way to do it is with the `file list` interface. A class with those methods returning the corresponding data can be implemented for each dataset, but an easier way to do it is with the `file list` interface.
This allows the creation of multiple protocols and various groups by editing some csv files. This allows the creation of multiple protocols and various groups by editing some CSV files.
The dataset configuration file will then be as simple as: The dataset configuration file will then be as simple as:
...@@ -64,11 +64,11 @@ The files must follow the following structure and naming: ...@@ -64,11 +64,11 @@ The files must follow the following structure and naming:
+-- for_real.csv +-- for_real.csv
+-- for_attack.csv +-- for_attack.csv
The content of the files in the ``train`` folder are used when a protocol contains data for training the classifier. The content of the files in the ``train`` folder is used when a protocol contains data for training the classifier.
The files in the ``eval`` folder are optional and are used in case a protocol contains data for evaluation. The files in the ``eval`` folder are optional and are used in case a protocol contains data for evaluation.
These csv files should contain at least the path to raw data and an identifier to the subject in the image (subject). These CSV files should contain at least the path to raw data and an identifier to the identity of the subject in the image (subject field).
The structure of each csv file should be as below: The structure of each CSV file should be as below:
.. code-block:: text .. code-block:: text
...@@ -79,7 +79,7 @@ The structure of each csv file should be as below: ...@@ -79,7 +79,7 @@ The structure of each csv file should be as below:
... ...
Metadata can be shipped within the Samples (e.g gender, age, annotations, ...) by adding a column in the csv file for each metadata: Metadata can be shipped within the Samples (e.g gender, age, annotations, ...) by adding a column in the CSV file for each metadata:
.. code-block:: text .. code-block:: text
......
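To make the file list idea more concrete, the following sketch reads two small CSV file lists with the Python standard library and tags each row as bona fide or attack, keeping every extra column as metadata. It does not use the ``bob.pad`` database classes. The column names (``PATH``, ``SUBJECT``, ``GENDER``, ``ATTACK_TYPE``), the file contents, and the helper ``read_file_list`` are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical contents of for_real.csv and for_attack.csv.
FOR_REAL_CSV = """PATH,SUBJECT,GENDER
real/client01_session1,client01,F
real/client02_session1,client02,M
"""
FOR_ATTACK_CSV = """PATH,SUBJECT,GENDER,ATTACK_TYPE
attack/client01_print,client01,F,print
"""


def read_file_list(text, is_bonafide):
    """Parse one CSV file list; extra columns are kept as metadata."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Lower-case the header names so they read like attribute names.
        sample = {key.lower(): value for key, value in row.items()}
        sample["is_bonafide"] = is_bonafide
        rows.append(sample)
    return rows


samples = read_file_list(FOR_REAL_CSV, True) + read_file_list(FOR_ATTACK_CSV, False)
for sample in samples:
    print(sample["path"], sample["subject"], sample["is_bonafide"])
```

In this sketch, adding a metadata field only requires a new column in the CSV file; the parsing code stays unchanged.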
...@@ -12,7 +12,7 @@ ...@@ -12,7 +12,7 @@
To easily run experiments in PAD, we offer a generic command called ``bob pad pipelines``. To easily run experiments in PAD, we offer a generic command called ``bob pad pipelines``.
Such CLI command is an entry point to several pipelines, and this documentation will focus on the one called **vanilla-pad**. Such CLI command is an entry point to several pipelines, and this documentation will focus on the one called **vanilla-pad**.
The following will introduce how a simple experiment can be run with this tool, from the samples data to a set of metrics and plots, as defined in :ref:`bob.pad.base.pad_intro`. The following will introduce how a simple experiment can be run with this tool, from the sample data to a set of metrics and plots, as defined in :ref:`bob.pad.base.pad_intro`.
Running a biometric experiment with vanilla-pad Running a biometric experiment with vanilla-pad
...@@ -25,13 +25,13 @@ A PAD experiment consists of taking a set of biometric `bonafide` and `impostor` ...@@ -25,13 +25,13 @@ A PAD experiment consists of taking a set of biometric `bonafide` and `impostor`
:align: center :align: center
:alt: Data is fed to the pipeline either for training (to fit) or for evaluation (to transform and predict). :alt: Data is fed to the pipeline either for training (to fit) or for evaluation (to transform and predict).
The pipeline of transformer(s) and classifier can be trained (fit) or used to generate a score for each input sample. The pipeline of Transformer(s) and Classifier can be trained (fit) or used to generate a score for each input sample.
Similarly to ``vanilla-biometrics``, the ``vanilla-pad`` command needs a pipeline argument to specify which experiment to run and a database argument to indicate what data will be used. These can be given with the ``-p`` (``--pipeline``) and ``-d`` (``--database``) options, respectively:: Similarly to ``vanilla-biometrics``, the ``vanilla-pad`` command needs a pipeline configuration argument to specify which experiment to run and a database argument to indicate what data will be used. These can be given with the ``-p`` (``--pipeline``) and ``-d`` (``--database``) options, respectively::
$ bob pad vanilla-pad [OPTIONS] -p <pipeline> -d <database> $ bob pad vanilla-pad [OPTIONS] -p <pipeline> -d <database>
The different available options can be listed by passing the ``--help`` option to the command:: The different available options can be listed by giving the ``--help`` flag to the command::
$ bob pad vanilla-pad --help $ bob pad vanilla-pad --help
...@@ -69,7 +69,7 @@ The Vanilla PAD pipeline is the backbone of any experiment in this library. It i ...@@ -69,7 +69,7 @@ The Vanilla PAD pipeline is the backbone of any experiment in this library. It i
Transformers Transformers
------------ ------------
A Transformer is an class that implements the fit and transform methods, which allow the application of an operation on a sample of data. A Transformer is a class that implements the fit and transform methods, which allow the application of an operation on a sample of data.
For more details, see :ref:`bob.bio.base.transformer`. For more details, see :ref:`bob.bio.base.transformer`.
Here is a basic stateless Transformer class: Here is a basic stateless Transformer class:
...@@ -117,7 +117,7 @@ Here is the minimal structure of a classifier: ...@@ -117,7 +117,7 @@ Here is the minimal structure of a classifier:
Running an experiment Running an experiment
===================== =====================
Two part of an experiment have to be executed: Two parts of an experiment have to be executed:
- **Fit**: labeled data is fed to the system to train the algorithm to recognize attacks and licit properties. - **Fit**: labeled data is fed to the system to train the algorithm to recognize attacks and licit properties.
- **Predict**: assessing a series of test samples for authenticity, generating a score for each one. - **Predict**: assessing a series of test samples for authenticity, generating a score for each one.
...@@ -145,7 +145,7 @@ The pipeline can then be executed with the command:: ...@@ -145,7 +145,7 @@ The pipeline can then be executed with the command::
$ bob pad vanilla-pad -d my_database_config.py -p my_pipeline_config.py -o output_dir $ bob pad vanilla-pad -d my_database_config.py -p my_pipeline_config.py -o output_dir
When executed with vanilla-pad, every training sample will pass through the pipeline, executing the ``fit`` methods. When executed with vanilla-pad, every training sample will pass through the pipeline, executing the ``fit`` methods.
Then, every samples of the `dev` set (and/or the `eval` set) will be given to the `transform` method of ``my_transformer`` and the result is passed to the `predict` method of ``my_classifier``. Then, every sample of the `dev` set (and/or the `eval` set) will be given to the `transform` method of ``my_transformer`` and the result is passed to the `predict` method of ``my_classifier``.
The output of the classifier (scores) is written to a file. The output of the classifier (scores) is written to a file.
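The flow described above (fit on the training samples, then transform and predict on the `dev` samples) can be sketched in plain Python. The classes ``MeanTransformer`` and ``MidpointClassifier`` and the data are hypothetical toys, not the classes from the pipeline configuration, and the sketch leaves out the sample wrappers and score writing that vanilla-pad takes care of.

```python
class MeanTransformer:
    """Stateless transformer: reduces each sample (list of floats) to its mean."""

    def fit(self, samples, labels=None):
        return self  # nothing to learn

    def transform(self, samples):
        return [sum(s) / len(s) for s in samples]


class MidpointClassifier:
    """Toy classifier: the score is the feature's offset from a learned midpoint."""

    def fit(self, features, labels):
        real = [f for f, is_real in zip(features, labels) if is_real]
        attack = [f for f, is_real in zip(features, labels) if not is_real]
        # Midpoint between the mean bona fide and mean attack feature.
        self.midpoint_ = (sum(real) / len(real) + sum(attack) / len(attack)) / 2
        return self

    def predict(self, features):
        # Positive scores lean towards bona fide, negative towards attack.
        return [f - self.midpoint_ for f in features]


# Made-up training data: True marks a bona fide sample, False an attack.
train = [[0.8, 1.0], [0.9, 0.7], [0.1, 0.3], [0.2, 0.2]]
labels = [True, True, False, False]
dev = [[0.9, 0.9], [0.1, 0.1]]

transformer = MeanTransformer()
classifier = MidpointClassifier()

# Fit: training samples go through the transformer, then train the classifier.
classifier.fit(transformer.fit(train).transform(train), labels)

# Predict: dev samples are transformed, then scored by the classifier.
scores = classifier.predict(transformer.transform(dev))
for sample, score in zip(dev, scores):
    print(sample, round(score, 3))
```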
.. todo:: .. todo::
...@@ -187,9 +187,9 @@ Scores ...@@ -187,9 +187,9 @@ Scores
Executing the vanilla-pad pipeline results in a list of scores, one for each Executing the vanilla-pad pipeline results in a list of scores, one for each
input sample compared against each registered model. input sample compared against each registered model.
Depending on the chosen ScoreWriter, these scores can be in csv, 4 columns, or Depending on the chosen ScoreWriter, these scores can be in CSV, 4 columns, or
5 columns format, or in a custom user-defined format. 5 columns format, or in a custom user-defined format.
By default the scores are written in the specified output directory (pointed to By default, the scores are written in the specified output directory (pointed to
vanilla-pad with the ``-o`` option), and in the 4 columns format. vanilla-pad with the ``-o`` option), and in the 4 columns format.
The scores represent the performance of a system on that data, but are not The scores represent the performance of a system on that data, but are not
...@@ -257,7 +257,7 @@ file containing the plots. ...@@ -257,7 +257,7 @@ file containing the plots.
Available plots for a spoofing scenario (command ``bob pad``) are: Available plots for a spoofing scenario (command ``bob pad``) are:
* ``hist`` (Bona fide and PA histograms along with threshold criterion) * ``hist`` (Bonafide and PA histograms along with threshold criterion)
* ``epc`` (expected performance curve) * ``epc`` (expected performance curve)
...@@ -293,7 +293,7 @@ Use the ``--help`` option on the above-cited commands to find-out about more ...@@ -293,7 +293,7 @@ Use the ``--help`` option on the above-cited commands to find-out about more
options. options.
For example, to generate a EPC curve from development and evaluation datasets: For example, to generate an EPC curve from development and evaluation datasets:
.. code-block:: sh .. code-block:: sh
...@@ -310,6 +310,6 @@ datasets. For example, to generate EPSC curve: ...@@ -310,6 +310,6 @@ datasets. For example, to generate EPSC curve:
.. note:: .. note::
IAPMR curve can be plotted along with EPC and EPSC using option IAPMR curve can be plotted along with EPC and EPSC using the ``--iapmr``
``--iapmr``. 3D EPSC can be generated using the ``--three-d``. See metrics option. 3D EPSC can be generated using the ``--three-d``. See ``metrics
--help for further options. --help`` for further options.