Commit b9402ce7 authored by Amir MOHAMMADI

Merge branch 'verification-overview' into 'master'

Incorporate a general overview of biometric verification and illustrate biometric verification experiment flow in bob.bio.base doc

Closes #73

See merge request !73
parents 22082f92 ce52d50c
Pipeline #9595 passed with stages
in 10 minutes and 47 seconds
......@@ -5,51 +5,18 @@
.. _bob.bio.base.experiments:
=========================================
==========================================
Running Biometric Recognition Experiments
=========================================
Now, you are almost ready to run your first biometric recognition experiment.
Just a little bit of theory, and then: off we go.
Structure of a Biometric Recognition Experiment
-----------------------------------------------
Each biometric recognition experiment that is run with ``bob.bio`` is divided into several steps.
The steps are:
1. Data preprocessing: Raw data is preprocessed, e.g., for face recognition, faces are detected, images are aligned and photometrically enhanced.
2. Feature extractor training: Feature extraction parameters are learned.
3. Feature extraction: Features are extracted from the preprocessed data.
4. Feature projector training: Parameters of a subspace-projection of the features are learned.
5. Feature projection: The extracted features are projected into a subspace.
6. Model enroller training: The ways how to enroll models from extracted or projected features is learned.
7. Model enrollment: One model is enrolled from the features of one or more images.
8. Scoring: The verification scores between various models and probe features are computed.
9. Evaluation: The computed scores are evaluated and curves are plotted.
These 9 steps are divided into four distinct groups, which are discussed in more detail later:
* Preprocessing (only step 1)
* Feature extraction (steps 2 and 3)
* Biometric recognition (steps 4 to 8)
* Evaluation (step 9)
The communication between two steps is file-based, usually using a binary HDF5_ interface, which is implemented in the :py:class:`bob.io.base.HDF5File` class.
The output of one step usually serves as the input of the subsequent step(s).
Depending on the algorithm, some of the steps are not applicable/available.
E.g. most of the feature extractors do not need a special training step, or some algorithms do not require a subspace projection.
In these cases, the according steps are skipped.
``bob.bio`` takes care that always the correct files are forwarded to the subsequent steps.
==========================================
Now, you are ready to run your first biometric recognition experiment.
.. _running_part_1:
Running Experiments (part I)
----------------------------
To run an experiment, we provide a generic script ``verify.py``, which is highly parametrizable.
To run an experiment, we provide a generic script ``verify.py``, which is highly parameterizable.
To get a complete list of command line options, please run:
.. code-block:: sh
......@@ -83,7 +50,7 @@ To get a list of registered resources, please call:
Each package in ``bob.bio`` defines its own resources, and the printed list of registered resources differs according to the installed packages.
If only ``bob.bio.base`` is installed, no databases and only one preprocessor will be listed.
To see more details about the resources, i.e., the full constructor call fo the respective class, use the ``--details`` (or shortly ``-d``) option, and to sub-select only specific types of resources, use the ``--types`` (or ``-t``) option:
To see more details about the resources, i.e., the full constructor call for the respective class, use the ``--details`` (or shortly ``-d``) option, and to sub-select only specific types of resources, use the ``--types`` (or ``-t``) option:
.. code-block:: sh
......
This diff is collapsed.
......@@ -9,28 +9,24 @@
===========================================
The ``bob.bio`` packages provide open source tools to run comparable and reproducible biometric recognition experiments.
To design a biometric recognition experiment, one has to choose:
To design a biometric recognition experiment, you must choose:
* a databases containing the original data, and a protocol that defines how to use the data,
* a data preprocessing algorithm, i.e., face detection for face recognition experiments or voice activity detection for speaker recognition,
* the type of features to extract from the preprocessed data,
* the biometric recognition algorithm to employ,
* the score fusion to combine outputs from different systems, and
* the way to evaluate the results
* A database to use for the raw biometric data and a protocol that defines how to use that data,
* A data preprocessing algorithm to clean up the raw biometric data,
* A feature extractor to extract the desired type of features from the preprocessed data,
* A biometric matching algorithm,
* An evaluation method to make sense of the matching scores.
For any of these parts, several different types are implemented in the ``bob.bio`` packages, and basically any combination of the five parts can be executed.
For each type, several meta-parameters can be tested.
This results in a nearly infinite amount of possible experiments that can be run using the current setup.
But it is also possible to use your own database, preprocessor, feature extractor, or biometric recognition algorithm and test this against the baseline algorithms implemented in the our packages.
The ``bob.bio`` packages contain several implementations of each of the above steps, so you can either choose from the existing methods or use your own.
.. note::
The ``bob.bio`` packages are derived from the former `FaceRecLib <http://pypi.python.org/pypi/facereclib>`__, which is herewith outdated.
This package :py:mod:`bob.bio.base` includes the basic definition of a biometric recognition experiment, as well as a generic script, which can execute the full biometric experiment in a single command line.
Changing the employed tolls such as the database, protocol, preprocessor, feature extractor or recognition algorithm is as simple as changing a command line parameter.
The :py:mod:`bob.bio.base` package includes the basic definition of a biometric recognition experiment, as well as a generic script, which can execute the full biometric experiment in a single command line.
Changing the employed tools, such as the database, protocol, preprocessor, feature extractor or matching algorithm is as simple as changing a parameter in a configuration file or on the command line.
The implementation of (most of) the tools is separated into other packages in the ``bob.bio`` namespace.
All these packages can be easily combined.
All of these packages can be easily combined.
Here is a growing list of derived packages:
* :ref:`bob.bio.spear <bob.bio.spear>` Tools to run speaker recognition experiments, including voice activity detection, Cepstral feature extraction, and speaker databases
......@@ -39,8 +35,6 @@ Here is a growing list of derived packages:
* :ref:`bob.bio.gmm <bob.bio.gmm>` Algorithms based on Gaussian Mixture Modeling (GMM) such as Inter-Session Variability modeling (ISV) or Total Variability modeling (TV, aka. I-Vector)
* `bob.bio.csu <http://pypi.python.org/pypi/bob.bio.csu>`__ for wrapper classes of the `CSU Face Recognition Resources <http://www.cs.colostate.edu/facerec>`__ (see `Installation Instructions <http://pythonhosted.org/bob.bio.csu/installation.html>`__ of ``bob.bio.csu``).
If you are interested, please continue reading:
===========
Users Guide
......@@ -50,6 +44,7 @@ Users Guide
:maxdepth: 2
installation
struct_bio_rec_sys
experiments
implementation
filelist-guide
......
......@@ -13,9 +13,10 @@ turn are part of the signal-processing and machine learning toolbox Bob_. To
install Bob_, please read the `Installation Instructions <bobinstall_>`_.
.. note::
Currently, running Bob_ under MS Windows in not yet supported. However, we
found that running Bob_ in a virtual Unix environment such as the one
provided by VirtualBox_ is a good alternative.
Running Bob_ under MS Windows is not yet supported. However, we found that
running Bob_ in a virtual Unix environment such as the one provided by
VirtualBox_ is a good alternative.
Then, to install the ``bob.bio`` packages and in turn maybe the database
packages that you want to use, use conda_ to install them:
......
.. _bob.bio.base.struct_bio_rec_sys:
============================================
Structure of a Biometric Recognition System
============================================
This section will familiarize you with the structure of a typical biometric recognition system to help you understand and use the ``bob.bio`` framework to set up your own biometric recognition experiments.
"Biometric recognition" refers to the process of establishing a person's identity based on their biometric data.
A biometric recognition system can operate in one of two modes: *verification* or *identification*.
A *verification* system establishes whether or not a person is who they say they are (i.e., the person claims an identity and the system tries to prove whether or not that claim is true).
On the other hand, an *identification* system attempts to establish a person's identity from scratch (i.e., the system tries to associate a person with an identity from a set of identities in the system's database). When we are talking about neither verification nor identification in particular, the generic term *recognition* is used.
A biometric recognition system has two stages:
1. **Enrollment:** A person's biometric data is enrolled in the system's biometric database.
2. **Recognition:** A person's newly acquired biometric data (which we call a *probe*) is compared to the enrolled biometric data (which we refer to as a *model*), and a match score is generated. The match score tells us how similar the model and the probe are. Based on match scores, we then decide whether or not the model and probe come from the same person (verification) or which gallery identity should be assigned to the input biometric (identification).
Fig. 1 shows the enrollment and verification stages in a typical biometric *verification* system:
.. figure:: /img/bio_ver_sys.svg
:align: center
Enrollment and verification in a typical biometric verification system.
Fig. 2 shows the enrollment and identification stages in a typical biometric *identification* system:
.. figure:: /img/bio_ident_sys.svg
:align: center
Enrollment and identification in a typical biometric identification system.
In the figures above:
* The "Pre-processor" cleans up the raw biometric data to make recognition easier (e.g., crops the face image to get rid of the background).
* The "Feature Extractor" extracts the features that are most important for recognition from the pre-processed biometric data.
* The "Model Database" stores each person's extracted feature set in the form of a representative model for that person in the system database, typically alongside the person's ID.
* The "Matcher" compares a new biometric feature set (probe) to one (for verification) or all (for identification) models in the database, and outputs a similarity score for each comparison.
* For *verification*, the "Decision Maker" decides whether or not the probe and the model from the database match, based on whether the similarity score is above or below a pre-defined match threshold. For *identification*, the "Decision Maker" decides which model from the database best represents the identity of the probe, based on which model most closely matches the probe.
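The two behaviours of the "Decision Maker" can be sketched in a few lines of Python. This is only an illustration and not part of ``bob.bio``: the function names, the example scores and the choice to accept a score that is exactly equal to the threshold are all assumptions made here.

```python
def verify(score, threshold):
    """Verification: accept the identity claim if the score reaches the threshold.

    Treating a score equal to the threshold as a match is an assumption.
    """
    return score >= threshold


def identify(scores_by_model):
    """Closed-set identification: return the ID of the most similar model."""
    return max(scores_by_model, key=scores_by_model.get)


# Hypothetical similarity scores of one probe against three enrolled models
scores = {"alice": -0.8, "bob": -0.2, "carol": -1.5}

claim_accepted = verify(scores["bob"], threshold=-0.5)  # probe claims to be "bob"
best_match = identify(scores)
```

Verification looks at a single score, the one for the claimed identity, whereas identification needs the scores against every model in the database.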
Biometric Recognition Experiments in the ``bob.bio`` Framework
---------------------------------------------------------------
The ``bob.bio`` framework has the capability to perform both *verification* and *identification* experiments, depending on the user's requirements. To talk about the framework in generic terms, we will henceforth use the term *recognition*.
In general, the goal of a biometric recognition experiment is to quantify the recognition accuracy of a biometric recognition system; for example, we may wish to find out how good the system is at deciding whether or not two biometric samples come from the same person.
To conduct a biometric recognition experiment, we need biometric data, so we use a biometric database. A biometric database generally consists of multiple samples of a particular biometric, from multiple people. For example, a face database could contain 5 different face images for each of 100 people. The dataset is split up into samples used for enrollment and samples used for probing. We enroll a model for each identity from one or more of its face images. We then simulate "genuine" recognition attempts by comparing each person's probe samples to their own enrolled model. We simulate "impostor" recognition attempts by comparing the same probe samples to models of different people. Which data is used for training, enrollment and probing is defined by the evaluation *protocol* of the database. The protocol also defines which models should be compared to which probes.
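As a rough illustration of how genuine and impostor attempts follow from a protocol, the sketch below pairs every model with every probe sample. The toy protocol, the sample names and the function ``list_comparisons`` are hypothetical and do not reflect the database interface of ``bob.bio``; a real protocol may also restrict which models are compared to which probes.

```python
def list_comparisons(model_ids, probes):
    """Split all (model, probe sample) pairs into genuine and impostor attempts.

    ``probes`` maps a person's ID to the list of that person's probe samples.
    """
    genuine, impostor = [], []
    for model_id in model_ids:
        for probe_id, samples in probes.items():
            # Same identity on both sides: genuine; otherwise: impostor
            target = genuine if probe_id == model_id else impostor
            target.extend((model_id, sample) for sample in samples)
    return genuine, impostor


# Hypothetical toy protocol: one enrolled model and one probe sample per person
model_ids = ["id1", "id2"]
probes = {"id1": ["id1_s3"], "id2": ["id2_s3"]}

genuine, impostor = list_comparisons(model_ids, probes)
```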
In ``bob.bio``, biometric recognition experiments are split up into four main stages, similar to the stages in a typical verification or identification system as illustrated in Fig. 1 and Fig. 2, respectively:
1. Data preprocessing
2. Feature extraction
3. Matching
4. Decision making
Each of these stages is discussed below:
Data Preprocessing:
~~~~~~~~~~~~~~~~~~~
Biometric measurements are often noisy, containing redundant information that is not necessary (and can be misleading) for recognition. For example, face images contain non-face background information, vein images can be unevenly illuminated, speech signals can be littered with background noise, etc. The aim of the data preprocessing stage is to clean up the raw biometric data so that it is in the best possible state to make recognition easier. For example, biometric data is cropped from the background, the images are photometrically enhanced, etc.
All the biometric samples in the input biometric database go through the preprocessing stage. The results are stored in a directory entitled "preprocessed". This process is illustrated in Fig. 3:
.. figure:: /img/preprocessor.svg
:align: center
Preprocessing stage in ``bob.bio``'s biometric recognition experiment framework.
Feature Extraction:
~~~~~~~~~~~~~~~~~~~
Although the preprocessing stage produces cleaner biometric data, the resulting data is usually very large and still contains much redundant information. The aim of the feature extraction stage is to extract features that are necessary for recognizing a person.
All the preprocessed biometric samples stored in the "preprocessed" directory go through the feature extraction stage. The results are stored in a directory entitled "extracted". This process is illustrated in Fig. 4:
.. figure:: /img/extractor.svg
:align: center
Feature extraction stage in ``bob.bio``'s biometric recognition experiment framework.
.. note::
Prior to the feature extraction there is an *optional* feature extractor training stage (to help the extractor to learn which features to extract) that uses the training data provided by the database.
Matching:
~~~~~~~~~
The matching stage in ``bob.bio`` is referred to as the "Algorithm". The Algorithm stage consists of three main parts:
(i) An optional "projection" stage after the feature extraction, as illustrated in Fig. 5. This would be used if, for example, you wished to project your extracted biometric features into a lower-dimensional subspace prior to recognition.
.. figure:: /img/algorithm_projection.svg
:align: center
The projection part of the Algorithm stage in ``bob.bio``'s biometric recognition experiment framework.
.. note::
In most cases when a feature projection is applied, there is a feature projection training stage that works on the training data provided by the database.
In the example above, prior to the "projection" stage, the subspace projection matrix is computed from the extracted training features.
(ii) Enrollment: The enrollment part of the Algorithm stage essentially works as follows. One or more biometric samples per person are used to compute a representative "model" for that person, which essentially represents that person's identity. To determine which of a person's biometric samples should be used to generate their model, we query the protocol of our input biometric database. The model is then calculated using the corresponding biometric features extracted in the Feature Extraction stage (or, optionally, our "projected" features). Fig. 6 illustrates the enrollment part of the Algorithm module:
.. figure:: /img/algorithm_enrollment.svg
:align: center
The enrollment part of the Algorithm stage in ``bob.bio``'s biometric recognition experiment framework.
.. note::
There is sometimes a model enroller training stage prior to enrollment, which uses the database's training data. This is only necessary when you are trying to fit an existing model to a set of biometric features, e.g., fitting a UBM (Universal Background Model) to features extracted from a speech signal. In other cases, the model is calculated from the features themselves, e.g., by averaging the feature vectors from multiple samples of the same biometric, in which case model enroller training is not necessary.
(iii) Scoring: The scoring part of the Algorithm stage essentially works as follows. Each model is associated with a number of probes, so we first query the input biometric database to determine which biometric samples should be used as the probes for each model. Every model is then compared to its associated probes (some of which come from the same person, and some of which come from different people), and a score is calculated for each comparison. The score describes the similarity between the model and the probe (higher scores indicate greater similarity); for example, it can be computed as a negative distance between the model and probe features. Ideally, if the model and probe come from the same biometric (e.g., two images of the same finger), they should be very similar, and if they come from different sources (e.g., two images of different fingers) then their similarity should be low. Fig. 7 illustrates the scoring part of the Algorithm module:
.. figure:: /img/algorithm_scoring.svg
:align: center
The scoring part of the Algorithm stage in ``bob.bio``'s biometric recognition experiment framework.
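The enrollment and scoring parts can be illustrated with the two simple choices mentioned above: a model computed as the average of a person's feature vectors, and a score computed as the negative distance between model and probe. The sketch is plain Python, not the ``bob.bio`` Algorithm interface; the function names are hypothetical and the Euclidean distance is an assumption.

```python
import math


def enroll(features):
    """Enroll a model by averaging equal-length feature vectors element-wise."""
    count = len(features)
    return [sum(values) / count for values in zip(*features)]


def score(model, probe):
    """Negative Euclidean distance: higher (closer to zero) means more similar."""
    return -math.dist(model, probe)


# Two hypothetical enrollment feature vectors of the same person
model = enroll([[1.0, 2.0], [3.0, 4.0]])

close_score = score(model, [2.0, 4.0])  # probe near the model
far_score = score(model, [5.0, 7.0])    # probe far from the model
```

Negating the distance keeps the convention that higher scores indicate greater similarity, so a single threshold rule works regardless of the distance measure behind the score.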
Decision Making:
~~~~~~~~~~~~~~~~
The decision making stage in ``bob.bio`` is referred to as "Evaluation". If we wish to perform *verification*, then the aim of this stage will be to make a decision as to whether each score calculated in the Matching stage indicates a "Match" or "No Match" between the particular model and probe biometrics. If we wish to perform *identification*, then the aim of the evaluation stage will be to find the model which most closely matches the probe biometric.
Once a decision has been made, we can quantify the overall performance of the particular biometric recognition system in terms of common metrics like the False Match Rate (FMR), False Non-Match Rate (FNMR), and Equal Error Rate (EER) for verification, and Identification Rate (IR) for identification. We can also view a visual representation of the performance in terms of plots like the Receiver Operating Characteristic (ROC) and Detection Error Trade-off (DET) for verification, Cumulative Match Characteristics (CMC) for closed-set identification, and Detection and Identification Rate (DIR) for open-set identification. Fig. 8 illustrates the Evaluation stage:
.. figure:: /img/evaluation.svg
:align: center
Evaluation stage in ``bob.bio``'s biometric recognition experiment framework.
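To make the verification metrics concrete, the following sketch computes FMR and FNMR at a given threshold and approximates the EER by trying every observed score as a threshold. It is not the implementation used by ``evaluate.py``; the function names and the example scores are made up, and a score equal to the threshold is assumed to count as a match.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) when scores >= threshold are declared a match."""
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr


def approximate_eer(genuine, impostor):
    """Pick the observed score where FMR and FNMR are closest to each other."""
    def gap(threshold):
        fmr, fnmr = error_rates(genuine, impostor, threshold)
        return abs(fmr - fnmr)

    threshold = min(sorted(set(genuine) | set(impostor)), key=gap)
    fmr, fnmr = error_rates(genuine, impostor, threshold)
    return threshold, (fmr + fnmr) / 2


# Hypothetical scores from genuine and impostor comparisons
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]

fmr, fnmr = error_rates(genuine_scores, impostor_scores, threshold=0.5)
eer_threshold, eer = approximate_eer(genuine_scores, impostor_scores)
```

With so few scores the EER is only a coarse estimate; sweeping the same threshold over many values is also what produces the points of a ROC or DET curve.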
.. note::
* The "Data Preprocessing" to "Matching" steps are carried out by ``bob.bio.base``'s ``verify.py`` script. The "Decision Making" step is carried out by ``bob.bio.base``'s ``evaluate.py`` script. These scripts will be discussed in the next sections.
* The communication between any two steps in the recognition framework is file-based, usually using a binary HDF5_ interface, which is implemented, for example, in the :py:class:`bob.io.base.HDF5File` class. One exception is the "Decision Making" step, which uses score files in text format. This makes it possible to incorporate results from other systems that are computed outside of ``bob.bio`` but use the same database and evaluation protocol.
* The output of one step usually serves as the input of the subsequent step(s), as portrayed in Fig. 3 -- Fig. 8.
* ``bob.bio`` ensures that the correct files are always forwarded to the subsequent steps. For example, if you choose to implement a feature projection after the feature extraction stage, as illustrated in Fig. 5, ``bob.bio`` will make sure that the files in the "projected" directory are passed on as the input to the Enrollment stage; otherwise, the "extracted" directory will become the input to the Enrollment stage.
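The forwarding rule in the last point amounts to a simple choice of input directory. The helper below is purely hypothetical and only mirrors the behaviour described above; it is not how ``bob.bio`` is implemented, and the ``results`` directory name is an assumption.

```python
import os


def enrollment_input_dir(result_dir, uses_projection):
    """Return the directory whose files feed the Enrollment stage."""
    # Projected features take precedence when a projection stage is configured
    subdir = "projected" if uses_projection else "extracted"
    return os.path.join(result_dir, subdir)


with_projection = enrollment_input_dir("results", uses_projection=True)
without_projection = enrollment_input_dir("results", uses_projection=False)
```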
.. include:: links.rst