Tiago de Freitas Pereira authored
Implementation Details
The ``bob.bio`` module is specifically designed to be as flexible as possible while keeping things simple.
Therefore, it uses Python to implement tools such as preprocessors, feature extractors and recognition algorithms.
It is file based, so any tool can implement its own way of reading and writing data, features or models.
Configurations are stored in configuration files, so it should be easy to test different parameters of your algorithms without modifying the code.
Base Classes
All tools implemented in the ``bob.bio`` packages are based on a small set of base classes, which are defined in the ``bob.bio.base`` package and detailed below.
Most of the functionality is provided in the base classes, but any function can be overridden in the derived class implementations.
In the derived class constructors, the base class constructor needs to be called.
To trace the algorithms automatically, all parameters that are passed to the derived class constructor should be passed on to the base class constructor as keyword arguments (indicated by ``...`` below).
This ensures that all parameters of the experiments are stored in the ``Experiment.info`` file.
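
The constructor convention described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: ``TracedBase`` is a hypothetical stand-in for a ``bob.bio.base`` base class, and ``ScalePreprocessor`` with its ``factor`` and ``offset`` parameters is invented for the example.

```python
class TracedBase:
    """Hypothetical stand-in for a base class that records its keyword arguments."""

    def __init__(self, **kwargs):
        # Keep every parameter so that it could later be written to an info file.
        self._kwargs = dict(kwargs)

    def __str__(self):
        params = ", ".join(f"{key}={value!r}" for key, value in sorted(self._kwargs.items()))
        return f"{type(self).__name__}({params})"


class ScalePreprocessor(TracedBase):
    """Hypothetical derived tool with two parameters."""

    def __init__(self, factor=2.0, offset=0.0):
        # Forward all constructor parameters to the base class as keyword arguments.
        TracedBase.__init__(self, factor=factor, offset=offset)
        self.factor = factor
        self.offset = offset


tool = ScalePreprocessor(factor=0.5)
description = str(tool)
print(description)
```

Because every parameter reaches the base class, the textual description of the tool contains the full configuration, including defaults that the caller did not set explicitly.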
Note
All tools are based on reading, processing and writing files.
By default, any type of file is allowed to be handled, and file names are provided to the ``read_...`` and ``write_...`` functions as strings.
However, some of the extensions -- particularly the :ref:`bob.bio.video <bob.bio.video>` extension -- require the read and write functions to handle files of type :py:class:`bob.io.base.HDF5File`.
If you plan to write your own tools, please ensure that you follow the structure described below.
Preprocessors
All preprocessor classes are derived from :py:class:`bob.bio.base.preprocessor.Preprocessor`. All of them implement the following two functions:
- ``__init__(self, <parameters>)``: Initializes the preprocessing algorithm with the parameters it needs. The base class constructor is called in the derived class constructor, e.g. as ``bob.bio.base.preprocessor.Preprocessor.__init__(self, ...)``.
- ``__call__(self, original_data, annotations) -> data``: Preprocesses the data given the dictionary of annotations (e.g. ``{'reye' : [re_y, re_x], 'leye': [le_y, le_x]}`` for face images).

  Note
  When the database does not provide annotations, the ``annotations`` parameter might be ``None``.
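
The two functions above could look roughly like this. The sketch makes several assumptions: ``PreprocessorStandIn`` is a hypothetical replacement for the real base class, ``EyeCropper`` and its ``margin`` parameter are invented, and nested Python lists are used in place of :py:class:`numpy.ndarray` so that the example runs without extra dependencies.

```python
class PreprocessorStandIn:
    """Hypothetical stand-in for the preprocessor base class."""

    def __init__(self, **kwargs):
        self._kwargs = dict(kwargs)


class EyeCropper(PreprocessorStandIn):
    """Hypothetical preprocessor that crops the rows and columns around the eyes."""

    def __init__(self, margin=1):
        # Call the base class constructor with all parameters as keywords.
        PreprocessorStandIn.__init__(self, margin=margin)
        self.margin = margin

    def __call__(self, original_data, annotations=None):
        # Databases without annotations pass None, so return the data unchanged.
        if annotations is None:
            return original_data
        ys = [annotations["reye"][0], annotations["leye"][0]]
        xs = [annotations["reye"][1], annotations["leye"][1]]
        top = max(min(ys) - self.margin, 0)
        bottom = max(ys) + self.margin + 1
        left = max(min(xs) - self.margin, 0)
        right = max(xs) + self.margin + 1
        return [row[left:right] for row in original_data[top:bottom]]


# A 5x6 "image" whose pixel value encodes its position as 10 * y + x.
image = [[10 * y + x for x in range(6)] for y in range(5)]
cropper = EyeCropper(margin=1)
cropped = cropper(image, {"reye": [2, 1], "leye": [2, 4]})
unchanged = cropper(image, None)
```

Handling the ``None`` case first keeps the preprocessor usable with databases that do not ship annotations.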
By default, the data returned by the preprocessor is of type :py:class:`numpy.ndarray`. In that case, the base class IO functionality can be used. If a class returns data that is not of type :py:class:`numpy.ndarray`, it overrides further functions from :py:class:`bob.bio.base.preprocessor.Preprocessor` that define the IO of the class:
- ``write_data(data, data_file)``: Writes the given data (that has been generated using the ``__call__`` function of this class) to file.
- ``read_data(data_file)``: Reads the preprocessed data from file.
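
A minimal sketch of such custom IO is given below. It assumes a hypothetical ``LandmarkPreprocessor`` whose result is a dictionary, and it stores that result as JSON; the storage format is a choice made for this example only, and the base class is left out for brevity.

```python
import json
import os
import tempfile


class LandmarkPreprocessor:
    """Hypothetical preprocessor whose output is a dict, not an array."""

    def __call__(self, original_data, annotations=None):
        # Return a structured result that default array IO could not store.
        return {"rows": len(original_data), "annotations": annotations or {}}

    def write_data(self, data, data_file):
        # Persist the result of __call__ under the given file name.
        with open(data_file, "w") as handle:
            json.dump(data, handle)

    def read_data(self, data_file):
        # Restore exactly what write_data stored.
        with open(data_file) as handle:
            return json.load(handle)


preprocessor = LandmarkPreprocessor()
data = preprocessor([[0, 1], [2, 3]], {"reye": [1, 0], "leye": [1, 1]})

with tempfile.TemporaryDirectory() as directory:
    data_file = os.path.join(directory, "sample.json")
    preprocessor.write_data(data, data_file)
    restored = preprocessor.read_data(data_file)
```

The important property is the round trip: whatever ``write_data`` stores, ``read_data`` must be able to return in the form that later processing steps expect.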
By default, the original data is read by :py:func:`bob.io.base.load`. Hence, data is given as a :py:class:`numpy.ndarray`. When a different IO for the original data is required (for example to read videos in :py:class:`bob.bio.video.preprocessor.Video`), the following function is overridden:
- ``read_original_data(filename)``: Reads the original data from file.
Extractors
Feature extractors should be derived from the :py:class:`bob.bio.base.extractor.Extractor` class. All extractor classes provide at least the functions:
- ``__init__(self, <parameters>)``: Initializes the feature extraction algorithm with the parameters it needs. Calls the base class constructor, e.g. as ``bob.bio.base.extractor.Extractor.__init__(self, ...)`` (there are more parameters to this constructor, see below).
- ``__call__(self, data) -> feature``: Extracts the feature from the given preprocessed data. By default, the returned feature should be a :py:class:`numpy.ndarray`.
If features are not of type :py:class:`numpy.ndarray`, the ``write_feature`` function is overridden.
In this case, the function to read that kind of feature also needs to be overridden:
- ``write_feature(self, feature, feature_file)``: Writes the feature (as returned by the ``__call__`` function) to the given file name.
- ``read_feature(self, feature_file) -> feature``: Reads the feature (as written by the ``write_feature`` function) from the given file name.
Note
If the feature is of a class that contains and is written via a ``save(bob.io.base.HDF5File)`` method, the ``write_feature`` function does not need to be overridden.
However, the ``read_feature`` function is required in this case.
If the feature extraction process needs to read a trained extractor model from file, the following function is overridden:
- ``load(self, extractor_file)``: Loads the extractor from file. This function is called at least once before the ``__call__`` function is executed.
It is also possible to train the extractor model before it is used.
In this case, two things are done.
First, the ``train`` function is overridden:
- ``train(self, image_list, extractor_file)``: Trains the feature extractor with the given list of images and writes the ``extractor_file``.
Second, this behavior is registered in the ``__init__`` function by calling the base class constructor with more parameters: ``bob.bio.base.extractor.Extractor.__init__(self, requires_training=True, ...)``.
If the training algorithm needs the training data split by identity, ``bob.bio.base.extractor.Extractor.__init__(self, requires_training=True, split_training_images_by_client = True, ...)`` is used instead.
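
The training workflow for extractors might be sketched as below. ``ExtractorStandIn`` is a hypothetical stand-in that only mirrors the constructor flags named above, ``MeanSubtractor`` is an invented extractor, and the JSON file format and plain Python lists are assumptions made to keep the example dependency free.

```python
import json
import os
import tempfile


class ExtractorStandIn:
    """Hypothetical stand-in mirroring the constructor flags of the extractor base class."""

    def __init__(self, requires_training=False, split_training_images_by_client=False, **kwargs):
        self.requires_training = requires_training
        self.split_training_images_by_client = split_training_images_by_client
        self._kwargs = dict(kwargs)


class MeanSubtractor(ExtractorStandIn):
    """Hypothetical extractor that subtracts the mean of the training data."""

    def __init__(self):
        # Register that this extractor must be trained before it is used.
        ExtractorStandIn.__init__(self, requires_training=True)
        self.mean = None

    def train(self, image_list, extractor_file):
        # Compute the per-dimension mean and write it to the extractor file.
        count = len(image_list)
        mean = [sum(values) / count for values in zip(*image_list)]
        with open(extractor_file, "w") as handle:
            json.dump(mean, handle)

    def load(self, extractor_file):
        # Called before __call__ so that the trained model is available.
        with open(extractor_file) as handle:
            self.mean = json.load(handle)

    def __call__(self, data):
        return [value - mean for value, mean in zip(data, self.mean)]


with tempfile.TemporaryDirectory() as directory:
    extractor_file = os.path.join(directory, "extractor.json")
    MeanSubtractor().train([[1.0, 2.0], [3.0, 6.0]], extractor_file)
    extractor = MeanSubtractor()
    extractor.load(extractor_file)

feature = extractor([2.0, 5.0])
```

Training and extraction use separate instances here to emphasize that the trained state travels only through the extractor file.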
Algorithms
The implementation of recognition algorithms is just as straightforward. All algorithms are derived from the :py:class:`bob.bio.base.algorithm.Algorithm` class. The constructor of this class has the following options, which are selected according to the current algorithm:
- ``performs_projection``: If set to ``True``, features will be projected using the ``project`` function. With the default ``False``, the ``project`` function will not be called at all.
- ``requires_projector_training``: If ``performs_projection`` is enabled, this flag specifies if the projector needs training. If ``True`` (the default), the ``train_projector`` function will be called.
- ``split_training_features_by_client``: If the projector training needs training images split up by client identity, this flag is enabled. In this case, the ``train_projector`` function will receive a list of lists of features. If set to ``False`` (the default), the training features are given in one list.
- ``use_projected_features_for_enrollment``: If features are projected, by default (``True``) models are enrolled using the projected features. If the algorithm requires the original unprojected features to enroll the model, ``use_projected_features_for_enrollment=False`` is selected.
- ``requires_enroller_training``: Enables the enroller training. By default (``False``), no enroller training is performed, i.e., the ``train_enroller`` function is not called.
- ``multiple_model_scoring``: The way to handle scoring when models store several features. Set this parameter to ``None`` when you implement your own functionality to handle models from several features (see below).
- ``multiple_probe_scoring``: The way to handle scoring when several probe features should be combined into one score. Set this parameter to ``None`` when you handle scoring with multiple probes with your own ``score_for_multiple_probes`` function (see below).
A recognition algorithm has to override at least three functions:
- ``__init__(self, <parameters>)``: Initializes the recognition algorithm with the parameters it needs. Calls the base class constructor, e.g. as ``bob.bio.base.algorithm.Algorithm.__init__(self, ...)`` (there are more parameters to this constructor, see above).
- ``enroll(self, enroll_features) -> model``: Enrolls a model from the given list of features (this list usually contains features from several files of one subject) and returns it. The returned model is either a :py:class:`numpy.ndarray` or an instance of a class that defines a ``save(bob.io.base.HDF5File)`` method. If neither of the two options is appropriate, a ``write_model`` function is defined (see below).
- ``score(self, model, probe) -> value``: Computes a similarity or probability score that the given probe feature and the given model stem from the same identity.

  Note
  When you use a distance measure in your scoring function, and lower distances represent higher probabilities of having the same identity, please return the negative distance.
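
The three required functions can be illustrated with a small sketch. It assumes a hypothetical ``AlgorithmStandIn`` in place of the real base class and an invented ``MeanDistance`` algorithm that enrolls the mean of the features and scores with the negative Euclidean distance; plain lists stand in for :py:class:`numpy.ndarray`.

```python
import math


class AlgorithmStandIn:
    """Hypothetical stand-in for the algorithm base class."""

    def __init__(self, **kwargs):
        self._kwargs = dict(kwargs)


class MeanDistance(AlgorithmStandIn):
    """Hypothetical algorithm: the model is the mean of the enrollment features."""

    def __init__(self):
        AlgorithmStandIn.__init__(self)

    def enroll(self, enroll_features):
        # Average the features of one subject, dimension by dimension.
        count = len(enroll_features)
        return [sum(values) / count for values in zip(*enroll_features)]

    def score(self, model, probe):
        # Lower distance means a better match, so return the negative distance.
        return -math.dist(model, probe)


algorithm = MeanDistance()
model = algorithm.enroll([[0.0, 0.0], [2.0, 0.0]])
close = algorithm.score(model, [1.0, 1.0])
far = algorithm.score(model, [4.0, 4.0])
```

Negating the distance keeps the convention that a higher score indicates a more likely match, so ``close`` is greater than ``far``.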
Additionally, an algorithm may need to project the features before they can be used for enrollment or recognition. In this case, (some of) the following functions are overridden:
- ``train_projector(self, train_features, projector_file)``: Uses the given list of features and writes the ``projector_file``.

  Warning
  If you write this function, please ensure that you use both ``performs_projection=True`` and ``requires_projector_training=True`` (the latter is the default, the former is not) during the base class constructor call in your ``__init__`` function. If you need the training data to be sorted by clients, please use ``split_training_features_by_client=True`` as well. Please also ensure that you override the ``project`` function.
- ``load_projector(self, projector_file)``: Loads the projector from the given file, i.e., as stored by ``train_projector``. This function is always called before the ``project``, ``enroll``, and ``score`` functions are executed.
- ``project(self, feature) -> feature``: Projects the given feature and returns the projected feature, which should either be a :py:class:`numpy.ndarray` or an instance of a class that defines a ``save(bob.io.base.HDF5File)`` method.

  Note
  If you write this function, please ensure that you use ``performs_projection=True`` during the base class constructor call in your ``__init__`` function.
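
How the projector functions fit together might be sketched as follows. The stand-in base class only mirrors the two constructor flags mentioned in the warning, ``ScaledFeatures`` is an invented algorithm that rescales each dimension by the largest absolute training value, and the JSON projector file is an assumption of this example.

```python
import json
import os
import tempfile


class AlgorithmStandIn:
    """Hypothetical stand-in mirroring two constructor flags of the algorithm base class."""

    def __init__(self, performs_projection=False, requires_projector_training=True, **kwargs):
        self.performs_projection = performs_projection
        self.requires_projector_training = requires_projector_training
        self._kwargs = dict(kwargs)


class ScaledFeatures(AlgorithmStandIn):
    """Hypothetical algorithm whose projection divides by a trained scale."""

    def __init__(self):
        # Both flags are passed explicitly, as the warning above recommends.
        AlgorithmStandIn.__init__(self, performs_projection=True, requires_projector_training=True)
        self.scale = None

    def train_projector(self, train_features, projector_file):
        # Largest absolute value per dimension; fall back to 1.0 to avoid dividing by zero.
        scale = [max(abs(value) for value in column) or 1.0 for column in zip(*train_features)]
        with open(projector_file, "w") as handle:
            json.dump(scale, handle)

    def load_projector(self, projector_file):
        with open(projector_file) as handle:
            self.scale = json.load(handle)

    def project(self, feature):
        return [value / scale for value, scale in zip(feature, self.scale)]


with tempfile.TemporaryDirectory() as directory:
    projector_file = os.path.join(directory, "projector.json")
    ScaledFeatures().train_projector([[2.0, -4.0], [1.0, 8.0]], projector_file)
    algorithm = ScaledFeatures()
    algorithm.load_projector(projector_file)

projected = algorithm.project([1.0, 4.0])
```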
And once more, if the projected feature is not of type :py:class:`numpy.ndarray`, the following methods are overridden:
- ``write_feature(feature, feature_file)``: Writes the feature (as returned by the ``project`` function) to file.
- ``read_feature(feature_file) -> feature``: Reads and returns the feature (as written by the ``write_feature`` function).
Some tools also need to train the model enrollment functionality (in short, the ``enroller``).
In this case, these functions are overridden:
- ``train_enroller(self, training_features, enroller_file)``: Trains the model enrollment with the list of lists of features and writes the ``enroller_file``.

  Note
  If you write this function, please ensure that you use ``requires_enroller_training=True`` during the base class constructor call in your ``__init__`` function.
- ``load_enroller(self, enroller_file)``: Loads the enroller from file. This function is always called before the ``enroll`` and ``score`` functions are executed.
By default, it is assumed that both the models and the probe features are of type :py:class:`numpy.ndarray`.
If the ``score`` function expects models and probe features to be of a different type, these functions are overridden:
- ``write_model(self, model, model_file)``: Writes the model (as returned by the ``enroll`` function).
- ``read_model(self, model_file) -> model``: Reads the model (as written by the ``write_model`` function) from file.
- ``read_probe(self, probe_file) -> feature``: Reads the probe feature from file.

  Note
  In many cases, the ``read_feature`` and ``read_probe`` functions are identical (if both are present).
Finally, the :py:class:`bob.bio.base.algorithm.Algorithm` class provides default implementations for the case that models store several features, or that several probe features should be combined into one score. These two functions are:
- ``score_for_multiple_models(self, models, probe)``: If your model stores several features, call this function to compute the average (or min, max, ...) of the scores.
- ``score_for_multiple_probes(self, model, probes)``: By default, the average (or min, max, ...) of the scores for all probes is computed. Override this function if you want different behavior.
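
The idea behind these two default functions can be sketched as below. This is not the library implementation: ``MultiScoreStandIn`` is hypothetical, the string values ``"average"``, ``"min"`` and ``"max"`` are assumed spellings for the strategies named above, and models and probes are single numbers to keep the arithmetic easy to follow.

```python
from statistics import mean


class MultiScoreStandIn:
    """Hypothetical stand-in that fuses several scores into one."""

    # Assumed names for the fusion strategies mentioned in the text.
    REDUCERS = {"average": mean, "min": min, "max": max}

    def __init__(self, multiple_model_scoring="average", multiple_probe_scoring="average"):
        self.model_reducer = self.REDUCERS[multiple_model_scoring]
        self.probe_reducer = self.REDUCERS[multiple_probe_scoring]

    def score(self, model, probe):
        # Negative absolute difference: higher means more similar.
        return -abs(model - probe)

    def score_for_multiple_models(self, models, probe):
        # Score the probe against every stored feature, then fuse the scores.
        return self.model_reducer([self.score(model, probe) for model in models])

    def score_for_multiple_probes(self, model, probes):
        # Score every probe against the model, then fuse the scores.
        return self.probe_reducer([self.score(model, probe) for probe in probes])


averaging = MultiScoreStandIn()
best_match = MultiScoreStandIn(multiple_model_scoring="max")

average_score = averaging.score_for_multiple_models([1.0, 5.0], 2.0)
max_score = best_match.score_for_multiple_models([1.0, 5.0], 2.0)
probe_score = averaging.score_for_multiple_probes(2.0, [2.0, 4.0])
```

Choosing ``"max"`` rewards the single best matching stored feature, while ``"average"`` is more robust to one outlying feature.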
Implemented Tools
In the ``bob.bio.base`` package, only one feature extractor and some recognition algorithms are defined.
However, implementations of the base classes can be found in all of the ``bob.bio`` packages.
Here is a list of implementations:
- :ref:`bob.bio.base <bob.bio.base>` : :ref:`bob.bio.base.implemented`
- :ref:`bob.bio.face <bob.bio.face>` : :ref:`bob.bio.face.implemented`
- :ref:`bob.bio.video <bob.bio.video>` : :ref:`bob.bio.video.implemented`
- :ref:`bob.bio.gmm <bob.bio.gmm>` : :ref:`bob.bio.gmm.implemented`