Commit ecfcbff8 authored by Amir MOHAMMADI's avatar Amir MOHAMMADI
spell check and nitpick

parent e307cd16
Merge request !24: Re-write the user guide
*Machines* and *trainers* are the core components of Bob's machine learning
packages. *Machines* represent statistical models or other functions defined by
parameters that can be learned by *trainers* or manually set. Below you will
find machine/trainer guides for learning techniques available in this package.
K-Means
=======
:math:`\mu` is a given mean (also called centroid) and
:math:`x_i` is an observation.
This implementation has two stopping criteria. The first one is when the
maximum number of iterations is reached; the second one is when the difference
between the values of :math:`J` in successive iterations is lower than a
convergence threshold.
In this implementation, the training consists of the definition of the
statistical model, called a machine (:py:class:`bob.learn.em.KMeansMachine`);
this statistical model is learned via a trainer
(:py:class:`bob.learn.em.KMeansTrainer`).
Below is a snippet on how to train a K-Means machine using Bob_.
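A minimal sketch of such a snippet is given below (made-up toy data; the
``convergence_threshold`` keyword of :py:func:`bob.learn.em.train` is an
assumption to verify against the API documentation):

.. code-block:: python

   import numpy
   import bob.learn.em

   # Toy data: two well-separated clusters in a 3-dimensional feature space
   data = numpy.array([[ 3.0, -3.0,  100.0],
                       [ 4.0, -4.0,   98.0],
                       [ 3.5, -3.5,   99.0],
                       [-7.0,  7.0, -100.0],
                       [-5.0,  5.0, -101.0]], dtype='float64')

   # Statistical model: 2 means (centroids) of dimension 3
   kmeans_machine = bob.learn.em.KMeansMachine(2, 3)
   kmeans_trainer = bob.learn.em.KMeansTrainer()

   # Run the k-means loop until convergence or until max_iterations is reached
   bob.learn.em.train(kmeans_trainer, kmeans_machine, data,
                      max_iterations=200, convergence_threshold=1e-5)

   print(kmeans_machine.means)  # learned centroids, one per row
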
A very nice explanation of the EM algorithm for maximum likelihood estimation
can be found in this
`Mathematical Monk <https://www.youtube.com/watch?v=AnbiNaVp3eQ>`_ YouTube
video.
Below is a snippet on how to train a GMM using the maximum likelihood
estimator.
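A rough sketch of such a training run is shown below; the
``ML_GMMTrainer(update_means, update_variances, update_weights)`` constructor
arguments are assumptions to verify against the API documentation:

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)
   # Toy data drawn from two Gaussian blobs in a 3-dimensional space
   data = numpy.vstack((numpy.random.normal(0, 0.5, (50, 3)),
                        numpy.random.normal(3, 0.5, (50, 3))))

   # GMM with 2 Gaussian components of dimension 3
   gmm_machine = bob.learn.em.GMMMachine(2, 3)
   # Break the symmetry of the initial means (e.g. taken from a k-means run)
   gmm_machine.means = numpy.array([[0., 0., 0.], [3., 3., 3.]])

   # Update means, variances and weights during the maximum likelihood EM loop
   ml_trainer = bob.learn.em.ML_GMMTrainer(True, True, True)
   bob.learn.em.train(ml_trainer, gmm_machine, data, max_iterations=200,
                      convergence_threshold=1e-5)

   print(gmm_machine.means)
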
A compact way to write relevance MAP adaptation is by using GMM supervector
notation (this will be useful in the next subsections). The GMM supervector
notation consists of taking the parameters of :math:`\Theta` (weights, means
and covariance matrices) of a GMM and create a single vector or matrix to
represent each of them. For each Gaussian component :math:`c`, we can
represent the MAP adaptation as the following :math:`\mu_i = m + d_i`, where
:math:`m` is our prior and :math:`d_i` is the class offset.
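As a rough illustration, a MAP adaptation of the means towards new data could
be sketched as follows (the ``MAP_GMMTrainer`` constructor arguments, such as
``relevance_factor`` and the ``update_*`` flags, are assumptions to verify
against the API documentation):

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)

   # Prior GMM (the supervector m): 2 Gaussians of dimension 3
   prior_gmm = bob.learn.em.GMMMachine(2, 3)
   prior_gmm.means = numpy.array([[0., 0., 0.], [1., 1., 1.]])

   # Class-specific adaptation data
   data = numpy.random.normal(1.5, 0.3, (20, 3))

   # Machine that will hold the adapted model; only the means are updated
   map_machine = bob.learn.em.GMMMachine(2, 3)
   map_trainer = bob.learn.em.MAP_GMMTrainer(prior_gmm, relevance_factor=4,
                                             update_means=True,
                                             update_variances=False,
                                             update_weights=False)
   bob.learn.em.train(map_trainer, map_machine, data, max_iterations=200,
                      convergence_threshold=1e-5)

   # The class offset d_i is the difference w.r.t. the prior means
   print(map_machine.means - prior_gmm.means)
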
Given a GMM (:math:`\Theta`) and a set of samples :math:`x_t`, this component
accumulates statistics for each Gaussian component :math:`c` (the zeroth,
first and second order GMM statistics of [Reynolds2000]_).
Below is a one-to-one relationship between the statistics in [Reynolds2000]_
and the properties of :py:class:`bob.learn.em.GMMStats`:
... [-0.3, -0.1, 0],
... [1.2, 1.4, 1],
... [0.8, 1., 1]], dtype='float64')
>>> # Creating a fake prior with 2 Gaussians of dimension 3
>>> prior_gmm = bob.learn.em.GMMMachine(2, 3)
>>> prior_gmm.means = numpy.vstack((numpy.random.normal(0, 0.5, (1, 3)),
... numpy.random.normal(1, 0.5, (1, 3))))
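>>> # A sketch of the remaining steps (assumed API; verify against the
>>> # documentation): accumulate the statistics of `data` w.r.t. the prior GMM
>>> gmm_stats = bob.learn.em.GMMStats(2, 3)
>>> prior_gmm.acc_statistics(data, gmm_stats)
>>> # gmm_stats.n, gmm_stats.sum_px and gmm_stats.sum_pxx now hold the zeroth,
>>> # first and second order statistics, respectively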
Inter-Session Variability
=========================
.. _isv:
Inter-Session Variability (ISV) modeling [3]_ [2]_ is a session variability
modeling technique built on top of the Gaussian mixture modeling approach. It
hypothesizes that within-class variations are embedded in a linear subspace in
the GMM means subspace and these variations can be suppressed by an offset w.r.t
each mean during the MAP adaptation.
In this generative model each sample is assumed to have been generated by a GMM
mean supervector with the following shape:
:math:`\mu_{i, j} = m + Ux_{i, j} + D_z{i}`, where :math:`m` is the prior mean
and :math:`D_z{i}` is the class offset (with all session effects suppressed).
All possible sources of session variations are embedded in this matrix
:math:`U`. Below is an intuition of what is modeled with :math:`U` in the
Iris flower `dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_.
The arrows :math:`U_{1}`, :math:`U_{2}` and :math:`U_{3}` are the directions of
the within class variations, with respect to each Gaussian component, that will
be suppressed a posteriori.
.. plot:: plot/plot_ISV.py
The ISV statistical model is stored in this container
:py:class:`bob.learn.em.ISVBase` and the training is performed by
:py:class:`bob.learn.em.ISVTrainer`. The snippet below shows how to train an
Inter-Session Variability model.
.. doctest::
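A minimal sketch of such a training run is given below, assuming the
:py:func:`bob.learn.em.train_jfa` helper and the ``ISVBase``/``ISVTrainer``
constructor arguments shown here (verify the exact signatures against the API
documentation):

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)

   # Two classes, 10 samples each, in a 3-dimensional feature space
   data = [numpy.random.normal(0.0, 0.2, (10, 3)),
           numpy.random.normal(1.0, 0.2, (10, 3))]

   # Prior GMM (UBM) with 2 Gaussians of dimension 3
   prior_gmm = bob.learn.em.GMMMachine(2, 3)
   prior_gmm.means = numpy.array([[0., 0., 0.], [1., 1., 1.]])

   # Accumulate one GMMStats per sample, grouped by class
   gmm_stats = []
   for class_data in data:
       class_stats = []
       for sample in class_data:
           stats = bob.learn.em.GMMStats(2, 3)
           prior_gmm.acc_statistics(sample.reshape(1, -1), stats)
           class_stats.append(stats)
       gmm_stats.append(class_stats)

   # Within-class subspace U of rank 2 on top of the prior GMM
   isv_base = bob.learn.em.ISVBase(prior_gmm, 2)
   isv_trainer = bob.learn.em.ISVTrainer(relevance_factor=4.0)
   bob.learn.em.train_jfa(isv_trainer, isv_base, gmm_stats, max_iterations=50)

   print(isv_base.u)  # the learned within-class subspace
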
Joint Factor Analysis
=====================
Joint Factor Analysis (JFA) [1]_ [2]_ is an extension of ISV. Besides the
within-class assumption (modeled with :math:`U`), it also hypothesizes that
between-class variations are embedded in a low rank rectangular matrix
:math:`V`. In the supervector notation, this modeling has the following shape:
:math:`\mu_{i, j} = m + Ux_{i, j} + Vy_{i} + D_z{i}`.
Below is an intuition of what is modeled with :math:`U` and :math:`V` in
the Iris flower
`dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_. The arrows
:math:`V_{1}`, :math:`V_{2}` and :math:`V_{3}` are the directions of the
between class variations with respect to each Gaussian component that will be
added a posteriori.
The JFA statistical model is stored in this container
:py:class:`bob.learn.em.JFABase` and the training is performed by
:py:class:`bob.learn.em.JFATrainer`. The snippet below shows how to train a
Joint Factor Analysis model.
.. doctest::
:options: +NORMALIZE_WHITESPACE
>>> data_class2 = numpy.random.normal(-0.2, 0.2, (10, 3))
>>> data = [data_class1, data_class2]
>>> # Creating a fake prior with 2 Gaussians of dimension 3
>>> prior_gmm = bob.learn.em.GMMMachine(2, 3)
>>> prior_gmm.means = numpy.vstack((numpy.random.normal(0, 0.5, (1, 3)),
... numpy.random.normal(1, 0.5, (1, 3))))
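>>> # A sketch of the remaining steps (assumed API; verify against the
>>> # documentation): accumulate one GMMStats per sample, grouped by class
>>> gmm_stats = []
>>> for class_data in data:
...     class_stats = []
...     for sample in class_data:
...         stats = bob.learn.em.GMMStats(2, 3)
...         prior_gmm.acc_statistics(sample.reshape(1, -1), stats)
...         class_stats.append(stats)
...     gmm_stats.append(class_stats)
>>> # U and V subspaces of rank 2 each, on top of the 2-Gaussian prior
>>> jfa_base = bob.learn.em.JFABase(prior_gmm, 2, 2)
>>> jfa_trainer = bob.learn.em.JFATrainer()
>>> bob.learn.em.train_jfa(jfa_trainer, jfa_base, gmm_stats, max_iterations=50)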
Total variability Modelling
===========================
.. _ivector:
Total Variability (TV) modeling [4]_ is a front-end initially introduced for
speaker recognition, which aims at describing samples by vectors of low
dimensionality called ``i-vectors``. The model consists of a subspace :math:`T`
and a residual diagonal covariance matrix :math:`\Sigma`, that are then used to
extract i-vectors, and is built upon the GMM approach. In the supervector
notation this modeling has the following shape: :math:`\mu = m + Tv`.
Below is an intuition of the data from the Iris flower
`dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_, embedded in
the iVector space.
The iVector statistical model is stored in this container
:py:class:`bob.learn.em.IVectorMachine` and the training is performed by
:py:class:`bob.learn.em.IVectorTrainer`. The snippet below shows how to train
a Total Variability model.
.. doctest::
:options: +NORMALIZE_WHITESPACE
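A minimal sketch of such a training run is given below; the
:py:class:`bob.learn.em.IVectorMachine` and
:py:class:`bob.learn.em.IVectorTrainer` constructor arguments and the
``project`` call are assumptions to verify against the API documentation:

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)

   # Prior GMM (UBM) with 2 Gaussians of dimension 3
   ubm = bob.learn.em.GMMMachine(2, 3)
   ubm.means = numpy.array([[0., 0., 0.], [1., 1., 1.]])

   # One GMMStats per sample (TV training does not use class labels)
   samples = numpy.random.normal(0.5, 0.5, (20, 3))
   gmm_stats = []
   for sample in samples:
       stats = bob.learn.em.GMMStats(2, 3)
       ubm.acc_statistics(sample.reshape(1, -1), stats)
       gmm_stats.append(stats)

   # T subspace of rank 2 (the i-vector dimensionality)
   ivector_machine = bob.learn.em.IVectorMachine(ubm, 2)
   ivector_trainer = bob.learn.em.IVectorTrainer(update_sigma=True)
   bob.learn.em.train(ivector_trainer, ivector_machine, gmm_stats,
                      max_iterations=100)

   # Extract the i-vector of one sample from its GMM statistics
   ivector = ivector_machine.project(gmm_stats[0])
   print(ivector)
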
Given the mean :math:`\mu`, the between-class subspace :math:`F`, the
within-class subspace :math:`G` and a diagonal covariance matrix
:math:`\Sigma`, the model assumes that a sample :math:`x_{i,j}` of class
:math:`i` is generated as

.. math::

   x_{i,j} = \mu + F h_{i} + G w_{i,j} + \epsilon_{i,j}
An Expectation-Maximization algorithm can be used to learn the parameters of
this model :math:`\mu`, :math:`F` :math:`G` and :math:`\Sigma`. As these
parameters can be shared between classes, there is a specific container class
for this purpose, which is :py:class:`bob.learn.em.PLDABase`. The process is
sketched further below.
The log-likelihood of a set of samples can be obtained by calling the
:py:meth:`bob.learn.em.PLDAMachine.compute_log_likelihood` method:
>>> loglike = plda.compute_log_likelihood(samples)
If separate models for different classes need to be enrolled, each of them with
a set of enrollment samples, then several instances of
:py:class:`bob.learn.em.PLDAMachine` need to be created and enrolled using
the :py:meth:`bob.learn.em.PLDATrainer.enroll()` method as follows.
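A minimal sketch of this enrollment and scoring workflow is given below; the
``PLDABase(dim_d, dim_f, dim_g)`` constructor arguments and the use of the
generic :py:func:`bob.learn.em.train` helper are assumptions to verify against
the API documentation:

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)

   # Two classes, 10 samples each, feature dimension 3
   data = [numpy.random.normal(0.0, 0.2, (10, 3)),
           numpy.random.normal(1.0, 0.2, (10, 3))]

   # Shared model: feature dimension 3, rank-1 F and G subspaces
   plda_base = bob.learn.em.PLDABase(3, 1, 1)
   plda_trainer = bob.learn.em.PLDATrainer()
   bob.learn.em.train(plda_trainer, plda_base, data, max_iterations=10)

   # Enroll one PLDAMachine per class with its enrollment samples
   plda_class0 = bob.learn.em.PLDAMachine(plda_base)
   plda_trainer.enroll(plda_class0, data[0])

   # Log-likelihood of probe samples against the enrolled class
   probes = numpy.random.normal(0.0, 0.2, (5, 3))
   print(plda_class0.compute_log_likelihood(probes))
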
Score Normalization
-------------------
Score normalization aims to compensate for statistical variations in output scores
due to changes in the conditions across different enrollment and probe samples.
This is achieved by scaling distributions of system output scores to better
facilitate the application of a single, global threshold for authentication.
Z-Norm
======
.. _znorm:
Given a score :math:`s_i`, Z-Norm [Auckenthaler2000]_ and [Mariethoz2005]_
(zero-normalization) scales this value by the mean (:math:`\mu`) and standard
deviation (:math:`\sigma`) of an impostor score distribution. This score
distribution can be computed beforehand and it is defined as the following.
.. math::

   zs_i = \frac{s_i - \mu}{\sigma}
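As a plain NumPy illustration of the formula (a made-up helper, not a library
call):

.. code-block:: python

   import numpy

   def znorm_score(raw_score, impostor_scores):
       """Z-normalize one raw score, given impostor scores of the same model."""
       mu = numpy.mean(impostor_scores)
       sigma = numpy.std(impostor_scores)
       return (raw_score - mu) / sigma

   impostor_scores = numpy.array([0.10, -0.30, 0.20, 0.05, -0.10])
   print(znorm_score(0.9, impostor_scores))
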
T-Norm
======
.. _tnorm:
T-Norm [Auckenthaler2000]_ and [Mariethoz2005]_ (test normalization) operates
in a probe-centric manner. If in the Z-Norm :math:`\mu` and :math:`\sigma` are
estimated using an impostor set of models and its scores, the T-Norm computes
these statistics using the current probe sample against a set of models in a
cohort :math:`\Theta_{c}`. A cohort can be any semantic organization that is
sensible to your recognition task, such as sex (males and females), ethnicity,
age, etc., and is defined as the following.
.. math::

   ts_i = \frac{s_i - \mu_{c}}{\sigma_{c}}
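Again as a plain NumPy illustration (a made-up helper, not a library call); the
statistics now come from scoring the *probe* against a cohort of models:

.. code-block:: python

   import numpy

   def tnorm_score(raw_score, cohort_scores):
       """T-normalize one raw score, given the scores of the same probe
       against a cohort of models."""
       mu = numpy.mean(cohort_scores)
       sigma = numpy.std(cohort_scores)
       return (raw_score - mu) / sigma

   cohort_scores = numpy.array([0.20, 0.00, 0.15, -0.05])
   print(tnorm_score(0.8, cohort_scores))
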
ZT-Norm
=======
.. _ztnorm:
ZT-Norm [Auckenthaler2000]_ and [Mariethoz2005]_ consists of the application of
:ref:`Z-Norm <znorm>` followed by a :ref:`T-Norm <tnorm>` and it is implemented
in :py:func:`bob.learn.em.ztnorm`.
Below is an example of score normalization using
:py:func:`bob.learn.em.ztnorm`.
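A hedged sketch of such an example is shown below; the score matrices are
random placeholders and the argument order and shape conventions of
:py:func:`bob.learn.em.ztnorm` (here assumed to be ``models x probes``
matrices) must be checked against the API documentation:

.. code-block:: python

   import numpy
   import bob.learn.em

   numpy.random.seed(10)

   n_models, n_probes = 4, 6    # enrolled models and evaluation probes
   n_tmodels, n_zprobes = 3, 5  # T-Norm cohort models and Z-Norm impostor probes

   # Raw score matrices (rows: models, columns: probes) -- random placeholders
   probes_vs_models = numpy.random.normal(0, 1, (n_models, n_probes))
   zprobes_vs_models = numpy.random.normal(0, 1, (n_models, n_zprobes))
   probes_vs_tmodels = numpy.random.normal(0, 1, (n_tmodels, n_probes))
   zprobes_vs_tmodels = numpy.random.normal(0, 1, (n_tmodels, n_zprobes))

   zt_scores = bob.learn.em.ztnorm(probes_vs_models, zprobes_vs_models,
                                   probes_vs_tmodels, zprobes_vs_tmodels)
   print(zt_scores.shape)
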
numpy.random.seed(2) # FIXING A SEED
def train_ubm(features, n_gaussians):
    """
    Train UBM

    **Parameters**
      features: 2D numpy array with the features
      n_gaussians: Number of Gaussians
    """
    input_size = features.shape[1]
def isv_train(features, ubm):
    """
    Train the U matrix (session variability subspace)

    **Parameters**
      features: List of :py:class:`bob.learn.em.GMMStats` organized by class
      ubm: Prior UBM (:py:class:`bob.learn.em.GMMMachine`)
    """
    stats = []
bob.learn.em.train(map_trainer, map_machine, data, max_iterations=200,
figure, ax = plt.subplots()
# plt.scatter(data[:, 0], data[:, 1], c="olivedrab", label="new data")
plt.scatter(setosa[:, 0], setosa[:, 1], c="darkcyan", label="setosa")
plt.scatter(versicolor[:, 0], versicolor[:, 1],
c="goldenrod", label="versicolor")