Commit ecfcbff8 (bob / bob.learn.em)
authored 7 years ago by Amir MOHAMMADI

    spell check and nitpick

parent e307cd16
1 merge request: !24 Re-write the user guide

Showing 3 changed files
with 47 additions and 46 deletions:

  doc/guide.rst         40 additions, 39 deletions
  doc/plot/plot_ISV.py   6 additions,  6 deletions
  doc/plot/plot_MAP.py   1 addition,   1 deletion
doc/guide.rst (+40 −39)
@@ -28,7 +28,7 @@ latent variables in the next E step [8]_.
 *Machines* and *trainers* are the core components of Bob's machine learning
 packages. *Machines* represent statistical models or other functions defined by
-parameters that can be learnt by *trainers* or manually set. Below you will
+parameters that can be learned by *trainers* or manually set. Below you will
 find machine/trainer guides for learning techniques available in this package.
@@ -43,14 +43,14 @@ K-Means
 :math:`\mu` is a given mean (also called centroid) and
 :math:`x_i` is an observation.
-This implementation has two stopping criterias. The first one is when the
+This implementation has two stopping criteria. The first one is when the
 maximum number of iterations is reached; the second one is when the difference
 between :math:`Js` of successive iterations are lower than a convergence
 threshold.
 In this implementation, the training consists in the definition of the
 statistical model, called machine, (:py:class:`bob.learn.em.KMeansMachine`) and
-this statistical model is learnt via a trainer
+this statistical model is learned via a trainer
 (:py:class:`bob.learn.em.KMeansTrainer`).
 Follow bellow an snippet on how to train a KMeans using Bob_.
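The snippet that last line refers to sits outside this hunk. A minimal sketch of what K-Means training with this package could look like is given below; only ``KMeansMachine``, ``KMeansTrainer`` and ``bob.learn.em.train`` are confirmed by this diff, the constructor arguments and keyword arguments are assumptions.

.. code-block:: python

   import numpy
   import bob.learn.em

   # Hypothetical data: 50 observations of dimension 3 (not from this commit).
   data = numpy.random.normal(0, 1, (50, 3))

   # Assumed constructor: number of means, feature dimension.
   kmeans_machine = bob.learn.em.KMeansMachine(2, 3)
   kmeans_trainer = bob.learn.em.KMeansTrainer()

   # bob.learn.em.train appears in doc/plot/plot_MAP.py; the keywords below are assumptions.
   bob.learn.em.train(kmeans_trainer, kmeans_machine, data,
                      max_iterations=200, convergence_threshold=1e-5)
   print(kmeans_machine.means)   # the two learned centroids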
@@ -127,7 +127,7 @@ dataset [10]_. This optimization is done by the **Expectation-Maximization**
 A very nice explanation of EM algorithm for the maximum likelihood estimation
 can be found in this
-`Mathematical Monk <https://www.youtube.com/watch?v=AnbiNaVp3eQ>`_ youtube
+`Mathematical Monk <https://www.youtube.com/watch?v=AnbiNaVp3eQ>`_ YouTube
 video.
 Follow bellow an snippet on how to train a GMM using the maximum likelihood
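The maximum-likelihood snippet itself continues outside this hunk. A sketch of how such training could look is shown below; ``GMMMachine`` and ``bob.learn.em.train`` come from this diff, while ``ML_GMMTrainer`` and its keywords are assumptions about this package's API.

.. code-block:: python

   import numpy
   import bob.learn.em

   data = numpy.random.normal(0, 1, (100, 3))       # hypothetical training data

   gmm = bob.learn.em.GMMMachine(2, 3)               # 2 Gaussians of dimension 3
   # Assumed trainer class and keywords for maximum-likelihood (EM) estimation.
   ml_trainer = bob.learn.em.ML_GMMTrainer(update_means=True,
                                           update_variances=True,
                                           update_weights=True)
   bob.learn.em.train(ml_trainer, gmm, data,
                      max_iterations=200, convergence_threshold=1e-5)
   print(gmm.means, gmm.variances, gmm.weights)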
@@ -190,7 +190,7 @@ A compact way to write relevance MAP adaptation is by using GMM supervector
 notation (this will be useful in the next subsections). The GMM supervector
 notation consists of taking the parameters of :math:`\Theta` (weights, means
 and covariance matrices) of a GMM and create a single vector or matrix to
-represent each of them. For each gaussian compoenent :math:`c`, we can
+represent each of them. For each Gaussian component :math:`c`, we can
 represent the MAP adaptation as the following :math:`\mu_i = m + d_i`, where
 :math:`m` is our prior and :math:`d_i` is the class offset.
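As an illustration of the :math:`\mu_i = m + d_i` adaptation described in this hunk, a sketch of relevance MAP with this package could look like the following; the ``MAP_GMMTrainer`` name and its ``relevance_factor`` argument are assumptions, only ``GMMMachine`` and ``bob.learn.em.train`` appear in this diff.

.. code-block:: python

   import numpy
   import bob.learn.em

   # Hypothetical prior (UBM) and class-specific adaptation data.
   prior_gmm = bob.learn.em.GMMMachine(2, 3)
   prior_gmm.means = numpy.vstack((numpy.random.normal(0, 0.5, (1, 3)),
                                   numpy.random.normal(1, 0.5, (1, 3))))
   class_data = numpy.random.normal(1, 0.3, (20, 3))

   adapted_gmm = bob.learn.em.GMMMachine(2, 3)
   # Assumed constructor: the prior model plus a relevance factor ([Reynolds2000]_).
   map_trainer = bob.learn.em.MAP_GMMTrainer(prior_gmm, relevance_factor=4)
   bob.learn.em.train(map_trainer, adapted_gmm, class_data, max_iterations=200)
   print(adapted_gmm.means)   # means offset from the prior by the class offset d_i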
@@ -255,7 +255,7 @@ and 10 in [Reynolds2000]_ (also called, zeroth, first and second order GMM
 statistics).
 Given a GMM (:math:`\Theta`) and a set of samples :math:`x_t` this component
-accumulates statistics for each gaussian component :math:`c`.
+accumulates statistics for each Gaussian component :math:`c`.
 Follow bellow a 1-1 relationship between statistics in [Reynolds2000]_ and the
 properties in :py:class:`bob.learn.em.GMMStats`:
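The property list itself is outside this hunk. A short sketch of how such statistics could be accumulated is shown below; the ``GMMStats`` constructor arguments, the ``acc_statistics`` method and the property names are assumptions, only the class names come from this diff and from ``doc/plot/plot_ISV.py``.

.. code-block:: python

   import numpy
   import bob.learn.em

   gmm = bob.learn.em.GMMMachine(2, 3)            # a hypothetical prior GMM
   samples = numpy.random.normal(0, 1, (10, 3))

   # Assumed constructor: number of Gaussians, feature dimension.
   stats = bob.learn.em.GMMStats(2, 3)
   # Assumed method accumulating zeroth/first/second order statistics per component.
   gmm.acc_statistics(samples, stats)
   print(stats.t, stats.n, stats.sum_px)          # assumed property names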
@@ -285,7 +285,7 @@ prior GMM.
 ...      [-0.3, -0.1, 0],
 ...      [1.2, 1.4, 1],
 ...      [0.8, 1., 1]], dtype='float64')
->>> # Creating a fake prior with 2 gaussians of dimension 3
+>>> # Creating a fake prior with 2 Gaussians of dimension 3
 >>> prior_gmm = bob.learn.em.GMMMachine(2, 3)
 >>> prior_gmm.means = numpy.vstack((numpy.random.normal(0, 0.5, (1, 3)),
 ...                                 numpy.random.normal(1, 0.5, (1, 3))))
@@ -306,10 +306,10 @@ Inter-Session Variability
 =========================
 .. _isv:
-Inter-Session Variability (ISV) modelling [3]_ [2]_ is a session variability
-modelling technique built on top of the Gaussian mixture modelling approach. It
+Inter-Session Variability (ISV) modeling [3]_ [2]_ is a session variability
+modeling technique built on top of the Gaussian mixture modeling approach. It
 hypothesizes that within-class variations are embedded in a linear subspace in
-the GMM means subspace and these variations can be supressed by an offset w.r.t
+the GMM means subspace and these variations can be suppressed by an offset w.r.t
 each mean during the MAP adaptation.
 In this generative model each sample is assumed to have been generated by a GMM
@@ -319,10 +319,10 @@ mean supervector with the following shape:
 :math:`D_z{i}` is the class offset (with all session effects suppressed).
 All possible sources of session variations is embedded in this matrix
-:math:`U`. Follow bellow an intuition of what is modelled with :math:`U` in the
+:math:`U`. Follow bellow an intuition of what is modeled with :math:`U` in the
 Iris flower `dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_.
 The arrows :math:`U_{1}`, :math:`U_{2}` and :math:`U_{3}` are the directions of
-the within class variations, with respect to each gaussian component, that will
+the within class variations, with respect to each Gaussian component, that will
 be suppressed a posteriori.
 .. plot:: plot/plot_ISV.py
@@ -332,7 +332,7 @@ be suppressed a posteriori.
 The ISV statistical model is stored in this container
 :py:class:`bob.learn.em.ISVBase` and the training is performed by
 :py:class:`bob.learn.em.ISVTrainer`. The snippet bellow shows how to train a
-Intersession variability modelling.
+Intersession variability modeling.
 .. doctest::
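The body of that doctest is outside this hunk. A compressed sketch of such training is given below; it mirrors the logic of ``doc/plot/plot_ISV.py`` from this commit, but the ``ISVBase``/``ISVTrainer`` constructor arguments, ``acc_statistics`` and ``bob.learn.em.train_jfa`` are assumptions.

.. code-block:: python

   import numpy
   import bob.learn.em

   ubm = bob.learn.em.GMMMachine(2, 3)            # hypothetical prior / UBM
   data = [numpy.random.normal(0.2, 0.2, (10, 3)),
           numpy.random.normal(-0.2, 0.2, (10, 3))]

   # One GMMStats per sample, grouped by class (acc_statistics is an assumed method).
   stats = []
   for class_data in data:
       class_stats = []
       for sample in class_data:
           s = bob.learn.em.GMMStats(2, 3)
           ubm.acc_statistics(sample, s)
           class_stats.append(s)
       stats.append(class_stats)

   isv_base = bob.learn.em.ISVBase(ubm, 2)        # assumed: UBM + rank of U
   isv_trainer = bob.learn.em.ISVTrainer(4)       # assumed: relevance factor
   bob.learn.em.train_jfa(isv_trainer, isv_base, stats, max_iterations=50)
   print(isv_base.u)                              # assumed: the learned U matrix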
@@ -387,15 +387,15 @@ Joint Factor Analysis
 Joint Factor Analysis (JFA) [1]_ [2]_ is an extension of ISV. Besides the
 within-class assumption (modeled with :math:`U`), it also hypothesize that
-between class variations are embedded in a low rank retangular matrix
-:math:`V`. In the supervector notation, this modelling has the following shape:
+between class variations are embedded in a low rank rectangular matrix
+:math:`V`. In the supervector notation, this modeling has the following shape:
 :math:`\mu_{i, j} = m + Ux_{i, j} + Vy_{i} + D_z{i}`.
-Follow bellow an intuition of what is modelled with :math:`U` and :math:`V` in
+Follow bellow an intuition of what is modeled with :math:`U` and :math:`V` in
 the Iris flower
 `dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_. The arrows
 :math:`V_{1}`, :math:`V_{2}` and :math:`V_{3}` are the directions of the
-between class variations with respect to each gaussian component that will be
+between class variations with respect to each Gaussian component that will be
 added a posteriori.
@@ -405,7 +405,7 @@ added a posteriori.
 The JFA statistical model is stored in this container
 :py:class:`bob.learn.em.JFABase` and the training is performed by
 :py:class:`bob.learn.em.JFATrainer`. The snippet bellow shows how to train a
-Intersession variability modelling.
+Intersession variability modeling.
 .. doctest::
    :options: +NORMALIZE_WHITESPACE
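The doctest body is outside this hunk. A corresponding sketch for JFA could look like the one below; ``JFABase`` and ``JFATrainer`` are named in this diff, while the constructor arguments (ranks of :math:`U` and :math:`V`), ``acc_statistics`` and ``bob.learn.em.train_jfa`` are assumptions.

.. code-block:: python

   import numpy
   import bob.learn.em

   ubm = bob.learn.em.GMMMachine(2, 3)            # hypothetical prior / UBM
   data = [numpy.random.normal(0.2, 0.2, (10, 3)),    # class 1
           numpy.random.normal(-0.2, 0.2, (10, 3))]   # class 2

   # One GMMStats per sample, grouped by class (assumed accumulation API).
   stats = []
   for class_data in data:
       class_stats = []
       for sample in class_data:
           s = bob.learn.em.GMMStats(2, 3)
           ubm.acc_statistics(sample, s)
           class_stats.append(s)
       stats.append(class_stats)

   jfa_base = bob.learn.em.JFABase(ubm, 2, 2)     # assumed: UBM, rank of U, rank of V
   jfa_trainer = bob.learn.em.JFATrainer()
   bob.learn.em.train_jfa(jfa_trainer, jfa_base, stats, max_iterations=10)
   print(jfa_base.u, jfa_base.v)                  # assumed attribute names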
@@ -419,7 +419,7 @@ Intersession variability modelling.
 >>> data_class2 = numpy.random.normal(-0.2, 0.2, (10, 3))
 >>> data = [data_class1, data_class2]
->>> # Creating a fake prior with 2 gaussians of dimension 3
+>>> # Creating a fake prior with 2 Gaussians of dimension 3
 >>> prior_gmm = bob.learn.em.GMMMachine(2, 3)
 >>> prior_gmm.means = numpy.vstack((numpy.random.normal(0, 0.5, (1, 3)),
 ...                                 numpy.random.normal(1, 0.5, (1, 3))))
@@ -460,12 +460,12 @@ Total variability Modelling
 ===========================
 .. _ivector:
-Total Variability (TV) modelling [4]_ is a front-end initially introduced for
+Total Variability (TV) modeling [4]_ is a front-end initially introduced for
 speaker recognition, which aims at describing samples by vectors of low
 dimensionality called ``i-vectors``. The model consists of a subspace :math:`T`
 and a residual diagonal covariance matrix :math:`\Sigma`, that are then used to
 extract i-vectors, and is built upon the GMM approach. In the supervector
-notation this modelling has the following shape: :math:`\mu = m + Tv`.
+notation this modeling has the following shape: :math:`\mu = m + Tv`.
 Follow bellow an intuition of the data from the Iris flower
 `dataset <https://en.wikipedia.org/wiki/Iris_flower_data_set>`_, embedded in
@@ -478,7 +478,7 @@ the iVector space.
 The iVector statistical model is stored in this container
 :py:class:`bob.learn.em.IVectorMachine` and the training is performed by
 :py:class:`bob.learn.em.IVectorTrainer`. The snippet bellow shows how to train
-a Total variability modelling.
+a Total variability modeling.
 .. doctest::
    :options: +NORMALIZE_WHITESPACE
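The doctest body is outside this hunk. A sketch of total-variability training might look as follows; ``IVectorMachine`` and ``IVectorTrainer`` are named in this diff, while the constructor arguments, the flat list of ``GMMStats`` and the ``project`` call are assumptions.

.. code-block:: python

   import numpy
   import bob.learn.em

   ubm = bob.learn.em.GMMMachine(2, 3)            # hypothetical prior / UBM
   samples = numpy.random.normal(0, 1, (20, 3))

   # For TV training the statistics are usually kept as a flat list (assumption).
   stats = []
   for sample in samples:
       s = bob.learn.em.GMMStats(2, 3)
       ubm.acc_statistics(sample, s)              # assumed method
       stats.append(s)

   ivector_machine = bob.learn.em.IVectorMachine(ubm, 2)   # assumed: UBM + rank of T
   ivector_trainer = bob.learn.em.IVectorTrainer(update_sigma=True)  # assumed keyword
   bob.learn.em.train(ivector_trainer, ivector_machine, stats, max_iterations=100)

   ivector = ivector_machine.project(stats[0])    # assumed extraction call
   print(ivector)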
@@ -593,7 +593,7 @@ diagonal covariance matrix :math:`\Sigma`, the model assumes that a sample
 x_{i,j} = \mu + F h_{i} + G w_{i,j} + \epsilon_{i,j}
-An Expectaction-Maximization algorithm can be used to learn the parameters of
+An Expectation-Maximization algorithm can be used to learn the parameters of
 this model :math:`\mu`, :math:`F` :math:`G` and :math:`\Sigma`. As these
 parameters can be shared between classes, there is a specific container class
 for this purpose, which is :py:class:`bob.learn.em.PLDABase`. The process is
@@ -645,7 +645,7 @@ obtained by calling the
 >>> loglike = plda.compute_log_likelihood(samples)
 If separate models for different classes need to be enrolled, each of them with
-a set of enrolment samples, then, several instances of
+a set of enrollment samples, then, several instances of
 :py:class:`bob.learn.em.PLDAMachine` need to be created and enrolled using
 the :py:meth:`bob.learn.em.PLDATrainer.enroll()` method as follows.
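The enrollment example itself is outside this hunk. A sketch covering PLDA training, per-class enrollment and scoring is given below; ``PLDABase``, ``PLDAMachine``, ``PLDATrainer.enroll()`` and ``compute_log_likelihood`` appear in this diff, while the constructor arguments (feature dimension and the ranks of :math:`F` and :math:`G`) and the data layout (one array per class) are assumptions.

.. code-block:: python

   import numpy
   import bob.learn.em

   # Train a shared PLDABase via EM (assumed constructor: dim, rank of F, rank of G).
   train_data = [numpy.random.normal(0.5, 0.3, (10, 3)),
                 numpy.random.normal(-0.5, 0.3, (10, 3))]
   plda_base = bob.learn.em.PLDABase(3, 2, 2)
   plda_trainer = bob.learn.em.PLDATrainer()
   bob.learn.em.train(plda_trainer, plda_base, train_data, max_iterations=10)

   # One PLDAMachine per class, enrolled from that class' samples (assumed signature).
   enroll_samples = numpy.random.normal(0.5, 0.3, (5, 3))
   plda_machine = bob.learn.em.PLDAMachine(plda_base)
   plda_trainer.enroll(plda_machine, enroll_samples)

   probe = numpy.random.normal(0.5, 0.3, (1, 3))
   print(plda_machine.compute_log_likelihood(probe))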
@@ -697,8 +697,8 @@ computed, which is defined in more formal way by:
 Score Normalization
 -------------------
-Score normalisation aims to compensate statistical variations in output scores
-due to changes in the conditions across different enrolment and probe samples.
+Score normalization aims to compensate statistical variations in output scores
+due to changes in the conditions across different enrollment and probe samples.
 This is achieved by scaling distributions of system output scores to better
 facilitate the application of a single, global threshold for authentication.
@@ -709,10 +709,10 @@ Z-Norm
 ======
 .. _znorm:
 Given a score :math:`s_i`, Z-Norm [Auckenthaler2000]_ and [Mariethoz2005]_
-(zero-normalisation) scales this value by the
-mean (:math:`\mu`) and standard deviation (:math:`\sigma`) of an impostor score
-distribution. This score distribution can be computed before hand and it is
-defined as the following.
+(zero-normalization) scales this value by the mean (:math:`\mu`) and standard
+deviation (:math:`\sigma`) of an impostor score distribution. This score
+distribution can be computed before hand and it is
+defined as the following.
 .. math::
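The formula body is outside this hunk. Written out with plain NumPy, the zero-normalization described above amounts to the following (illustrative only; this is not the package's own helper and the numbers are hypothetical).

.. code-block:: python

   import numpy

   raw_score = 2.5                                     # hypothetical raw score s_i
   impostor_scores = numpy.random.normal(0, 1, 1000)   # impostor probes vs. the model

   mu = impostor_scores.mean()
   sigma = impostor_scores.std()
   z_normalized = (raw_score - mu) / sigma             # s_i scaled by the impostor statistics
   print(z_normalized)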
@@ -734,13 +734,13 @@ T-Norm
 ======
 .. _tnorm:
-T-norm [Auckenthaler2000]_ and [Mariethoz2005]_ (Test-normalization) operates
-in a probe-centric manner. If in the Z-Norm :math:`\mu` and :math:`\sigma` are
-estimated using an impostor set of models and its scores, the t-norm computes
-these statistics using the current probe sample against at set of models in a
-co-hort :math:`\Theta_{c}`. A co-hort can be any semantic organization that is
-sensible to your recognition task, such as sex (male and females), ethnicity,
-age, etc and is defined as the following.
+T-norm [Auckenthaler2000]_ and [Mariethoz2005]_ (Test-normalization) operates
+in a probe-centric manner. If in the Z-Norm :math:`\mu` and :math:`\sigma` are
+estimated using an impostor set of models and its scores, the t-norm computes
+these statistics using the current probe sample against at set of models in a
+co-hort :math:`\Theta_{c}`. A co-hort can be any semantic organization that is
+sensible to your recognition task, such as sex (male and females), ethnicity,
+age, etc and is defined as the following.
 .. math::
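The formula body is outside this hunk. In the same spirit as the Z-Norm sketch, the test-normalization statistics come from scoring the current probe against a cohort of models (plain NumPy, illustrative only, hypothetical numbers).

.. code-block:: python

   import numpy

   raw_score = 2.5                                  # probe vs. the claimed model
   cohort_scores = numpy.random.normal(0, 1, 50)    # same probe vs. a cohort of models

   mu_t = cohort_scores.mean()
   sigma_t = cohort_scores.std()
   t_normalized = (raw_score - mu_t) / sigma_t      # probe-centric normalization
   print(t_normalized)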
@@ -771,8 +771,9 @@ ZT-Norm
 =======
 .. _ztnorm:
-ZT-Norm [Auckenthaler2000]_ and [Mariethoz2005]_ consists in the application of :ref:`Z-Norm <znorm>` followed by a
-:ref:`T-Norm <tnorm>` and it is implemented in :py:func:`bob.learn.em.ztnorm`.
+ZT-Norm [Auckenthaler2000]_ and [Mariethoz2005]_ consists in the application of
+:ref:`Z-Norm <znorm>` followed by a :ref:`T-Norm <tnorm>` and it is implemented
+in :py:func:`bob.learn.em.ztnorm`.
 Follow bellow an example of score normalization using
 :py:func:`bob.learn.em.ztnorm`.
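That example is outside this hunk. :py:func:`bob.learn.em.ztnorm` is named in this diff, but its argument list is not shown here, so the sketch below only illustrates the composition (Z-Norm followed by T-Norm) with plain NumPy, under a simplified version of the usual ZT-Norm recipe.

.. code-block:: python

   import numpy

   raw_score = 2.5
   impostor_scores = numpy.random.normal(0, 1, 1000)  # for the Z-Norm statistics
   cohort_scores = numpy.random.normal(0, 1, 50)      # probe vs. cohort, for the T-Norm statistics

   # Z-Norm of the raw score.
   z = (raw_score - impostor_scores.mean()) / impostor_scores.std()
   # Z-normalize the cohort scores with the same impostor statistics,
   # then apply the T-Norm step (simplified; real ZT-Norm uses z-probes vs. t-models).
   cohort_z = (cohort_scores - impostor_scores.mean()) / impostor_scores.std()
   zt = (z - cohort_z.mean()) / cohort_z.std()
   print(zt)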
doc/plot/plot_ISV.py (+6 −6)
@@ -9,12 +9,12 @@ numpy.random.seed(2)  # FIXING A SEED

def train_ubm(features, n_gaussians):
    """
    Train UBM

    **Parameters**
      features: 2D numpy array with the features
      n_gaussians: Number of Gaussians
    """
    input_size = features.shape[1]
@@ -48,12 +48,12 @@ def train_ubm(features, n_gaussians):

def isv_train(features, ubm):
    """
    Train U matrix

    **Parameters**
      features: List of :py:class:`bob.learn.em.GMMStats` organized by class
      n_gaussians: UBM (:py:class:`bob.learn.em.GMMMachine`)
    """
    stats = []
doc/plot/plot_MAP.py (+1 −1)
@@ -26,7 +26,7 @@ bob.learn.em.train(map_trainer, map_machine, data, max_iterations=200,
 figure, ax = plt.subplots()
-#plt.scatter(data[:, 0], data[:, 1], c="olivedrab", label="new data")
+# plt.scatter(data[:, 0], data[:, 1], c="olivedrab", label="new data")
 plt.scatter(setosa[:, 0], setosa[:, 1], c="darkcyan", label="setosa")
 plt.scatter(versicolor[:, 0], versicolor[:, 1], c="goldenrod", label="versicolor")