The simplest way to download the latest stable version of the package is to use the Download button above and extract the archive into a directory of your choice.
If you want, you can also check out the latest development branch of this package using::
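
   git clone https://github.com/bioidiap/bob.learn.boosting.git
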
Afterwards, please open a terminal in this directory and call::
...
...
Of course, you can try out different combinations of digits for experiments 1 and 2.
Getting Help
------------
In case you experience problems with the code, or with downloading the required databases and/or software, please contact manuel.guenther@idiap.ch or file a bug report under https://github.com/bioidiap/bob.learn.boosting.
Keyword parameters

weak_trainer : :py:class:`bob.learn.boosting.LUTTrainer` or :py:class:`bob.learn.boosting.StumpTrainer`
  The class to train weak machines.

loss_function : a class derived from :py:class:`bob.learn.boosting.LossFunction`
  The function to define the weights for the weak machines.
"""
...
...
class Boosting:
Keyword parameters:

training_features : uint16 <#samples, #features> or float <#samples, #features>
  Features extracted from the training samples.

training_targets : float <#samples, #outputs>
  The values that the boosted classifier should reach for the given samples.

number_of_rounds : int
  The number of rounds of boosting, i.e., the number of weak classifiers to select.

boosted_machine : :py:class:`bob.learn.boosting.BoostedMachine` or None
  The machine to add the weak machines to. If not given, a new machine is created.

Returns

:py:class:`bob.learn.boosting.BoostedMachine`
  The boosted machine that is a combination of the weak classifiers.
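Under this interface, a training call might be sketched as follows (a sketch only; the random features, the bipolar targets and the choice of ten rounds are purely illustrative)::

   import numpy
   import bob.learn.boosting

   # illustrative data: 100 samples with 20 features each, and
   # bipolar targets in {+1, -1} with shape <#samples, #outputs>
   features = numpy.random.rand(100, 20)
   targets = numpy.where(features[:, 0] > 0.5, 1., -1.).reshape(100, 1)

   # stump classifiers with the exponential loss (uni-variate case)
   trainer = bob.learn.boosting.Boosting(
       bob.learn.boosting.StumpTrainer(),
       bob.learn.boosting.ExponentialLoss())
   machine = trainer.train(features, targets, 10)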
"""The test script to perform the binary classification on the digits from the MNIST dataset.
The MNIST data is exported using the xbob.db.mnist module which provide the train and test
partitions for the digits. Pixel values of grey scale images are used as features and the
available algorithms for classification are Lut based Boosting and Stump based Boosting.
The MNIST data is exported using a module similar to the xbob.db.mnist module which provide the train and test partitions for the digits.
Pixel values of grey scale images are used as features and the available algorithms for classification are Lut based Boosting and Stump based Boosting.
Thus it conducts only one binary classifcation test.
parser.add_argument('-r','--number-of-boosting-rounds',type=int,default=100,help="The number of boosting rounds, i.e., the number of weak classifiers.")
parser.add_argument('-s','--feature-selection-style',default='independent',choices=('independent','shared'),help="The feature selection style (only for multivariate classification with the LUT trainer).")
parser.add_argument('-d','--digits',type=int,nargs="+",choices=range(10),default=[5,6],help="Select the digits you want to compare.")
parser.add_argument('-a','--all-digits',action='store_true',help="Use all digits")
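With these options, an exemplary call of the script could look as follows (the script name is only a placeholder for illustration)::

   ./bin/boosting_example.py --digits 5 6 --number-of-boosting-rounds 50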
As an example for the classification task, we perform a classification of hand-written digits using the `MNIST <http://yann.lecun.com/exdb/mnist>`_ database.
There, images of single hand-written digits are stored, and a training and test set is provided, which we can access with our `xbob.db.mnist <http://pypi.python.org/pypi/xbob.db.mnist>`_ database interface.
.. note::

   In fact, to minimize the dependencies to other packages, the ``xbob.db.mnist`` database interface is replaced by a local interface.
In our experiments, we simply use the pixel gray values as features.
Since the gray values are discrete in range :math:`[0, 255]`, we can employ both the stump decision classifiers and the look-up-table classifiers.
Nevertheless, other discrete features, like Local Binary Patterns (LBP), could be used as well.
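As a rough sketch, assuming the digit images are given as a ``numpy`` array ``images`` of shape <#samples, 28, 28> with integral gray values in :math:`[0, 255]`, the feature extraction boils down to flattening each image into one row of the feature matrix::

   import numpy
   # each row is one sample; the discrete grey values serve as features
   features = images.reshape(len(images), -1).astype(numpy.uint16)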
...
...
One exemplary test case in detail
---------------------------------
Taking a closer look at the example script, there are several steps that are performed.
The first step is generating the training examples from the MNIST database interface.
Here, we describe the more complex way, i.e., the multi-variate case.
.. doctest::

   >>> # open the database interface (will download the digits from the webpage)
   >>> db = xbob.db.mnist.Database()
   Downloading the mnist database from http://yann.lecun.com/exdb/mnist/ ...
Threshold :math:`\theta`, polarity :math:`\phi` and index :math:`m` are parameters of the classifier, which are trained using the :py:class:`bob.learn.boosting.StumpTrainer`.
For a given training set :math:`\{\vec x_p \mid p=1,\dots,P\}` and according target values :math:`\{t_p \mid p=1,\dots,P\}`, the threshold :math:`\theta_m` is computed for each input index :math:`m`, such that the lowest classification error is obtained, and the :math:`m` with the lowest training classification error is taken.
The polarity :math:`\phi` is set to :math:`-1`, if values lower than the threshold should be considered as positive examples, or to :math:`+1` otherwise.
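Putting these parameters together, the decision of a single stump classifier can be sketched in a few lines (a sketch only, assuming the decision rule :math:`W(\vec x) = \phi \cdot \mathrm{sgn}(x_m - \theta)`)::

   import numpy
   def stump_decision(x, theta, phi, m):
       # +1 or -1, depending on which side of the threshold x[m] falls
       return phi * numpy.sign(x[m] - theta)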
To compute the classification error for a given :math:`\theta_m`, the gradient of a loss function is taken into consideration.
For the stump trainer, usually the :py:class:`bob.learn.boosting.ExponentialLoss` is considered as the loss function.
Look-Up-Table classifier
........................
The second classifier, which can handle univariate and multivariate classification and regression tasks, is the :py:class:`bob.learn.boosting.LUTMachine`.
This classifier is designed to handle input vectors with **discrete** values only.
Again, the decision of the weak classifier is based on a single element of the input vector :math:`\vec x`.
...
...
In the univariate case, for each of the possible discrete values of :math:`x_m`, a distinct decision value is stored in the look-up-table:
.. math::

   W(\vec x) = LUT[x_m]
This look-up-table LUT and the feature index :math:`m` are trained by the :py:class:`bob.learn.boosting.LUTTrainer`.
In the multivariate case, each output :math:`W^o` is handled independently, i.e., a separate look-up-table :math:`LUT^o` and a separate feature index :math:`m^o` are assigned for each output dimension :math:`o`:
...
...
.. math::

   W^o(\vec x) = LUT^o[x_{m^o}]
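In code, these look-ups can be sketched as follows (a sketch only; ``lut`` is assumed to hold one row per output :math:`o` with one entry per possible discrete feature value, and ``indices`` the selected feature index per output)::

   import numpy
   def lut_decision(x, lut, indices):
       # one independent table look-up per output dimension
       return numpy.array([lut[o, x[m]] for o, m in enumerate(indices)])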
.. note::

   As a variant, the feature index :math:`m^o` can be selected to be ``shared`` for all outputs, see :py:class:`bob.learn.boosting.LUTTrainer` for details.
A weak look-up-table classifier is learned using the :py:class:`bob.learn.boosting.LUTTrainer`.
Strong classifier
-----------------
The strong classifier, which is of type :py:class:`bob.learn.boosting.BoostedMachine`, is a weighted combination of weak classifiers, which are usually of the same type.
It can be trained with the :py:class:`bob.learn.boosting.Boosting` trainer, which takes a list of training samples, and a list of univariate or multivariate target vectors.
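Conceptually, the strong decision is the weighted sum of the weak decisions, which can be sketched as follows (a sketch only, assuming weights :math:`\alpha_r` and callable weak machines :math:`W_r`)::

   def strong_decision(x, alphas, weak_machines):
       # weighted sum over all weak classifiers selected so far
       return sum(alpha * weak(x) for alpha, weak in zip(alphas, weak_machines))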
In several rounds, the trainer computes (here, only the univariate case is considered; the multivariate case is similar, simply replace scores by score vectors):
1. The classification results (the so-called *scores*) for the current strong classifier:
...
...
Loss functions
--------------
As shown above, the loss functions define how well the currently predicted scores :math:`s_p` fit the target values :math:`t_p`.
Depending on the desired task, and on the type of classifier, different loss functions might be used:
1. The :py:class:`bob.learn.boosting.ExponentialLoss` can be used for the binary classification task, i.e., when target values are in :math:`\{+1, -1\}`
2. The :py:class:`bob.learn.boosting.LogitLoss` can be used for the multi-variate classification task, i.e., when target vectors have entries from :math:`\{+1, 0\}`
3. The :py:class:`bob.learn.boosting.JesorskyLoss` can be used for the particular multi-variate regression task of learning the locations of facial features.
Other loss functions, e.g., using the Euclidean distance for regression, should be easily implementable.
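For instance, a squared-error loss might be sketched as follows (a sketch only; the method names ``loss`` and ``loss_gradient`` mirror the pattern of the existing loss classes, but the exact interface should be verified against :py:class:`bob.learn.boosting.LossFunction`)::

   import numpy
   import bob.learn.boosting

   class EuclideanLoss(bob.learn.boosting.LossFunction):
       # squared Euclidean distance between targets and scores, per sample
       def loss(self, targets, scores):
           return numpy.sum((targets - scores)**2, axis=1)

       # gradient of the loss with respect to the scores
       def loss_gradient(self, targets, scores):
           return -2. * (targets - scores)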
This section includes information for using the Python API of ``bob.learn.boosting``.
Machines
........
The :py:mod:`bob.learn.boosting.machine` sub-module contains classifiers that can predict classes for given input values.
The strong classifier is the :py:class:`bob.learn.boosting.BoostedMachine`, which is a weighted combination of :py:class:`bob.learn.boosting.WeakMachine`.
Weak machines might be a :py:class:`bob.learn.boosting.LUTMachine` or a :py:class:`bob.learn.boosting.StumpMachine`.
Theoretically, the strong classifier can consist of different types of weak classifiers, but usually all weak classifiers have the same type.
.. automodule:: bob.learn.boosting.machine
Trainers
........
The :py:mod:`bob.learn.boosting.trainer` sub-module contains trainers that train:
* :py:class:`bob.learn.boosting.Boosting` : a strong machine of type :py:class:`bob.learn.boosting.BoostedMachine`
* :py:class:`bob.learn.boosting.LUTTrainer` : a weak machine of type :py:class:`bob.learn.boosting.LUTMachine`
* :py:class:`bob.learn.boosting.StumpTrainer` : a weak machine of type :py:class:`bob.learn.boosting.StumpMachine`
.. automodule:: bob.learn.boosting.trainer
Loss functions
..............
Loss functions are used to define new weights for the weak machines using the ``scipy.optimize.fmin_l_bfgs_b`` function.
The loss function base class :py:class:`bob.learn.boosting.LossFunction` is called by that function; derived classes implement the actual loss for a single sample.
.. note::

   Loss functions are designed to be used in combination with a specific weak trainer in specific cases.
   Not all combinations of loss functions and weak trainers make sense.
   Here is a list of useful combinations:

   1. :py:class:`bob.learn.boosting.ExponentialLoss` with :py:class:`bob.learn.boosting.StumpTrainer` (uni-variate classification only)
   2. :py:class:`bob.learn.boosting.LogitLoss` with :py:class:`bob.learn.boosting.StumpTrainer` or :py:class:`bob.learn.boosting.LUTTrainer` (uni-variate or multi-variate classification)
   3. :py:class:`bob.learn.boosting.TangentialLoss` with :py:class:`bob.learn.boosting.StumpTrainer` or :py:class:`bob.learn.boosting.LUTTrainer` (uni-variate or multi-variate classification)
   4. :py:class:`bob.learn.boosting.JesorskyLoss` with :py:class:`bob.learn.boosting.LUTTrainer` (multi-variate regression only)
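As a sketch, the second of these combinations might be instantiated like this (assuming that the classes can be constructed without arguments)::

   import bob.learn.boosting

   # uni-variate classification with the logit loss and stump classifiers
   trainer = bob.learn.boosting.Boosting(
       bob.learn.boosting.StumpTrainer(),
       bob.learn.boosting.LogitLoss())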