The package implements a generalized boosting framework, which incorporates different boosting approaches.
The Boosting algorithms implemented in this package are:
1) Gradient Boost [Fri00]_ (generalized version of Adaboost [FS99]_) for univariate cases using stump decision classifiers, as in [VJ04]_.
2) TaylorBoost [SMV11]_ for univariate and multivariate cases using Look-Up-Table based classifiers [Ata12]_
The weak classifiers associated with these boosting algorithms are:
1) Stump classifiers
2) LUT based classifiers

.. [Fri00] *Jerome H. Friedman*. **Greedy function approximation: a gradient boosting machine**. Annals of Statistics, 29:1189--1232, 2000.
.. [FS99] *Yoav Freund and Robert E. Schapire*. **A short introduction to boosting**. Journal of Japanese Society for Artificial Intelligence, 14(5):771--780, September, 1999.
.. [VJ04] *Paul Viola and Michael J. Jones*. **Robust real-time face detection**. International Journal of Computer Vision (IJCV), 57(2): 137--154, 2004.
.. [SMV11] *Mohammad J. Saberian, Hamed Masnadi-Shirazi, Nuno Vasconcelos*. **TaylorBoost: First and second-order boosting algorithms with explicit margin control**. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2929--2934, 2011.
.. [Ata12] *Cosmin Atanasoaei*. **Multivariate boosting with look-up tables for face processing**. PhD Thesis, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, 2012.
Installation:
-------------
Bob
...
The boosting framework depends on the open source signal-processing and machine learning toolbox Bob_, which you need to download from its web page.
For more information, please read Bob's `installation instructions <https://github.com/idiap/bob/wiki/Packages>`_.
This package
............
The simplest way to download the latest stable version of the package is to use the Download button above and extract the archive into a directory of your choice.
If you want, you can also check out the latest development branch of this package using::
Afterwards, please open a terminal in this directory and call::
  $ python bootstrap.py
  $ ./bin/buildout
These two commands should download and install all dependencies and give you a fully operational test and development environment.
Example
-------
To demonstrate the boosting algorithms, the example performs binary and multi-variate classification of hand-written digits from the MNIST database.
For simplicity, we just use the pixel gray values as (discrete) features to classify the digits.
In each boosting round, a single pixel location is selected.
In case of the stump classifier, this pixel value is compared to a threshold (which is determined during training), and one of the two classes is assigned.
In case of the LUT, for each value of the pixel the most probable digit is determined.
Finally, the strong classifier combines several weak classifiers by a weighted sum of their predictions.
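
The following is a minimal sketch of the two weak classifier types and their weighted combination, assuming binary labels in {-1, +1} and 8-bit pixel values. All function names, weights and thresholds are hypothetical and chosen for illustration only; they are not the API of this package.

```python
import numpy as np

def stump_predict(pixel_values, threshold, polarity=1):
    """Decision stump: compare one pixel value to a threshold and return -1 or +1."""
    return np.where(pixel_values >= threshold, polarity, -polarity)

def lut_predict(pixel_values, lut):
    """LUT classifier: look up the stored prediction for each discrete pixel value."""
    return lut[pixel_values]

def strong_predict(features, weak_classifiers):
    """Weighted sum of weak predictions.

    weak_classifiers is a list of (weight, pixel_index, predict_function) tuples.
    """
    scores = np.zeros(features.shape[0])
    for weight, index, predict in weak_classifiers:
        scores += weight * predict(features[:, index])
    return scores

# Toy data (hypothetical): 3 "images" with 4 pixels each.
features = np.array([[0, 200, 10, 255],
                     [0, 20, 10, 0],
                     [0, 130, 10, 10]], dtype=np.uint8)

# A hand-made look-up table: pixel values below 128 vote -1, the others +1.
lut = -np.ones(256)
lut[128:] = 1

weak_classifiers = [
    (0.75, 1, lambda v: stump_predict(v, threshold=128)),  # stump on pixel 1
    (0.25, 3, lambda v: lut_predict(v, lut)),              # LUT on pixel 3
]

scores = strong_predict(features, weak_classifiers)
labels = np.sign(scores)
print(scores, labels)
```

In a trained model, the pixel index, the threshold or table entries, and the weight of each weak classifier are determined during the boosting rounds; here they are fixed by hand.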
The script ``./bin/boosting_example.py`` is provided to perform all different examples.
This script has several command line parameters, which vary the behavior of the training and/or testing procedure.
All parameters have a long form (starting with ``--``) and a shortcut (starting with a single ``-``).
These parameters are (see also ``./bin/boosting_example.py --help``):
To control the type of training, you can select:
...
...
* ``--trainer-type``: Select the type of weak classifier. Possible values are ``stump`` and ``lut``
* ``--loss-type``: Select the loss function. Possible values are ``tan``, ``log`` and ``exp``. By default, a loss function suitable to the trainer type is selected.
* ``--number-of-boosting-rounds``: The number of weak classifiers to select.
* ``--multi-variate`` (only valid for LUT trainer): Perform multi-variate classification, or binary (one-to-one) classification.
* ``--feature-selection-style`` (only valid for multi-variate training): Select the feature for each output independently (``independent``) or share it between all outputs (``shared``).
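
To illustrate what the ``--loss-type`` option chooses between, here is a small sketch of the textbook exponential and logit losses for targets in {-1, +1}. It is an assumption that the ``exp`` and ``log`` options correspond to these standard definitions; the package's own implementation may differ in scaling or details, and the ``tan`` loss is not shown.

```python
import numpy as np

def exponential_loss(targets, scores):
    """Textbook exponential loss exp(-y * f), as used by AdaBoost."""
    return np.exp(-targets * scores)

def logit_loss(targets, scores):
    """Textbook logit loss log(1 + exp(-y * f))."""
    return np.log1p(np.exp(-targets * scores))

# Hypothetical targets and strong classifier scores.
targets = np.array([1.0, -1.0, 1.0])
scores = np.array([2.0, -0.5, -1.0])   # margins y * f: 2.0, 0.5, -1.0

exp_values = exponential_loss(targets, scores)
log_values = logit_loss(targets, scores)
print(exp_values, log_values)
```

Both losses decrease as the margin ``y * f`` grows, but the exponential loss penalizes misclassified samples (negative margin) much more strongly than the logit loss.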
To control the experimentation, you can choose:
...
...
For information and debugging purposes, it might be interesting to use:
* ``--verbose`` (can be used several times): Increases the verbosity level from 0 (error) over 1 (warning) and 2 (info) to 3 (debug). Verbosity level 2 (``-vv``) is recommended.
* ``--number-of-elements``: Reduce the number of elements per class (digit) to the given value.
Four different kinds of experiments can be performed:
1. Uni-variate classification using the stump trainer:
If you want to see all the options associated with the command, type:
$ ./bin/mnist_binary_one.py -h
.. note::
   During the execution of the experiments, the warning message "L-BFGS returned warning '2': ABNORMAL_TERMINATION_IN_LNSRCH" might appear.
   This warning message is normal and does not influence the results much.
To run the tests for all combinations of the ten digits, use the following command:
.. note::
   For experiment 1, the training terminates after 75 of 100 rounds since the computed weight for the weak classifier of that round is vanishing.
   Hence, performing more boosting rounds will not change the strong classifier any more.
$ ./bin/mnist_binary_all.py
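
As a rough sketch of how such an all-pairs evaluation can be organized: ten digits give 45 unordered pairs, and the per-pair accuracies are averaged at the end. The ``evaluate_pair`` callable is a hypothetical placeholder for training and testing one binary classifier; it is not a function of this package.

```python
from itertools import combinations

def average_accuracy(evaluate_pair, digits=range(10)):
    """Run one binary test per unordered digit pair and average the accuracies."""
    pairs = list(combinations(digits, 2))
    accuracies = [evaluate_pair(digit1, digit2) for digit1, digit2 in pairs]
    return len(pairs), sum(accuracies) / len(accuracies)

# Dummy evaluator (placeholder): pretend every binary test is perfect.
num_tests, mean_accuracy = average_accuracy(lambda digit1, digit2: 1.0)
print(num_tests)
```

``combinations(range(10), 2)`` enumerates the same pairs as two nested loops where the second digit always starts above the first.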
This command tests all possible combinations of digits, which results in 45 different binary tests.
The accuracy of the individual tests and the final average accuracy over all tests are displayed on the console.
All experiments should run within several minutes of execution time.
The results of the above experiments should be the following (split into the classification accuracy on the training set and on the test set):
+------------+----------+----------+
| Experiment | Training | Test     |
+------------+----------+----------+
| 1          | 91.04 %  | 92.05 %  |
+------------+----------+----------+
| 2          | 100.0 %  | 95.35 %  |
+------------+----------+----------+
| 3          | 97.59 %  | 83.47 %  |
+------------+----------+----------+
| 4          | 99.04 %  | 86.25 %  |
+------------+----------+----------+
b) The following command can be used for the multivariate digits test:
Of course, you can try out different combinations of digits for experiments 1 and 2.
$ ./bin/mnist_multi.py
Because of the large number of samples and the multivariate problem, this test can take days on a normal system.
Use the ``-h`` option to see the options available for this command.
Getting Help
------------
In case you experience problems with the code, or with downloading the required databases and/or software, please contact manuel.guenther@idiap.ch or file a bug report under https://gitlab.idiap.ch/biometric/xbob-boosting.
parser.add_argument('-t', '--trainer-type', default='stump', choices=TRAINER.keys(), help="The type of weak trainer used for boosting.")
parser.add_argument('-l', '--loss-type', choices=LOSS.keys(), help="The type of loss function used in boosting to compute the weights for the weak classifiers.")
parser.add_argument('-r', '--number-of-boosting-rounds', type=int, default=100, help="The number of boosting rounds, i.e., the number of weak classifiers.")
parser.add_argument('-s', '--feature-selection-style', default='independent', choices={'independent', 'shared'}, help="The feature selection style (only for multivariate classification with the LUT trainer).")
parser.add_argument('-d', '--digits', type=int, nargs="+", choices=range(10), default=[5, 6], help="Select the digits you want to compare.")
parser.add_argument('-a', '--all-digits', action='store_true', help="Use all digits")
"""The test script to perform the multivariate classification on the digits from the MNIST dataset.
The MNIST data is exported using the xbob.db.mnist module, which provides the train and test
partitions for the digits. LBP features are extracted, and the available classification
algorithm is LUT-based boosting.
"""
import xbob.db.mnist
import numpy
import sys, getopt
import argparse
import string
import bob
from ..util import confusion
from ..features import local_feature
from ..core import boosting
import matplotlib.pyplot

def main():
    parser = argparse.ArgumentParser(description="The arguments for the boosting.")
    parser.add_argument('-r', default=20, dest="num_rnds", type=int, help="The number of rounds for the boosting")
    parser.add_argument('-l', default='exp', dest="loss_type", type=str, choices={'log', 'exp'}, help="The type of the loss function. Logit and Exponential functions are the available options")
    parser.add_argument('-s', default='indep', dest="selection_type", choices={'indep', 'shared'}, type=str, help="The feature selection type for the LUT based trainer. For the multivariate case the features can be selected by sharing or independently")
    parser.add_argument('-n', default=256, dest="num_entries", type=int, help="The number of entries in the LookUp table. It is the range of the feature values, e.g. if LBP features are used this value is 256.")
    parser.add_argument('-f', default='lbp', dest="feature_type", type=str, choices={'lbp', 'mlbp', 'tlbp', 'dlbp'}, help="The type of LBP features to be extracted from the image to perform the classification. The features are extracted from blocks of varying scales")
    parser.add_argument('-d', default=10, dest="num_digits", type=int, help="The number of digits to be considered for classification.")
    args = parser.parse_args()
    # download the dataset
    db_object = xbob.db.mnist.Database()
    # Take the number of digits from the command line; the image size is hardcoded
    num_digits = args.num_digits
    img_size = 28
    # get the data (features and labels) for the selected digits from the xbob_db_mnist class functions
parser.add_argument('-t', default='StumpTrainer', dest="trainer_type", type=str, choices={'StumpTrainer', 'LutTrainer'}, help="This is the type of trainer used for the boosting.")
parser.add_argument('-r', default=20, dest="num_rnds", type=int, help="The number of rounds for the boosting")
parser.add_argument('-l', default='exp', dest="loss_type", type=str, choices={'log', 'exp'}, help="The type of the loss function. Logit and Exponential functions are the available options")
parser.add_argument('-s', default='indep', dest="selection_type", choices={'indep', 'shared'}, type=str, help="The feature selection type for the LUT based trainer. For the multivariate case the features can be selected by sharing or independently")
parser.add_argument('-n', default=256, dest="num_entries", type=int, help="The number of entries in the LookUp table. It is the range of the feature values, e.g. if LBP features are used this value is 256.")
args = parser.parse_args()
# Initializations
accu = 0
test_num = 0
# download the dataset
db_object = xbob.db.mnist.Database()
# select the digits to classify
for digit1 in range(10):
    for digit2 in range(digit1 + 1, 10):
        test_num = test_num + 1
        # get the data (features and labels) for the selected digits from the xbob_db_mnist class functions
parser.add_argument('-t', default='StumpTrainer', dest="trainer_type", type=str, choices={'StumpTrainer', 'LutTrainer'}, help="This is the type of trainer used for the boosting.")
parser.add_argument('-r', default=20, dest="num_rnds", type=int, help="The number of rounds for the boosting")
parser.add_argument('-l', default='exp', dest="loss_type", type=str, choices={'log', 'exp'}, help="The type of the loss function. Logit and Exponential functions are the available options")
parser.add_argument('-s', default='indep', dest="selection_type", choices={'indep', 'shared'}, type=str, help="The feature selection type for the LUT based trainer. For the multivariate case the features can be selected by sharing or independently")
parser.add_argument('-n', default=256, dest="num_entries", type=int, help="The number of entries in the LookUp table. It is the range of the feature values, e.g. if LBP features are used this value is 256.")
parser.add_argument('-d1', default=1, dest="digit1", type=int, choices={0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, help="The first digit for the classification test.")
parser.add_argument('-d2', default=2, dest="digit2", type=int, choices={0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, help="The second digit for the classification test.")