Commit 4e95326a authored by Manuel Günther's avatar Manuel Günther
Browse files

Fixed conceptual bug in StumpTrainer; added tests for boosting; added MNIST example.

parent 6b00b655
=========================================================================================
Generalized Boosting Framework using Stump and Look Up Table (LUT) based Weak Classifier
=========================================================================================

The package implements a generalized boosting framework, which incorporates different boosting approaches.
The Boosting algorithms implemented in this package are
1) Gradient Boost (generalized version of Adaboost) for univariate cases
2) TaylorBoost for univariate and multivariate cases
The weak classifiers associated with these boosting algorithms are
1) Stump classifiers
2) LUT based classifiers
Check the following references for the details:
1. Viola, Paul, and Michael J. Jones. "Robust real-time face detection."
International journal of computer vision 57.2 (2004): 137-154.
2. Saberian, Mohammad J., Hamed Masnadi-Shirazi, and Nuno Vasconcelos. "Taylorboost:
First and second-order boosting algorithms with explicit margin control." Computer
Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
3. Cosmin Atanasoaei, "Multivariate Boosting with Look Up Table for face processing",
PhD thesis (2012).
Testdata:
---------
The tests are performed on the MNIST digits dataset. The tests can be mainly divided into
two categories:

1) Univariate Test: It corresponds to the binary classification problem. The digits are tested
one-vs-one and one-vs-all. Both boosting algorithms (Gradient Boost and TaylorBoost)
can be used for testing this scenario.

2) Multivariate Test: It is the multi-class classification problem. The classification of all
10 digits is considered in a single test. Only multivariate TaylorBoost can be used for testing this scenario.

Installation:
-------------
Once you have downloaded the package, use the following two commands to install it:

$ python bootstrap.py
$ ./bin/buildout

These two commands should download and install all non-installed dependencies and get you a fully operational test and development environment.
Example
-------
To show an exemplary usage of the boosting algorithm, the binary and multi-variate classification of hand-written digits from the MNIST database is performed.
For simplicity, we just use the pixel gray values as (discrete) features to classify the digits.
In each boosting round, a single pixel location is selected.
In case of the stump classifier, this pixel value is compared to a threshold (which is determined during training), and one of the two classes is assigned.
In case of the LUT, for each value of the pixel the most probable digit is determined.
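As a rough illustration of these two decision rules (the helper functions and the numbers below are purely illustrative and are not part of the package API), a single weak classifier on an 8-bit pixel value reduces to something like::

  import numpy

  def stump_predict(pixel, threshold, polarity):
    # stump: compare the selected pixel to the learned threshold
    return polarity * (1. if pixel > threshold else -1.)

  def lut_predict(pixel, lut):
    # LUT: look up the prediction stored for this pixel value (256 entries for 8-bit pixels)
    return lut[pixel]

  # toy usage with made-up parameters
  print stump_predict(42, threshold=15.5, polarity=1.)     # -> 1.0
  lut = numpy.where(numpy.arange(256) >= 32, 1., -1.)
  print lut_predict(42, lut)                                # -> 1.0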
The script ``./bin/boosting_example.py`` is provided to run all of the different examples.
This script has several command line parameters, which vary the behavior of the training and/or testing procedure.
All parameters have a long version (starting with ``--``) and a shortcut (starting with a single ``-``).
These parameters are (see also ``./bin/boosting_example.py --help``):
To control the type of training, you can select:
* ``--trainer-type``: Select the type of weak classifier. Possible values are ``stump`` and ``lut``
* ``--loss-type``: Select the loss function. Possible values are ``tan``, ``log`` and ``exp``. By default, a loss function suitable to the trainer type is selected.
* ``--number-of-boosting-rounds``: The number of weak classifiers to select.
* ``--multi-variate`` (only valid for LUT trainer): Perform multi-variate classification, or binary (one-to-one) classification.
* ``--feature-selection-style`` (only valid for multi-variate training): Select the features for each output ``independent``ly, or use ``shared`` features for all outputs.
To control the experimentation, you can choose:
* ``--digits``: The digits to classify. For multi-variate training, one classifier is trained for all given digits, while for uni-variate training all possible one-to-one classifiers are trained.
* ``--all-digits``: Select all 10 digits.
* ``--classifier-file``: Save the trained classifier(s) into the given file and/or read the classifier(s) from this file.
* ``--force``: Overwrite the given classifier file if it already exists.
For information and debugging purposes, it might be interesting to use:
* ``--verbose`` (can be used several times): Increases the verbosity level from 0 (error) through 1 (warning) and 2 (info) to 3 (debug). Verbosity level 2 (``-vv``) is recommended.
* ``--number-of-elements``: Reduce the number of elements per class (digit) to the given value.
Four different kinds of experiments can be performed:
1. Uni-variate classification using the stump trainer:
$ ./bin/boosting_example.py -vv --trainer-type stump --digits 5 6 --classifier-file stump.hdf5
2. Uni-variate classification using the LUT trainer:
$ ./bin/boosting_example.py -vv --trainer-type lut --digits 5 6 --classifier-file lut_uni.hdf5
3. Multi-variate classification using LUT training and shared features:
$ ./bin/boosting_example.py -vv --trainer-type lut --all-digits --multi-variate --feature-selection-style shared --classifier-file lut_shared.hdf5
4. Multi-variate classification using LUT training and independent features:
$ ./bin/boosting_example.py -vv --trainer-type lut --all-digits --multi-variate --feature-selection-style independent --classifier-file lut_independent.hdf5
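The trained strong classifier(s) are stored in (and can be re-loaded from) the given ``--classifier-file``. A minimal sketch of re-loading and applying one classifier, based on the loading code used inside the example script (the file name ``stump.hdf5`` and the group name ``5-vs-6`` assume experiment 1 above with the default digits)::

  import bob
  import numpy
  import xbob.boosting

  # uni-variate classifiers are stored in HDF5 groups named after the digit pair
  hdf5 = bob.io.HDF5File("stump.hdf5", 'r')
  hdf5.cd("5-vs-6")
  machine = xbob.boosting.BoostedMachine(hdf5)

  # classify flattened 28x28 images given as uint16 pixel values
  test_images = numpy.zeros((1, 784), numpy.uint16)  # replace with real MNIST images
  scores = numpy.zeros((1,))
  labels = numpy.zeros((1,))
  machine(test_images, scores, labels)
  print scores[0], labels[0]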
User Guide
----------
......@@ -55,10 +91,10 @@ User Guide
This section explains how to use the package in order to: a) test the MNIST dataset for binary classification,
and b) test the dataset for multi-class classification.
a) The following command will run a single binary test for the digits specified and display the classification
accuracy on the console:
$ ./bin/mnist_binary_one.py
If you want to see all the options associated with this command, type:
......@@ -66,16 +102,16 @@ if you want to see all the option associated with the command type:
To run the tests for all combinations of the ten digits, use the following command:
$ ./bin/mnist_binary_all.py
This command tests all possible combinations of digits, which results in 45 different binary tests (see the short check below). The
accuracy of the individual tests and the final average accuracy over all tests are displayed on the console.
b) The following command can be used for the multivariate digits test:
$ ./bin/mnist_multi.py
Because of the large number of samples and the multivariate problem, this test can take days to run on a normal system. Use the -h
option to see the different options available with this command.
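The number 45 mentioned above is simply the number of unordered digit pairs; as a quick check (illustrative snippet only)::

  # 45 = C(10,2): one binary test per unordered pair of the ten digits
  from itertools import combinations
  print len(list(combinations(range(10), 2)))  # prints 45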
......@@ -121,7 +121,11 @@ setup(
# scripts should be declared using this entry:
'console_scripts': [
  'boosting_example.py = xbob.boosting.examples.mnist:main',
  # 'mnist_binary_all.py = xbob.boosting.scripts.mnist_binary_all:main',
  # 'mnist_binary_one.py = xbob.boosting.scripts.mnist_binary_one:main',
  # 'mnist_multi.py = xbob.boosting.scripts.mnist_multi:main',
],
# tests that are _exported_ (that can be executed by other packages) can
# be signalized like this:
......
......@@ -5,4 +5,5 @@ from ._boosting import StumpMachine, LUTMachine, BoostedMachine, weighted_histog
import trainer
import loss
#import examples
#import tests
......@@ -32,6 +32,12 @@ void StumpMachine::forward3(const blitz::Array<double, 2>& features, blitz::Arra
}
}
void StumpMachine::forward4(const blitz::Array<double, 2>& features, blitz::Array<double,2> predictions) const{
for (int i = features.extent(0); i--;){
predictions(i,0) = _predict(features(i, (int)m_index));
}
}
double StumpMachine::forward1(const blitz::Array<uint16_t, 1>& features) const{
return _predict(features((int)m_index));
......@@ -43,6 +49,12 @@ void StumpMachine::forward3(const blitz::Array<uint16_t, 2>& features, blitz::Ar
}
}
void StumpMachine::forward4(const blitz::Array<uint16_t, 2>& features, blitz::Array<double,2> predictions) const{
for (int i = features.extent(0); i--;){
predictions(i,0) = _predict(features(i, (int)m_index));
}
}
blitz::Array<int32_t,1> StumpMachine::getIndices() const{
blitz::Array<int32_t, 1> ret(1);
......
......@@ -24,6 +24,10 @@ class StumpMachine : public WeakMachine{
virtual void forward3(const blitz::Array<double, 2>& features, blitz::Array<double,1> predictions) const;
virtual void forward3(const blitz::Array<uint16_t, 2>& features, blitz::Array<double,1> predictions) const;
// forwarding of multiple features
virtual void forward4(const blitz::Array<double, 2>& features, blitz::Array<double,2> predictions) const;
virtual void forward4(const blitz::Array<uint16_t, 2>& features, blitz::Array<double,2> predictions) const;
// the index used by this machine
virtual blitz::Array<int32_t,1> getIndices() const;
......
......@@ -17,9 +17,11 @@ using namespace boost::python;
// Stump machine access
static double f11(StumpMachine& s, const blitz::Array<double,1>& f){return s.forward1(f);}
static void f12(StumpMachine& s, const blitz::Array<double,2>& f, blitz::Array<double,1> p){s.forward3(f,p);}
static void f13(StumpMachine& s, const blitz::Array<double,2>& f, blitz::Array<double,2> p){s.forward4(f,p);}
static double f21(StumpMachine& s, const blitz::Array<uint16_t,1>& f){return s.forward1(f);}
static void f22(StumpMachine& s, const blitz::Array<uint16_t,2>& f, blitz::Array<double,1> p){s.forward3(f,p);}
static void f23(StumpMachine& s, const blitz::Array<uint16_t,2>& f, blitz::Array<double,2> p){s.forward4(f,p);}
// boosted machine access, which allows multi-threading
static double forward1(const BoostedMachine& self, const blitz::Array<uint16_t, 1>& features){bob::python::no_gil t; return self.forward1(features);}
......@@ -79,8 +81,10 @@ BOOST_PYTHON_MODULE(_boosting) {
.def(init<bob::io::HDF5File&>((arg("self"),arg("file")), "Creates a new machine from file."))
.def("__call__", &f11, (arg("self"), arg("features")), "Returns the prediction for the given feature vector.")
.def("__call__", &f12, (arg("self"), arg("features"), arg("predictions")), "Computes the predictions for the given feature set (uni-variate only).")
.def("__call__", &f13, (arg("self"), arg("features"), arg("predictions")), "Computes the predictions for the given feature set (uni-variate only).")
.def("__call__", &f21, (arg("self"), arg("features")), "Returns the prediction for the given feature vector.")
.def("__call__", &f22, (arg("self"), arg("features"), arg("predictions")), "Computes the predictions for the given feature set (uni-variate only).")
.def("__call__", &f23, (arg("self"), arg("features"), arg("predictions")), "Computes the predictions for the given feature set (uni-variate only).")
.def("load", &StumpMachine::load, "Reads a Machine from file")
.def("save", &StumpMachine::save, "Writes the machine to file")
......
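For reference, a short usage sketch of the newly bound two-dimensional prediction overload of ``StumpMachine`` (the file name and array shapes are illustrative assumptions; the constructor from an ``HDF5File`` is the one exposed in the bindings above)::

  import bob
  import numpy
  import xbob.boosting

  # load a previously saved stump machine from an HDF5 file
  hdf5 = bob.io.HDF5File("stump_machine.hdf5", 'r')
  stump = xbob.boosting.StumpMachine(hdf5)

  # the new overload writes one prediction per sample into column 0 of a 2D array
  features = numpy.zeros((10, 784), numpy.uint16)     # 10 flattened test images
  predictions = numpy.zeros((10, 1), numpy.float64)
  stump(features, predictions)
  print predictions[:, 0]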
......@@ -10,13 +10,16 @@ Thus it conducts only one binary classifcation test.
"""
import numpy
import argparse
import os
import bob
import xbob.db.mnist
import xbob.boosting
import logging
logger = logging.getLogger('bob')
TRAINER = {
'stump' : xbob.boosting.trainer.StumpTrainer,
......@@ -33,17 +36,91 @@ def command_line_arguments():
"""Defines the command line options."""
parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument('-t', '--trainer-type', default = 'stump', choices = TRAINER.keys(), help = "The type of weak trainer used for boosting." )
parser.add_argument('-l', '--loss-type', default = 'exp', choices = LOSS.keys(), help = "The type of loss function used in boosting to compute the weights for the weak classifiers.")
parser.add_argument('-r', '--number-of-boosting-rounds', type = int, default = 20, help = "The number of boosting rounds, i.e., the number of weak classifiers.")
parser.add_argument('-l', '--loss-type', choices = LOSS.keys(), help = "The type of loss function used in boosting to compute the weights for the weak classifiers.")
parser.add_argument('-r', '--number-of-boosting-rounds', type = int, default = 100, help = "The number of boosting rounds, i.e., the number of weak classifiers.")
parser.add_argument('-m', '--multi-variate', action = 'store_true', help = "Perform multi-variate training?")
parser.add_argument('-s', '--feature-selection-style', default = 'independent', choices = {'independent', 'shared'}, help = "The feature selection style (only for multivariate classification with the LUT trainer).")
parser.add_argument('-d', '--digits', type = int, nargs="+", choices=range(10), default=[5,6], help = "Select the digits you want to compare.")
parser.add_argument('-a', '--all-digits', action='store_true', help = "Use all digits")
parser.add_argument('-n', '--number-of-elements', type = int, help = "For testing purposes: limit the number of training and test examples for each class.")
parser.add_argument('-c', '--classifier-file', help = "If selected, the strong classifier will be stored in this file (or loaded from it if it already exists).")
return parser.parse_args()
parser.add_argument('-F', '--force', action='store_true', help = "Re-train the strong classifier, even if the --classifier-file already exists.")
parser.add_argument('-v', '--verbose', action = 'count', default = 0, help = "Increase the verbosity level (up to three times)")
args = parser.parse_args()
if args.trainer_type == 'stump' and args.multi_variate:
raise ValueError("The stump trainer cannot handle multi-variate training.")
if args.all_digits:
args.digits = range(10)
if len(args.digits) < 2:
raise ValueError("Please select at least two digits to classify, or --all to classify all digits")
if args.loss_type is None:
args.loss_type = 'exp' if args.trainer_type == 'stump' else 'log'
logger.setLevel({
0: logging.ERROR,
1: logging.WARNING,
2: logging.INFO,
3: logging.DEBUG
}[args.verbose])
return args
def align(input, output, digits, multi_variate = False):
if multi_variate:
# just one classifier, with multi-variate output
input = numpy.vstack(input).astype(numpy.uint16)
# create output data
target = - numpy.ones((input.shape[0], len(output)))
output = numpy.hstack(output)
for i,d in enumerate(digits):
target[output == d, i] = 1
return {'multi' : (input, target)}
else:
# create pairs of one-to-one classifiers
problems = {}
for i, d1 in enumerate(digits):
for j, d2 in enumerate(digits[i+1:]):
key = "%d-vs-%d" % (d1, d2)
# d2 is digits[i+j+1] in the full digit list, so stack the matching input block
cur_input = numpy.vstack([input[i], input[i+j+1]]).astype(numpy.uint16)
target = numpy.ones((cur_input.shape[0]))
target[output[i].shape[0]:target.shape[0]] = -1
problems[key] = (cur_input, target)
return problems
def read_data(db, which, digits, count, multi_variate):
input = []
output = []
for d in digits:
digit_data = db.data(which, labels = d)
if count is not None:
digit_data = (digit_data[0][:count], digit_data[1][:count])
input.append(digit_data[0])
output.append(digit_data[1])
return align(input, output, digits, multi_variate)
def performance(targets, labels, key, multi_variate):
difference = targets == labels
if multi_variate:
sum = numpy.sum(difference, 1)
print "Classified", numpy.sum(sum == difference.shape[1]), "of", difference.shape[0], "elements correctly"
accuracy = float(numpy.sum(sum == difference.shape[1])) / difference.shape[0]
else:
print "Classified", numpy.sum(difference), "of", difference.shape[0], "elements correctly"
accuracy = float(numpy.sum(difference)) / difference.shape[0]
print "The classification accuracy for", key, "is", accuracy * 100, "%"
def main():
......@@ -51,55 +128,84 @@ def main():
args = command_line_arguments()
# open connection to the MNIST database
db = xbob.db.mnist.Database()
db = xbob.db.mnist.Database("Database")
# perform training, if desired
if args.force and os.path.exists(args.classifier_file):
os.remove(args.classifier_file)
if args.classifier_file is None or not os.path.exists(args.classifier_file):
# get the training data
training_features, training_labels = db_object.data('train', labels = args.digits)
print training_labels
fea_test, label_test = db_object.data('test', labels = args.digits)
# Format the label data into int and change the class labels to -1 and +1
label_train = label_train.astype(int)
label_test = label_test.astype(int)
label_train[label_train == digit1] = 1
label_test[label_test == digit1] = 1
label_train[label_train == digit2] = -1
label_test[label_test == digit2] = -1
print label_train.shape
print label_test.shape
# Initialize the trainer with 'LutTrainer' or 'StumpTrainer'
boost_trainer = boosting.Boost(args.trainer_type)
# Set the parameters for the boosting
boost_trainer.num_rnds = args.num_rnds
boost_trainer.loss_type = args.loss_type
boost_trainer.selection_type = args.selection_type
boost_trainer.num_entries = args.num_entries
# Perform boosting of the feature set samp
machine = boost_trainer.train(fea_train, label_train)
# Classify the test samples (testsamp) using the boosited classifier generated above
pred_scores, prediction_labels = machine.classify(fea_test)
# calculate the accuracy in percentage for the curent classificaiton test
#label_test = label_test[:,numpy.newaxis]
accuracy = 100*float(sum(prediction_labels == label_test))/(len(label_test))
print "The accuracy of binary classification test with digits %d and %d is %f " % (digit1, digit2, accuracy)
# get the (aligned) training data
logger.info("Reading training data")
training_data = read_data(db, "train", args.digits, args.number_of_elements, args.multi_variate)
# get weak trainer according to command line options
if args.trainer_type == 'stump':
weak_trainer = xbob.boosting.trainer.StumpTrainer()
elif args.trainer_type == 'lut':
weak_trainer = xbob.boosting.trainer.LUTTrainer(
256,
training_data.values()[0][0].shape[1],
training_data.values()[0][1].shape[1] if args.multi_variate else 1,
args.feature_selection_style
)
# get the loss function
loss_function = LOSS[args.loss_type]()
# create strong trainer
trainer = xbob.boosting.trainer.Boosting(weak_trainer, loss_function, args.number_of_boosting_rounds)
strong_classifiers = {}
for key in sorted(training_data.keys()):
training_input, training_target = training_data[key]
if args.multi_variate:
logger.info("Starting training with %d training samples and %d outputs" % (training_target.shape[0], training_target.shape[1]))
else:
logger.info("Starting training with %d training samples for %s" % (training_target.shape[0], key))
# and train the strong classifier
strong_classifier = trainer.train(training_input, training_target)
# write strong classifier to file
if args.classifier_file is not None:
hdf5 = bob.io.HDF5File(args.classifier_file, 'a')
hdf5.create_group(key)
hdf5.cd(key)
strong_classifier.save(hdf5)
del hdf5
strong_classifiers[key] = strong_classifier
# compute training performance
logger.info("Evaluating training data")
scores = numpy.zeros(training_target.shape)
labels = numpy.zeros(training_target.shape)
strong_classifier(training_input, scores, labels)
performance(training_target, labels, key, args.multi_variate)
else:
# read strong classifier from file
strong_classifiers = {}
hdf5 = bob.io.HDF5File(args.classifier_file, 'r')
for key in hdf5.sub_groups(relative=True, recursive=False):
hdf5.cd(key)
strong_classifiers[key] = xbob.boosting.BoostedMachine(hdf5)
hdf5.cd("..")
logger.info("Reading test data")
test_data = read_data(db, "test", args.digits, args.number_of_elements, args.multi_variate)
for key in sorted(test_data.keys()):
test_input, test_target = test_data[key]
logger.info("Classifying %d test samples for %s" % (test_target.shape[0], key))
# classify test samples
scores = numpy.zeros(test_target.shape)
labels = numpy.zeros(test_target.shape)
strong_classifiers[key](test_input, scores, labels)
performance(test_target, labels, key, args.multi_variate)
......
#!/usr/bin/env python
"""The test script to perform the binary classification on the digits from the MNIST dataset.
The MNIST data is exported using the xbob.db.mnist module, which provides the train and test
partitions for the digits. Pixel values of grey-scale images are used as features and the
available algorithms for classification are LUT-based boosting and stump-based boosting.
The script tests all possible combinations of two digits, which results in 45 different
binary classification tests.
"""
......@@ -23,7 +23,7 @@ def main():
parser = argparse.ArgumentParser(description = " The arguments for the boosting. ")
parser.add_argument('-t', default = 'StumpTrainer',dest = "trainer_type", type = string, choices = {'StumpTrainer', 'LutTrainer'}, help = "This is the type of trainer used for the boosting." )
parser.add_argument('-r', default = 20, dest = "num_rnds", type = string , help = "The number of round for the boosting")
parser.add_argument('-l', default = 'exp', dest = "loss_type", type= string,choices = {'log','exp'} help = "The type of the loss function. Logit and Exponential functions are the avaliable options")
parser.add_argument('-l', default = 'exp', dest = "loss_type", type= string, choices = {'log','exp'}, help = "The type of the loss function. Logit and Exponential functions are the avaliable options")
parser.add_argument('-s', default = 'indep', dest = "selection_type", choices = {'indep', 'shared'}, type = string, help = "The feature selection type for the LUT based trainer. For multivarite case the features can be selected by sharing or independently ")
parser.add_argument('-n', default = 256, dest = "num_entries", type = int, help = "The number of entries in the LookUp table. It is the range of the feature values, e.g. if LBP features are used this values is 256.")
......@@ -60,13 +60,13 @@ def main():
boost_trainer = booster.Boost(args.trainer_type)
# Set the parameters for the boosting
boost_trainer.num_rnds = args.num_rnds
boost_trainer.loss_type = args.loss_type
boost_trainer.selection_type = args.selection_type
boost_trainer.num_entries = args.num_entries
# Perform boosting of the feature set samp
model = boost_trainer.train(fea_train, label_train)
# Classify the test samples (testsamp) using the boosited classifier generated above
......
......@@ -3,13 +3,181 @@ import xbob.boosting
import numpy
import bob
import xbob.db.mnist
class TestBoosting(unittest.TestCase):
"""Class to test the LUT trainer """
@classmethod
def setUpClass(self):
# create a single copy of the MNIST database to avoid downloading the packages several times
self.database = xbob.db.mnist.Database("Database")
#self.database = xbob.db.mnist.Database()
@classmethod
def tearDownClass(self):
# Clean up the mess that we created
del self.database
def _data(self, digits = [3, 0], count = 20):
# get the data
inputs, targets = [], []
for digit in digits:
input, target = self.database.data(labels = digit)
inputs.append(input[:count])
targets.append(target[:count])
return numpy.vstack(inputs), numpy.hstack(targets)
def _align_uni(self, targets):
# align target data to be used in a uni-variate classification
aligned = numpy.ones(targets.shape)
aligned[targets != targets[0]] = -1
return aligned
def _align_multi(self, targets, digits):
aligned = - numpy.ones((targets.shape[0], len(digits)))
for i, d in enumerate(digits):
aligned[targets==d, i] = 1
return aligned
def test01_stump_boosting(self):
# get test input data
inputs, targets = self._data()
aligned = self._align_uni(targets)
# for stump trainers, the exponential loss function is preferred
loss_function = xbob.boosting.loss.ExponentialLoss()
weak_trainer = xbob.boosting.trainer.StumpTrainer()
booster = xbob.boosting.trainer.Boosting(weak_trainer, loss_function, number_of_rounds=1)
# perform boosting
machine = booster.train(inputs.astype(numpy.float64), aligned)
# check the result
weight = 1.83178082
self.assertEqual(machine.weights.shape, (1,1))
self.assertAlmostEqual(machine.weights, -weight)
self.assertEqual(len(machine.weak_machines), 1)
self.assertEqual(machine.indices, [483])
weak = machine.weak_machines[0]
self.assertTrue(isinstance(weak, xbob.boosting.StumpMachine))
self.assertEqual(weak.threshold, 15.5)
self.assertEqual(weak.polarity, 1.)
# check first training image
single = machine(inputs[0].astype(numpy.uint16))
self.assertAlmostEqual(single, weight)
# check all training images
scores = numpy.ndarray(aligned.shape)
labels = numpy.ndarray(aligned.shape)
machine(inputs.astype(numpy.uint16), scores, labels)
# assert that 39 (out of 40) labels are correctly classified by a single feature position
self.assertTrue(numpy.allclose(labels * scores, weight))
self.assertEqual(numpy.count_nonzero(labels == aligned), 39)