.. vim: set fileencoding=utf-8 :
.. date: Thu Sep 20 11:58:57 CEST 2012

.. _bob.bio.vein.baselines:

===============================
 Executing Baseline Algorithms
===============================

The first thing you might want to do is to execute one of the vein
recognition algorithms that are implemented in ``bob.bio.vein``.


Running Baseline Experiments
----------------------------

To run the baseline experiments, you can use the ``verify.py`` script by
just going to the console and typing:

.. code-block:: sh

   $ verify.py


This script is explained in more detail in :ref:`bob.bio.base.experiments`.
The ``verify.py --help`` option shows you which other options you can set.

Usually it is a good idea to have at least verbose level 2 (i.e., calling
``verify.py --verbose --verbose``, or the short version ``verify.py -vv``).

.. note:: **Running in Parallel**

   To run the experiments in parallel, you can define an SGE grid or local host
   (multi-processing) configurations as explained in
   :ref:`running_in_parallel`.

   In short, to run in the Idiap SGE grid, you can simply add the ``--grid``
   command line option, without parameters. To run experiments in parallel on
   the local machine, simply add a ``--parallel <N>`` option, where ``<N>``
   specifies the number of parallel jobs you want to execute.


Database setups and baselines are encoded using
:ref:`bob.bio.base.configuration-files`, all stored inside the package root, in
the directory ``bob/bio/vein/configurations``. Documentation for each resource
is available on the section :ref:`bob.bio.vein.resources`.

.. warning::

   You **cannot** run experiments just by executing the command line
   instructions described in this guide. You **need first** to procure yourself
   the raw data files that correspond to *each* database used here in order to
   correctly run experiments with those data. Biometric data is considered
   private data and, under EU regulations, cannot be distributed without a
   consent or license. You may consult our
   :ref:`bob.bio.vein.resources.databases` resources section for checking
   currently supported databases and accessing download links for the raw data
   files.

   Once the raw data files have been downloaded, particular attention should be
   given to the directory locations of those. Unpack the databases carefully
   and annotate the root directory where they have been unpacked.

   Then, carefully read the *Databases* section of
   :ref:`bob.bio.base.installation` on how to correctly setup the
   ``~/.bob_bio_databases.txt`` file.

   Use the following keywords on the left side of the assignment (see
   :ref:`bob.bio.vein.resources.databases`):

   .. code-block:: text

      [YOUR_VERAFINGER_DIRECTORY] = /complete/path/to/verafinger
      [YOUR_UTFVP_DIRECTORY] = /complete/path/to/utfvp
      [YOUR_FV3D_DIRECTORY] = /complete/path/to/fv3d

   Notice it is rather important to use the strings as described above,
   otherwise ``bob.bio.base`` will not be able to correctly load your images.

   Once this step is done, you can proceed with the instructions below.

In the remainder of this section we introduce baseline experiments you can
readily run with this tool without further configuration. Baselines exemplified
in this guide were published in [TVM14]_.


Repeated Line-Tracking with Miura Matching
==========================================

Detailed description at :ref:`bob.bio.vein.resources.recognition.rlt`.

To run the baseline on the `VERA fingervein`_ database, using the ``Nom``
protocol, do the following:
.. code-block:: sh

   $ verify.py verafinger rlt -vv

.. tip::

   If you have more processing cores on your local machine and don't want to
   submit your job for SGE execution, you can run it in parallel (using 4
   parallel tasks) by adding the options ``--parallel=4 --nice=10``. **Before**
   doing so, make sure the package gridtk_ is properly installed.

   Optionally, you may use the ``parallel`` resource configuration which
   already sets the number of parallel jobs to the number of hardware cores you
   have installed on your machine (as with
   :py:func:`multiprocessing.cpu_count`) and sets ``nice=10``. For example:

   .. code-block:: sh

      $ verify.py verafinger rlt parallel -vv

   To run on the Idiap SGE grid using our stock
   io-big-48-slots-4G-memory-enabled (see
   :py:mod:`bob.bio.vein.configurations.gridio4g48`) configuration, use:

   .. code-block:: sh

      $ verify.py verafinger rlt grid -vv

   You may also, optionally, use the configuration resource ``gridio4g48``,
   which is just an alias of ``grid`` in this package.



This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.rlt`

As the tool runs, you'll see printouts that show how it advances through
preprocessing, feature extraction and matching. On a 4-core machine, using 4
parallel tasks, it takes around 4 hours to process this baseline with the
current code implementation.

To complete the evaluation, run the command below, which will output the equal
error rate (EER) and plot the detection error trade-off (DET) curve with the
performance:

.. code-block:: sh

   $ bob bio metrics <path-to>/verafinger/rlt/Nom/nonorm/scores-dev --no-evaluation
   [Min. criterion: EER ] Threshold on Development set `scores-dev`: 0.31835292
   ======  ========================
   None    Development scores-dev
   ======  ========================
   FtA     0.0%
   FMR     23.6% (11388/48180)
   FNMR    23.6% (52/220)
   FAR     23.6%
   FRR     23.6%
   HTER    23.6%
   ======  ========================
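
For reference, the EER criterion shown above picks the score threshold at which
the false-match rate (FMR) and the false-non-match rate (FNMR) are closest, and
the half-total error rate (HTER) is their average at that threshold. The sketch
below, on made-up genuine and impostor score lists, restates that computation;
it is an illustration only, not the actual ``bob.measure`` implementation:

```python
import numpy as np

def eer(genuine, impostor):
    """Return (threshold, EER estimate) where the false-match rate
    (impostors scoring >= threshold) and the false-non-match rate
    (genuine scoring < threshold) are closest."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    best = None
    for t in np.sort(np.concatenate((genuine, impostor))):
        fmr = float(np.mean(impostor >= t))   # impostors wrongly accepted
        fnmr = float(np.mean(genuine < t))    # genuine wrongly rejected
        if best is None or abs(fmr - fnmr) < best[0]:
            best = (abs(fmr - fnmr), t, (fmr + fnmr) / 2.0)
    return best[1], best[2]

# Perfectly separable scores yield an EER of zero
threshold, rate = eer([0.8, 0.9], [0.1, 0.2])
```

With overlapping score distributions, as in the real experiments above, the
returned rate is the (non-zero) development-set EER reported by ``bob bio
metrics``.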


Maximum Curvature with Miura Matching
=====================================

Detailed description at :ref:`bob.bio.vein.resources.recognition.mc`.

To run the baseline on the `VERA fingervein`_ database, using the ``Nom``
protocol like above, do the following:

.. code-block:: sh

   $ verify.py verafinger mc -vv


This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.mc`

On a 4-core machine, using 4 parallel tasks, it takes around 1 hour and 40
minutes to process this baseline with the current code implementation. Results
we obtained:

.. code-block:: sh

   $ bob bio metrics <path-to>/verafinger/mc/Nom/nonorm/scores-dev --no-evaluation
   [Min. criterion: EER ] Threshold on Development set `scores-dev`: 7.372830e-02
   ======  ========================
   None    Development scores-dev
   ======  ========================
   FtA     0.0%
   FMR     4.4% (2116/48180)
   FNMR    4.5% (10/220)
   FAR     4.4%
   FRR     4.5%
   HTER    4.5%
   ======  ========================


Wide Line Detector with Miura Matching
======================================

You can find the description of this method in the paper by Huang *et al.*
[HDLTL10]_.

To run the baseline on the `VERA fingervein`_ database, using the ``Nom``
protocol like above, do the following:

.. code-block:: sh

   $ verify.py verafinger wld -vv


This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.wld`

On a 4-core machine, using 4 parallel tasks, it takes only around 5 minutes
to process this baseline with the current code implementation. Results we
obtained:

.. code-block:: sh

   $ bob bio metrics <path-to>/verafinger/wld/Nom/nonorm/scores-dev --no-evaluation
   [Min. criterion: EER ] Threshold on Development set `scores-dev`: 2.402707e-01
   ======  ========================
   None    Development scores-dev
   ======  ========================
   FtA     0.0%
   FMR     9.8% (4726/48180)
   FNMR    10.0% (22/220)
   FAR     9.8%
   FRR     10.0%
   HTER    9.9%
   ======  ========================


Results for other Baselines
===========================

This package may generate results for other combinations of protocols and
databases. Here is a summary table for some variants (results expressed
correspond to the equal-error rate on the development set, in percentage):

======================== ====== ====== ====== ====== ======
       Toolchain              Vera Finger         UTFVP
------------------------ -------------------- -------------
   Feature Extractor      Full     B    Nom   1vsall  nom
======================== ====== ====== ====== ====== ======
Repeated Line Tracking    14.6   13.4   23.6   3.4    1.4
Wide Line Detector         5.8    5.6    9.9   2.8    1.9
Maximum Curvature          2.5    1.4    4.5   0.9    0.4
======================== ====== ====== ====== ====== ======

In a machine with 48 cores, running these baselines took the following time
(hh:mm):

======================== ====== ====== ====== ====== ======
       Toolchain              Vera Finger         UTFVP
------------------------ -------------------- -------------
   Feature Extractor      Full     B    Nom   1vsall  nom
======================== ====== ====== ====== ====== ======
Repeated Line Tracking    01:16  00:23  00:23  12:44  00:35
Wide Line Detector        00:07  00:01  00:01  02:25  00:05
Maximum Curvature         03:28  00:54  00:59  58:34  01:48
======================== ====== ====== ====== ====== ======


Modifying Baseline Experiments
------------------------------

It is fairly easy to modify baseline experiments available in this package. To
do so, you must copy the configuration files for the given baseline you want to
modify, edit them to make the desired changes and run the experiment again.

For example, suppose you'd like to change the protocol on the Vera Fingervein
database and use the protocol ``full`` instead of the default protocol ``nom``.
First, you identify where the configuration file sits:

.. code-block:: sh

   $ resources.py -tc -p bob.bio.vein
   - bob.bio.vein X.Y.Z @ /path/to/bob.bio.vein:
     + mc         --> bob.bio.vein.configurations.maximum_curvature
     + parallel   --> bob.bio.vein.configurations.parallel
     + rlt        --> bob.bio.vein.configurations.repeated_line_tracking
     + utfvp      --> bob.bio.vein.configurations.utfvp
     + verafinger --> bob.bio.vein.configurations.verafinger
     + wld        --> bob.bio.vein.configurations.wide_line_detector


The listing above tells us the ``verafinger`` configuration file sits in the
file ``/path/to/bob.bio.vein/bob/bio/vein/configurations/verafinger.py``. In
order to modify it, make a local copy. For example:

.. code-block:: sh

   $ cp /path/to/bob.bio.vein/bob/bio/vein/configurations/verafinger.py verafinger_full.py
   $ # edit verafinger_full.py, change the value of "protocol" to "full"


Also, don't forget to change all relative module imports (such as ``from
..database.verafinger import Database``) to absolute imports (e.g. ``from
bob.bio.vein.database.verafinger import Database``). This will make the
configuration file work irrespective of its location w.r.t. ``bob.bio.vein``.
The final version of the modified file could look like this:

.. code-block:: python

   from bob.bio.vein.database.verafinger import Database

   database = Database(original_directory='/where/you/have/the/raw/files',
     original_extension='.png', #don't change this
     )

   protocol = 'full'


Now, re-run the experiment using your modified database descriptor:

.. code-block:: sh

   $ verify.py ./verafinger_full.py wld -vv


Notice we replace the registered configuration file named ``verafinger`` with
the local file ``verafinger_full.py``. This makes ``verify.py`` take the
modified settings into consideration instead of the original ones.


Other Resources
---------------

This package contains other resources that can be used to evaluate different
bits of the vein processing toolchain.


Training the Watershed Finger region detector
=============================================

The correct detection of the finger boundaries is an important step of many
algorithms for the recognition of finger veins. It allows compensating for
possible rotation and scaling issues one might find when comparing models and
probes. In this package, we propose a novel finger boundary detector based on
the `Watershedding Morphological Algorithm
<https://en.wikipedia.org/wiki/Watershed_(image_processing)>`_. Watershedding
works in three steps:

1. Determine markers on the original image indicating the types of areas one
   would like to detect (e.g. "finger" or "background")
2. Determine a 2D (gray-scale) surface representing the original image in which
   darker spots (representing valleys) are more likely to be filled by
   surrounding markers. This is normally achieved by filtering the image with a
   high-pass filter like Sobel or using an edge detector such as Canny.
3. Run the watershed algorithm

In order to determine markers for step 1, we train a neural network which
outputs the likelihood of a point being part of a finger, given its coordinates
and values of surrounding pixels.
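
The three steps above can be sketched with an off-the-shelf marker-based
watershed. The example below uses :py:func:`scipy.ndimage.watershed_ift` on a
synthetic image with hand-placed markers; it only illustrates the principle and
stands in for the neural-network marker detector used by this package:

```python
import numpy as np
from scipy import ndimage

# Synthetic capture: a bright horizontal band (the "finger") on a dark
# background, loosely mimicking a NIR finger image
image = np.zeros((40, 60), dtype=np.uint8)
image[10:30, :] = 200

# Step 1: markers -- one seed inside the finger (label 1), seeds in the
# background (label 2); zeros are the unknowns to be flooded
markers = np.zeros(image.shape, dtype=np.int16)
markers[20, 30] = 1
markers[0, 0] = markers[-1, -1] = 2

# Step 2: a surface whose "ridges" sit on the finger boundary -- here the
# Sobel gradient magnitude, rescaled to 8 bits
gy = ndimage.sobel(image.astype(float), axis=0)
gx = ndimage.sobel(image.astype(float), axis=1)
surface = np.hypot(gy, gx)
surface = (255 * surface / surface.max()).astype(np.uint8)

# Step 3: flood the surface from the markers
labels = ndimage.watershed_ift(surface, markers)
finger_mask = labels == 1  # boolean finger region
```

The flooding from each seed stops where it meets the gradient "ridges", so the
finger seed claims exactly the band between the two boundaries.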

When used to run an experiment,
:py:class:`bob.bio.vein.preprocessor.WatershedMask` requires you provide a
*pre-trained* neural network model that presets the markers before
watershedding takes place. In order to create one, you can run the program
``bob_bio_vein_markdet.py``:

.. code-block:: sh

   $ bob_bio_vein_markdet.py --hidden=20 --samples=500 fv3d central dev

You input, as arguments to this application, the database, protocol and subset
name you wish to use for training the network. The data is loaded observing a
total maximum number of samples from the dataset (passed with ``--samples=N``),
the network is trained and recorded into an HDF5 file (by default, the file is
called ``model.hdf5``, but the name can be changed with the option
``--model=``).  Once you have a model, you can use the preprocessor mask by
constructing an object and attaching it to the
:py:class:`bob.bio.vein.preprocessor.Preprocessor` entry on your configuration.
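
In a configuration file, that attachment could look like the sketch below.
Beware that the constructor arguments shown are purely hypothetical; consult
the API documentation of :py:class:`bob.bio.vein.preprocessor.WatershedMask`
and :py:class:`bob.bio.vein.preprocessor.Preprocessor` for the actual
signatures:

```python
from bob.bio.vein.preprocessor import Preprocessor, WatershedMask

# "model.hdf5" is the file produced by bob_bio_vein_markdet.py; the keyword
# argument names below are hypothetical placeholders -- check the class
# documentation for the real parameters
preprocessor = Preprocessor(
    mask=WatershedMask(model='model.hdf5'),  # hypothetical arguments
    )
```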


Region of Interest Goodness of Fit
==================================

Automatic region of interest (RoI) finding and cropping can be evaluated using
a couple of scripts available in this package. The program
``bob_bio_vein_compare_rois.py`` compares two sets of ``preprocessed`` images
and masks, generated by *different* preprocessors (see
:py:class:`bob.bio.base.preprocessor.Preprocessor`) and calculates a few
metrics to help you determine how both techniques compare.  Normally, the
program is used to compare the result of automatic RoI to manually annotated
regions on the same images. To use it, just point it to the outputs of two
experiments representing the manually annotated regions and automatically
extracted ones. E.g.:

.. code-block:: sh

   $ bob_bio_vein_compare_rois.py ~/verafinger/mc_annot/preprocessed ~/verafinger/mc/preprocessed
   Jaccard index: 9.60e-01 +- 5.98e-02
   Intersection ratio (m1): 9.79e-01 +- 5.81e-02
   Intersection ratio of complement (m2): 1.96e-02 +- 1.53e-02


Values printed by the script correspond to the `Jaccard index`_
(:py:func:`bob.bio.vein.preprocessor.utils.jaccard_index`), as well as the
intersection ratio between the manual and automatically generated masks
(:py:func:`bob.bio.vein.preprocessor.utils.intersect_ratio`) and the ratio to
the complement of the intersection with respect to the automatically generated
mask
(:py:func:`bob.bio.vein.preprocessor.utils.intersect_ratio_of_complement`). You
can use the option ``-n 5`` to print the 5 worst cases according to each of the
metrics.
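
All three metrics reduce to simple set operations on boolean mask arrays. The
self-contained restatement below is a sketch of the formulas as described
above, not the actual ``bob.bio.vein.preprocessor.utils`` code:

```python
import numpy as np

def jaccard_index(a, b):
    """|A intersect B| / |A union B| -- 1.0 means the masks agree perfectly."""
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def intersect_ratio(ground_truth, automatic):
    """m1: fraction of the ground-truth mask covered by the automatic one."""
    return np.logical_and(ground_truth, automatic).sum() / ground_truth.sum()

def intersect_ratio_of_complement(ground_truth, automatic):
    """m2: fraction of the automatic mask lying outside the ground truth."""
    return np.logical_and(~ground_truth, automatic).sum() / automatic.sum()

# Two overlapping rectangular masks on a 4x4 grid: the ground truth covers
# rows 0-1, the automatic mask rows 1-2, so they share one row
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, :] = True
auto = np.zeros((4, 4), dtype=bool)
auto[1:3, :] = True
```

Here the two masks share 4 of 12 union pixels (Jaccard 1/3), the automatic
mask covers half of the ground truth (m1 = 0.5), and half of it lies outside
the ground truth (m2 = 0.5).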


Pipeline Display
================

You can use the program ``bob_bio_vein_view_sample.py`` to display the images
after full processing using:

.. code-block:: sh

   $ bob_bio_vein_view_sample.py --save=output-dir verafinger /path/to/processed/directory 030-M/030_L_1
   $ # open output-dir

You should then be able to view images like these (example taken from the Vera
fingervein database, using the automatic annotator and Maximum Curvature
feature extractor):

.. figure:: img/preprocessed.*
   :scale: 50%

   Example RoI overlaid on a finger vein image of the Vera fingervein database,
   as produced by the script ``bob_bio_vein_view_sample.py``.


.. figure:: img/binarized.*
   :scale: 50%

   Example of fingervein image from the Vera fingervein database, binarized by
   using Maximum Curvature, after pre-processing.


.. include:: links.rst