.. vim: set fileencoding=utf-8 :
.. date: Thu Sep 20 11:58:57 CEST 2012

.. _bob.bio.vein.baselines:

===============================
 Executing Baseline Algorithms
===============================

The first thing you might want to do is to execute one of the vein
recognition algorithms that are implemented in ``bob.bio.vein``.


Running Baseline Experiments
----------------------------

To run the baseline experiments, you can use the ``verify.py`` script by
going to the console and typing:

.. code-block:: sh

   $ verify.py


This script is explained in more detail in :ref:`bob.bio.base.experiments`.
Running ``verify.py --help`` shows which other options you can set.

Usually it is a good idea to use at least verbosity level 2 (i.e., calling
``verify.py --verbose --verbose``, or the short version ``verify.py -vv``).

.. note:: **Running in Parallel**

   To run the experiments in parallel, you can define an SGE grid or local host
   (multi-processing) configurations as explained in
   :ref:`running_in_parallel`.

   In short, to run in the Idiap SGE grid, you can simply add the ``--grid``
   command line option, without parameters. To run experiments in parallel on
   the local machine, simply add a ``--parallel <N>`` option, where ``<N>``
   specifies the number of parallel jobs you want to execute.


Database setups and baselines are encoded using
:ref:`bob.bio.base.configuration-files`, all stored inside the package root, in
the directory ``bob/bio/vein/configurations``. Documentation for each resource
is available in the section :ref:`bob.bio.vein.resources`.

.. warning::

   You **cannot** run experiments just by executing the command line
   instructions described in this guide. You **first need** to obtain the raw
   data files that correspond to *each* database used here in order to
   correctly run experiments with those data. Biometric data is considered
   private data and, under EU regulations, cannot be distributed without
   consent or a license. You may consult our
   :ref:`bob.bio.vein.resources.databases` resources section to check the
   currently supported databases and to access download links for the raw data
   files.

   Once the raw data files have been downloaded, pay particular attention to
   their directory locations. Unpack the databases carefully and note the root
   directory where each one has been unpacked.

   Then, carefully read the *Databases* section of
   :ref:`bob.bio.base.installation` on how to correctly set up the
   ``~/.bob_bio_databases.txt`` file.

   Use the following keywords on the left side of the assignment (see
   :ref:`bob.bio.vein.resources.databases`):

   .. code-block:: text

      [YOUR_VERAFINGER_DIRECTORY] = /complete/path/to/verafinger
      [YOUR_UTFVP_DIRECTORY] = /complete/path/to/utfvp
      [YOUR_FV3D_DIRECTORY] = /complete/path/to/fv3d

   Note that it is important to use the keys exactly as written above,
   otherwise ``bob.bio.base`` will not be able to correctly load your images.

   Once this step is done, you can proceed with the instructions below.
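
As a quick sanity check before running anything, the sketch below parses text
in the ``[KEY] = /path`` layout shown above and lists the entries whose
directory does not exist. The function names ``parse_database_dirs`` and
``missing_database_dirs``, and the parsing rules (one assignment per line,
lines starting with ``#`` ignored), are assumptions made for illustration.
They are not part of ``bob.bio.base``.

.. code-block:: python

   import os


   def parse_database_dirs(text):
       """Parses ``[KEY] = /some/path`` lines into a dictionary."""
       entries = {}
       for line in text.splitlines():
           line = line.strip()
           if not line or line.startswith("#") or "=" not in line:
               continue  # skip blank lines, comments and malformed lines
           key, value = line.split("=", 1)
           entries[key.strip()] = value.strip()
       return entries


   def missing_database_dirs(entries):
       """Returns, sorted, the keys whose value is not an existing directory."""
       return sorted(k for k, v in entries.items() if not os.path.isdir(v))


   if __name__ == "__main__":
       path = os.path.expanduser("~/.bob_bio_databases.txt")
       if os.path.exists(path):
           with open(path) as f:
               entries = parse_database_dirs(f.read())
           for key in missing_database_dirs(entries):
               print("directory not found for", key)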


In the remainder of this section we introduce baseline experiments you can
readily run with this tool without further configuration. Baselines exemplified
in this guide were published in [TVM14]_.


Repeated Line-Tracking with Miura Matching
==========================================

Detailed description at :ref:`bob.bio.vein.resources.recognition.rlt`.

To run the baseline on the `VERA fingervein`_ database, using the ``Nom``
protocol, do the following:


.. code-block:: sh

   $ verify.py verafinger rlt -vv


.. tip::

   If you have several processing cores on your local machine and don't want
   to submit your job for SGE execution, you can run it in parallel (using 4
   parallel tasks) by adding the options ``--parallel=4 --nice=10``.

   Optionally, you may use the ``parallel`` resource configuration which
   already sets the number of parallel jobs to the number of hardware cores you
   have installed on your machine (as with
   :py:func:`multiprocessing.cpu_count`) and sets ``nice=10``. For example:

   .. code-block:: sh

      $ verify.py verafinger rlt parallel -vv


This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.rlt`

As the tool runs, you'll see printouts that show how it advances through
preprocessing, feature extraction and matching. On a 4-core machine and using
4 parallel tasks, it takes around 4 hours to process this baseline with the
current code implementation.

To complete the evaluation, run the command below, which outputs the equal
error rate (EER) and plots the detection error trade-off (DET) curve with the
performance:

.. code-block:: sh

   $ bob_eval_threshold.py <path-to>/verafinger/rlt/Nom/nonorm/scores-dev
   ('Threshold:', 0.32045327)
   FAR : 26.362% (12701/48180)
   FRR : 26.364% (58/220)
   HTER: 26.363%
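
The figures printed above are related by simple arithmetic: the false
acceptance rate (FAR) is the fraction of impostor comparisons accepted at the
threshold, the false rejection rate (FRR) is the fraction of genuine
comparisons rejected, and the half total error rate (HTER) is the average of
the two. The sketch below is a minimal illustration in Python 3, assuming that
a higher score means a better match and that a comparison is accepted when its
score is at or above the threshold. The function names are hypothetical and do
not belong to Bob.

.. code-block:: python

   def far_frr(impostor_scores, genuine_scores, threshold):
       """Returns (FAR, FRR) as fractions for the given decision threshold."""
       false_accepts = sum(1 for s in impostor_scores if s >= threshold)
       false_rejects = sum(1 for s in genuine_scores if s < threshold)
       return (false_accepts / len(impostor_scores),
               false_rejects / len(genuine_scores))


   def hter(far, frr):
       """Half total error rate: the mean of FAR and FRR."""
       return (far + frr) / 2.0


   # Counts copied from the printout above (12701/48180 and 58/220)
   far = 12701 / 48180
   frr = 58 / 220
   print("FAR : %.3f%%" % (100 * far))
   print("FRR : %.3f%%" % (100 * frr))
   print("HTER: %.3f%%" % (100 * hter(far, frr)))

At the equal error rate operating point the threshold is chosen so that FAR
and FRR are as close as possible, which is why the two values above nearly
coincide.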


Maximum Curvature with Miura Matching
=====================================

Detailed description at :ref:`bob.bio.vein.resources.recognition.mc`.

To run the baseline on the `VERA fingervein`_ database, using the ``Nom``
protocol like above, do the following:


.. code-block:: sh

   $ verify.py verafinger mc -vv


This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.mc`

On a 4-core machine and using 4 parallel tasks, it takes around 1 hour and 40
minutes to process this baseline with the current code implementation. Results
we obtained:

.. code-block:: sh

   $ bob_eval_threshold.py <path-to>/verafinger/mc/Nom/nonorm/scores-dev
   ('Threshold:', 0.078274325)
   FAR : 3.182% (1533/48180)
   FRR : 3.182% (7/220)
   HTER: 3.182%


Wide Line Detector with Miura Matching
======================================

You can find the description of this method in the paper by Huang *et al.*
[HDLTL10]_.

To run the baseline on the `VERA fingervein`_ database, using the ``NOM``
protocol like above, do the following:


.. code-block:: sh

   $ verify.py verafinger wld -vv


This command line selects and runs the following implementations for the
toolchain:

* :ref:`bob.bio.vein.resources.database.verafinger`
* :ref:`bob.bio.vein.resources.recognition.wld`

On a 4-core machine and using 4 parallel tasks, it takes only around 5
minutes to process this baseline with the current code implementation. Results
we obtained:

.. code-block:: sh

   $ bob_eval_threshold.py <path-to>/verafinger/wld/NOM/nonorm/scores-dev
   ('Threshold:', 0.239141175)
   FAR : 10.455% (5037/48180)
   FRR : 10.455% (23/220)
   HTER: 10.455%


Results for other Baselines
===========================

This package may generate results for other combinations of protocols and
databases. Here is a summary table for some variants (results correspond to
the equal-error rate on the development set, in percent):

======================== ================= ====== ====== ====== ====== ======
               Approach                     Vera Finger             UTFVP
------------------------------------------ -------------------- -------------
   Feature Extractor      Post Processing   Full     B    Nom   1vsall  nom
======================== ================= ====== ====== ====== ====== ======
Repeated Line Tracking        None          23.9   24.1   24.9   1.7    1.4
Repeated Line Tracking     Histogram Eq.    26.2   23.6   24.9   2.1    0.9
Maximum Curvature             None           3.2    3.2    3.1   0.4    0.
Maximum Curvature          Histogram Eq.     3.0    2.7    2.7   0.4    0.
Wide Line Detector            None          10.2   10.2   10.5   2.3    1.7
Wide Line Detector         Histogram Eq.     8.0    9.7    7.3   1.7    0.9
======================== ================= ====== ====== ====== ====== ======

On a machine with 48 cores, running these baselines took the following time
(hh:mm):

======================== ================= ====== ====== ====== ====== ======
               Approach                     Vera Finger             UTFVP
------------------------------------------ -------------------- -------------
   Feature Extractor      Post Processing   Full     B    Nom   1vsall  nom
======================== ================= ====== ====== ====== ====== ======
Repeated Line Tracking        None          01:16  00:23  00:23  12:44  00:35
Repeated Line Tracking     Histogram Eq.    00:50  00:23  00:23  13:00  00:35
Maximum Curvature             None          03:28  00:54  00:59  58:34  01:48
Maximum Curvature          Histogram Eq.    02:45  00:54  00:59  49:03  01:49
Wide Line Detector            None          00:07  00:01  00:01  02:25  00:05
Wide Line Detector         Histogram Eq.    00:04  00:01  00:01  02:04  00:06
======================== ================= ====== ====== ====== ====== ======


Modifying Baseline Experiments
------------------------------

It is fairly easy to modify baseline experiments available in this package. To
do so, you must copy the configuration files for the given baseline you want to
modify, edit them to make the desired changes and run the experiment again.

For example, suppose you'd like to change the protocol on the Vera Fingervein
database and use the protocol ``full`` instead of the default protocol ``nom``.
First, you identify where the configuration file sits:

.. code-block:: sh

   $ resources.py -tc -p bob.bio.vein
   - bob.bio.vein X.Y.Z @ /path/to/bob.bio.vein:
     + mc         --> bob.bio.vein.configurations.maximum_curvature
     + parallel   --> bob.bio.vein.configurations.parallel
     + rlt        --> bob.bio.vein.configurations.repeated_line_tracking
     + utfvp      --> bob.bio.vein.configurations.utfvp
     + verafinger --> bob.bio.vein.configurations.verafinger
     + wld        --> bob.bio.vein.configurations.wide_line_detector


The listing above shows that the ``verafinger`` configuration lives in the
file ``/path/to/bob.bio.vein/bob/bio/vein/configurations/verafinger.py``. In
order to modify it, make a local copy. For example:

.. code-block:: sh

   $ cp /path/to/bob.bio.vein/bob/bio/vein/configurations/verafinger.py verafinger_full.py
   $ # edit verafinger_full.py, change the value of "protocol" to "full"


Also, don't forget to change all relative module imports (such as ``from
..database.verafinger import Database``) to absolute imports (e.g. ``from
bob.bio.vein.database.verafinger import Database``). This makes the
configuration file work irrespective of its location relative to
``bob.bio.vein``. The final version of the modified file could look like this:

.. code-block:: python

   from bob.bio.vein.database.verafinger import Database

   database = Database(
       original_directory='/where/you/have/the/raw/files',
       original_extension='.png',  # don't change this
   )

   protocol = 'full'
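
If you keep several local copies, the relative-to-absolute import rewrite
described above can be scripted. The following sketch is only an illustration:
the helper name ``absolutize_imports`` is hypothetical, and it assumes that the
copied file originally lived in the package ``bob.bio.vein.configurations`` and
that relative imports are written on a single line in the ``from ... import``
form.

.. code-block:: python

   import re

   # Matches "from .x import", "from ..x.y import" and "from . import"
   _RELATIVE_IMPORT = re.compile(
       r"^([ \t]*from[ \t]+)(\.+)([\w.]*)([ \t]+import\b)", re.MULTILINE)


   def absolutize_imports(source, package="bob.bio.vein.configurations"):
       """Rewrites relative imports in ``source`` as absolute imports."""
       parts = package.split(".")

       def _replace(match):
           prefix, dots, tail, suffix = match.groups()
           # One dot is the current package; each extra dot climbs one level
           levels_up = len(dots) - 1
           if levels_up >= len(parts):
               raise ValueError("relative import beyond top-level package")
           base = parts[:len(parts) - levels_up]
           module = ".".join(base + ([tail] if tail else []))
           return prefix + module + suffix

       return _RELATIVE_IMPORT.sub(_replace, source)


   print(absolutize_imports("from ..database.verafinger import Database"))

Applied to the import line quoted above, the helper produces the absolute form
``from bob.bio.vein.database.verafinger import Database``.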


Now, re-run the experiment using your modified database descriptor:

.. code-block:: sh

   $ verify.py ./verafinger_full.py wld -vv


Notice that the local file ``verafinger_full.py`` replaces the registered
configuration named ``verafinger``, so ``verify.py`` reads the database setup
from your copy instead of the original file.


Other Resources
---------------

This package contains other resources that can be used to evaluate different
bits of the vein processing toolchain.


Region of Interest Goodness of Fit
==================================

Automatic region of interest (RoI) finding and cropping can be evaluated using
a couple of scripts available in this package. The program ``compare_rois.py``
compares two sets of ``preprocessed`` images and masks, generated by
*different* preprocessors (see
:py:class:`bob.bio.base.preprocessor.Preprocessor`) and calculates a few
metrics to help you determine how both techniques compare. Normally, the
program is used to compare the result of automatic RoI to manually annotated
regions on the same images. To use it, just point it to the outputs of two
experiments representing the manually annotated regions and automatically
extracted ones. E.g.:

.. code-block:: sh

   $ compare_rois.py ~/verafinger/mc_annot/preprocessed ~/verafinger/mc/preprocessed
   Jaccard index: 9.60e-01 +- 5.98e-02
   Intersection ratio (m1): 9.79e-01 +- 5.81e-02
   Intersection ratio of complement (m2): 1.96e-02 +- 1.53e-02


Values printed by the script correspond to the `Jaccard index`_
(:py:func:`bob.bio.vein.preprocessor.utils.jaccard_index`), as well as the
intersection ratio between the manual and automatically generated masks
(:py:func:`bob.bio.vein.preprocessor.utils.intersect_ratio`) and the ratio to
the complement of the intersection with respect to the automatically generated
mask
(:py:func:`bob.bio.vein.preprocessor.utils.intersect_ratio_of_complement`). You
can use the option ``-n 5`` to print the 5 worst cases according to each of the
metrics.
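
To make the three metrics concrete, here is a minimal pure-Python sketch that
works on flattened boolean masks, where ``a`` is the manual (reference) mask
and ``b`` the automatically generated one. The function names are
hypothetical, and the normalisation by the size of ``a`` in the last two
functions is an assumption made for illustration. The functions in
``bob.bio.vein.preprocessor.utils`` referenced above remain the authoritative
definitions.

.. code-block:: python

   def _count(values):
       """Counts the truthy entries of an iterable."""
       return sum(1 for v in values if v)


   def jaccard(a, b):
       """|A and B| / |A or B|: 1.0 means the masks are identical."""
       intersection = _count(x and y for x, y in zip(a, b))
       union = _count(x or y for x, y in zip(a, b))
       return intersection / union


   def intersection_ratio(a, b):
       """|A and B| / |A|: fraction of the reference covered by ``b``."""
       return _count(x and y for x, y in zip(a, b)) / _count(a)


   def complement_intersection_ratio(a, b):
       """|(not A) and B| / |A|: area of ``b`` outside the reference."""
       return _count((not x) and y for x, y in zip(a, b)) / _count(a)


   reference = [True, True, True, False]
   automatic = [True, True, False, True]
   print(jaccard(reference, automatic))
   print(intersection_ratio(reference, automatic))
   print(complement_intersection_ratio(reference, automatic))

Under these definitions, a good automatic RoI gives a Jaccard index and an
intersection ratio close to 1 and a complement ratio close to 0, which matches
the pattern of the values printed by ``compare_rois.py`` above.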

You can use the program ``view_sample.py`` to display the images after
full processing using:

.. code-block:: sh

   $ ./bin/view_sample.py /path/to/verafinger/database /path/to/bob-bio-vein/output/for/toolchain 030-M/030_L_1 -s output-directory
   $ # open output-directory

And you should be able to view images like these (example taken from the Vera
fingervein database, using the automatic annotator and Maximum Curvature
feature extractor):

.. figure:: img/preprocessed.*
   :scale: 50%

   Example RoI overlaid on a finger vein image of the Vera fingervein database,
   as produced by the script ``view_sample.py``.


.. figure:: img/binarized.*
   :scale: 50%

   Example of fingervein image from the Vera fingervein database, binarized by
   using Maximum Curvature, after pre-processing.


.. include:: links.rst