.. vim: set fileencoding=utf-8 :
.. Tiago de Freitas Pereira <tiago.pereira@idiap.ch>


=============================
 Heterogeneous Face Databases
=============================


.. _db-CUHK-CUFS:

CUHK Face Sketch Database (CUFS)
--------------------------------


CUHK Face Sketch database (`CUFS <http://mmlab.ie.cuhk.edu.hk/archive/facesketch.html>`_) is composed of viewed sketches.
It includes 188 faces from the Chinese University of Hong Kong (CUHK) student database, 123 faces from the `AR database <http://www2.ece.ohio-state.edu/~aleix/ARdatabase.html>`_ and 295 faces from the `XM2VTS database <http://www.ee.surrey.ac.uk/CVSSP/xm2vtsdb/>`_.

There are 606 face images in total. 
For each face image, there is a sketch drawn by an artist based on a photo taken in a frontal pose, under normal lighting conditions and with a neutral expression.

There is no evaluation protocol established for this database.
Each work that uses this database implements a different way to report the results.
In [Wang2009]_ the 606 identities were split into three sets (153 identities for training, 153 for development and 300 for evaluation).
The rank-1 identification rate in the evaluation set is used as performance measure.
Unfortunately the file names for each set were not distributed.

In [Klare2013]_ the authors created a protocol based on a 5-fold cross-validation, splitting the 606 identities into two sets with 404 identities for training and 202 for testing.
The average rank-1 identification rate is used as the performance measure.
In [Bhatt2012]_, the authors evaluated the error rates using only the pairs (VIS -- Sketch) corresponding to the CUHK Student Database and the AR Face Database; in [Bhatt2010]_ the authors used only the pairs corresponding to the CUHK Student Database.
In [Yi2015]_ the authors created a protocol based on a 10-fold cross-validation, splitting the 606 identities into two sets with 306 identities for training and 300 for testing.
The average rank-1 identification error rate on the test set is likewise used to report the results.
Finally, in [Roy2016]_, since the method does not require a background model, all 606 identities were used for evaluation and also to tune the hyper-parameters, which is not good practice in machine learning.
From what is written in the paper alone (no source code is available), we can only conclude that the evaluation is biased.
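
The rank-1 identification rate used throughout these works is simply the fraction of probe images whose best-matching gallery entry has the correct identity. A minimal sketch of how it can be computed from a similarity score matrix (the scores and labels below are purely illustrative, not taken from any of the cited works or packages):

```python
import numpy as np

def rank1_rate(scores, probe_ids, gallery_ids):
    """Fraction of probes whose highest-scoring gallery entry
    shares the probe's identity (higher score = more similar).

    scores: (n_probes, n_gallery) similarity matrix.
    """
    best = np.asarray(gallery_ids)[np.argmax(scores, axis=1)]
    return float(np.mean(best == np.asarray(probe_ids)))

# Toy 3x3 similarity matrix: probe i scored against gallery identity j.
scores = np.array([[0.9, 0.1, 0.2],
                   [0.3, 0.8, 0.1],
                   [0.2, 0.4, 0.6]])
print(rank1_rate(scores, probe_ids=[0, 1, 2], gallery_ids=[0, 1, 2]))  # 1.0
```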

For comparison purposes, we follow the same strategy as in [Klare2013]_: a 5-fold cross-validation splitting the 606 identities into two sets, with 404 identities for training and 202 for testing, using the average rank-1 identification rate on the evaluation set as the metric.
For reproducibility purposes, this evaluation protocol is published as a Python package (`bob.db.cuhk_cufs <https://pypi.python.org/pypi/bob.db.cuhk_cufs>`_).
In this way, future researchers will be able to reproduce exactly the same tests, with the same identities in each fold (which is not possible today).
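
The shape of such a protocol can be sketched in a few lines of NumPy. This is only an illustrative sketch: the seed, and hence the folds, are arbitrary and do not correspond to the fold definitions shipped in the package above.

```python
import numpy as np

N_IDENTITIES, N_TRAIN, N_FOLDS = 606, 404, 5
rng = np.random.RandomState(0)  # fixed seed so the splits are reproducible

# Each "fold" is an independent random 404/202 partition of the identities,
# matching the split sizes described in the text.
folds = []
for _ in range(N_FOLDS):
    perm = rng.permutation(N_IDENTITIES)
    folds.append({"train": perm[:N_TRAIN], "test": perm[N_TRAIN:]})

for i, fold in enumerate(folds):
    print("fold %d: %d train / %d test" % (i, len(fold["train"]), len(fold["test"])))
```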


CASIA NIR-VIS 2.0 face database
-------------------------------

CASIA NIR-VIS 2.0 database [Li2013]_ offers pairs of mugshot images and their corresponding NIR photos.
The images of this database were collected in four recording sessions: 2007 spring, 2009 summer, 2009 fall and 2010 summer, in which the first session is identical to the CASIA HFB database [Li2009]_. 
It consists of 725 subjects in total. 
There are 1 to 22 VIS and 5 to 50 NIR face images per subject.
The eye positions are also distributed with the images.

This database has a well-defined protocol and is publicly available for `download <http://www.cbsr.ia.ac.cn/english/NIR-VIS-2.0-Database.html>`_.
We also organized this protocol in the same way as for the CUFS database; it is likewise freely available for download (`bob.db.cbsr_nir_vis_2 <https://pypi.python.org/pypi/bob.db.cbsr_nir_vis_2>`_).
The average rank-1 identification rate in the evaluation set (called view 2) is used as an evaluation metric.


.. _db-polathermal:

Pola Thermal
------------

Collected by the U.S. Army Research Laboratory (ARL), the Polarimetric Thermal Face Database (the first of its kind) contains polarimetric LWIR (longwave infrared) imagery and simultaneously acquired visible-spectrum imagery from a set of 60 distinct subjects.

For the data collection, each subject was asked to sit in a chair and remove his or her glasses. 
A floor lamp with a compact fluorescent light bulb rated at 1550 lumens was placed 2 m in front of the chair to illuminate the scene for the visible cameras, and a uniform background was placed approximately 0.1 m behind the chair.
Data was collected at three distances: Range 1 (2.5 m), Range 2 (5 m), and Range 3 (7.5 m).
At each range, a baseline condition was first acquired, in which the subject was asked to maintain a neutral expression while looking at the polarimetric thermal imager.
A second condition, referred to as the "expressions" condition, was collected, in which the subject was asked to count out loud numerically from one upwards.
Counting orally results in a continuous range of motions of the mouth, and to some extent, the eyes, which can be recorded to produce variations in the facial imagery.
For each acquisition, 500 frames are recorded with the polarimeter (a duration of 8.33 s at 60 fps), while 300 frames are recorded with each visible-spectrum camera (a duration of 10 s at 30 fps).
For reproducibility purposes, the evaluation protocols of this database are freely available as a Python package (`bob.db.pola_thermal <https://pypi.python.org/pypi/bob.db.pola_thermal>`_).


.. _db-nivl:

Near-Infrared and Visible-Light (NIVL) Dataset
----------------------------------------------


Collected by the University of Notre Dame, the NIVL dataset contains VIS and NIR face images from the same subjects.
The capture process was carried out over the course of two semesters (fall 2011 and spring 2012).
The VIS images were collected using a Nikon D90 camera.
The Nikon D90 uses a :math:`23.6 \times 15.8` mm CMOS sensor and the resulting images have a :math:`4288 \times 2848` resolution.
The images were acquired using automatic exposure and automatic focus settings.
All images were acquired under normal indoor lighting at about a 5-foot standoff with frontal pose and a neutral facial expression.

The NIR images were acquired using a Honeywell CFAIRS system.
CFAIRS uses a modified Canon EOS 50D camera with a :math:`22.3 \times 14.9` mm CMOS sensor.
The resulting images have a resolution of :math:`4770 \times 3177`.
All images were acquired under normal indoor lighting with frontal pose and neutral facial expression.
NIR images were acquired at both a 5 ft and a 7 ft standoff.

The dataset contains a total of 574 subjects.
There are a total of 2,341 VIS images and 22,264 NIR images from the 574 subjects.
A total of 402 subjects had both VIS and NIR images acquired during at least one session during both the fall and spring semesters.
Both VIS and NIR images were acquired in the same session, although not simultaneously.
For reproducibility purposes, the evaluation protocols of this database are freely available as a Python package (`bob.db.nivl <https://pypi.python.org/pypi/bob.db.nivl>`_).