Databases commonly used for biometrics experiments at IDIAP.
Note: This document should be updated whenever a new database is added to bob.db. To extend the table below, insert a row for the new database. Rows of similar databases are grouped together in the table (e.g., all voice databases together, all face databases together, etc.). In the descriptions section below, the databases appear in strictly alphabetical order.
In the table below, click on the database name (or scroll down the page) to view a brief description of the database.
This face database was created by Aleix Martinez and Robert Benavente at the Computer Vision Center (CVC) at the U.A.B. It contains over 4,000 color images corresponding to 126 people's faces (70 men and 56 women). Images feature frontal view faces with different facial expressions, illumination conditions, and occlusions (sunglasses and scarf). The pictures were taken at the CVC under strictly controlled conditions. No restrictions on wear (clothes, glasses, etc.), make-up, hair style, etc. were imposed on participants. Each person participated in two sessions, separated by two weeks (14 days). The same pictures were taken in both sessions.
The database was used in the first Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015). Genuine speech is collected from 106 speakers (45 male, 61 female) with no significant channel or background noise effects. Spoofed speech is generated from the genuine data using a number of different spoofing algorithms, including variations of speech synthesis and voice conversion algorithms. The full dataset is partitioned into three subsets: the first for training, the second for development, and the third for evaluation. The database has 5 known attacks (in both the development and evaluation sets) and 5 unknown attacks (in the evaluation set only). The database does not contain any replay attacks, so all attacks are so-called logical access attacks.
The AVspoof database provides speech-based spoofing attacks to test both ASV systems and anti-spoofing algorithms. The attacks are created based on audio recordings acquired from 31 male and 13 female participants. The data acquisition process lasted approximately two months and spanned several sessions, which were configured with different environmental conditions and setups. After the data were collected, replay attacks (with iPhone 3GS, Samsung Galaxy 4, and a laptop), voice conversion attacks (also replayed with laptop and high quality speakers), and speech synthesis attacks (also replayed with laptop and high quality speakers) were generated. Therefore, the database has both so-called logical access attacks and presentation attacks.
The BANCA database is a large, realistic and challenging multi-modal database intended for training and testing multi-modal verification systems. The BANCA database was captured in four European languages in two modalities (face and voice). For recording, both high and low quality microphones and cameras were used. The subjects were recorded in three different scenarios (controlled, degraded and adverse) over 12 different sessions spanning three months. In total, data were collected for 208 people (half men and half women).
CASIA NIR VIS 2
The CASIA NIR-VIS 2.0 database offers pairs of mugshot images and their corresponding NIR photos. Captured by CASIA (Chinese Academy of Sciences), the images of this database were collected in four recording sessions: 2007 spring, 2009 summer, 2009 fall and 2010 summer, in which the first session is identical to the HFB database. The CASIA NIR-VIS 2.0 database consists of 725 subjects in total. There are 1-22 VIS and 5-50 NIR face images per subject.
Database collected at IDIAP for remote-photoplethysmography experiments.
160 video sequences of 40 subjects, together with Blood Volume Pulse (BVP) and breathing rate curves.
4 sequences are recorded for each subject: 2 with controlled illumination and 2 with "natural" light (indoor though).
The CPqD biometric database is a bi-modal (face/speaker) video database recorded from 222 people (128 males and 98 females). In total, 5 sessions were captured, and for each one 27 recordings were made with 3 different devices: laptops, smartphones and phone calls (the latter audio only).
The CUHK Face Sketch database (CUFS) is for research on face sketch synthesis and face sketch recognition. It includes 188 faces from the Chinese University of Hong Kong (CUHK) student database, 123 faces from the AR database, and 295 faces from the XM2VTS database. There are 606 faces in total. For each face, there is a sketch drawn by an artist based on a photo taken in a frontal pose, under normal lighting conditions, and with a neutral expression.
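As a quick sanity check on the CUFS figures quoted above, the per-source counts can be summed and compared with the stated total. This is only an illustrative sketch; the dictionary keys and variable names are made up for this example and do not correspond to any bob.db interface.

```python
# Face counts per source database, as quoted in the CUFS description.
# The dictionary keys are illustrative labels, not official identifiers.
cufs_sources = {
    "CUHK student database": 188,
    "AR database": 123,
    "XM2VTS database": 295,
}

# Total number of faces (and therefore of photo/sketch pairs,
# since there is one sketch per face).
cufs_total = sum(cufs_sources.values())

print(f"CUFS total faces: {cufs_total}")
```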
CUHK Face Sketch FERET Database (CUFSF) is for research on face sketch synthesis and face sketch recognition. It includes 1194 faces from the FERET database with their respective sketches (drawn by an artist based on a photo of the FERET database).
The Good, the bad, the ugly (GBU)
The Good, the Bad, and the Ugly challenge consists of three frontal still face partitions. The partitions were designed to encourage the development of face recognition algorithms that excel at matching "hard" face pairs, but not at the expense of performance on "easy" face pairs.
IARPA Janus Benchmark A
The IJB-A database is a mixture of frontal and non-frontal images and videos (provided as single frames) from 500 different identities. In many of the images and video frames, there are several people visible, but only the ones that are annotated with a bounding box should be taken into consideration. For both model enrolment as well as for probing, images and video frames of one person are combined into so-called Templates.
The database is divided into 10 splits, each defining training, enrolment and probe data.
It contains pairs of photographs and composite sketch images.
Labelled faces in the wild
Database of face photographs designed for studying the problem of unconstrained face recognition. The data set contains more than 13,000 images of faces collected from the web. Each face has been labelled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set.
The Long Distance Heterogeneous Face Database (LDHF-DB) is for research on VIS-NIR face recognition. It includes face images of 100 identities captured in both VIS and NIR (at nighttime) at different standoffs: 1m, 60m, 100m and 150m.
Mahnob HCI Tagging
Several video sequences of 29 subjects, totaling 3490 sequences.
There's a lot of data recorded, but we only use color video sequences, and 3 recorded EKG signals.
MegaFace: 1 Million Faces for Recognition at Scale. It contains 690,572 unique people.
The MOBIO database is a bi-modal (face/speaker) video database recorded from 152 people. The database has a female-male ratio of nearly 1:2 (100 males and 52 females) and was collected from August 2008 until July 2010 at 6 different sites in 5 different countries. In total, 12 sessions were captured for each individual.
Database created at MSU for face-PAD experiments. The public version of the database contains 280 videos corresponding to 35 clients. The videos are grouped as 'genuine' and 'attack'. The attack videos have been constructed from the genuine ones, and consist of three kinds: print, iPad (video-replay), and iPhone (video-replay). Face-locations are also provided for each frame of each video, but the face-locations for 6 videos are not reliable, because those videos are not correctly oriented.
It contains pairs of photographs and composite sketch images.
The CMU Multi-PIE face database contains more than 750,000 images of 337 people recorded in up to four sessions over the span of five months. Subjects were imaged under 15 view points and 19 illumination conditions while displaying a range of facial expressions. In addition, high resolution frontal images were acquired as well. In total, the database contains more than 305 GB of face data.
NIVL is for research on VIS-NIR face recognition. It includes face images of 574 identities captured in both VIS and NIR.
Print-attack database with 15 subjects, for face-PAD experiments.
Includes 9,376 still images and 2,802 videos of 293 people. The images are balanced with respect to distance to the camera, alternative sensors, frontal versus not-frontal views, and different locations. Verification results are presented for public baseline algorithms and a commercial algorithm for three cases: comparing still images to still images, videos to videos, and still images to videos.
The PUT Vein pattern database consists of 2400 images presenting human vein patterns. Half of the images (1200 images) contain a palm vein pattern and the remaining images (another 1200) contain a wrist vein pattern. Data were acquired from both hands of 50 students, giving 100 different patterns each for the palm and wrist regions. Pictures were taken in 3 series of 4 pictures each, with an interval of at least one week between series. Images in the database have a resolution of 1280x960 and are stored as 24-bit bitmaps. The database consists of 2 main splits, hand and wrist, allowing both modalities to be investigated.
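The image counts in the PUT Vein description follow directly from the acquisition setup. The short sketch below reproduces that arithmetic using only the figures quoted above; the variable names are chosen for this illustration and are not part of any bob.db package.

```python
# Acquisition parameters quoted in the PUT Vein description.
students = 50
hands_per_student = 2          # both hands of each student
series = 3                     # acquisition series per hand
pictures_per_series = 4
regions = ("palm", "wrist")    # two vein regions imaged per hand

# One pattern per hand, for each region.
patterns_per_region = students * hands_per_student

# Every pattern is photographed in each series.
images_per_pattern = series * pictures_per_series
images_per_region = patterns_per_region * images_per_pattern
total_images = images_per_region * len(regions)

print(f"patterns per region: {patterns_per_region}")
print(f"images per region:   {images_per_region}")
print(f"total images:        {total_images}")
```

Under these figures each pattern is represented by 12 images, which is what yields 1200 images per region.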
Database collected at IDIAP specifically for face-PAD (presentation-attack detection) experiments; it also provides a protocol for face-verification. It contains 1300 videos corresponding to 50 clients. Videos for each client have been captured under two different illumination-conditions. The verification-protocol consists of 100 videos in the group 'enroll'. The remaining 1200 videos are used for PAD, divided into three groups: training, development, and test. Each group contains both 'real' and 'attack' videos. The attack videos have been constructed from the real videos. Three kinds of attacks are considered: print, iPad (1st gen.), and iPhone (3GS). The database also provides nominal face bounding-boxes for every frame of every video.
Database collected at IDIAP in Nov. 2015 for experiments on face-PAD in mobile environments. Face-videos have been collected for each subject under diverse illumination conditions on two mobile devices: a mobile phone (LG G4, front camera) and an iPad. 40 subjects are represented in the database, which consists of 1030 video-files (including genuine and attack videos). The database-package provides a face-verification protocol as well as a face-PAD protocol. Two kinds of attacks are present in the database: print and replay-video.
Data used to train the VGG Face Model from the paper "Deep Face Recognition".
Voxforge offers a collection of transcribed speech for use with Free and Open Source Speech Recognition Engines. This database contains only a subset of the English audio files (6561 files), belonging to 30 randomly selected speakers.
Wine is a dataset containing 13-feature descriptors for 3 kinds of wines. The features are described in http://archive.ics.uci.edu/ml/datasets/Wine. The dataset is designed mainly for comparing classifiers. At IDIAP it is used for testing continuous integration of some Bob packages.
The XM2VTSDB contains four recordings of 295 subjects collected over a period of four months. Each recording contains a speaking head shot and a rotating head shot. Sets of data taken from this database are available, including high quality colour images, 32 kHz 16-bit sound files, video sequences and a 3D model.