Modified to accommodate instructions for creating new databases, authored by André Anjos

# Creating a new BEAT Database

BEAT Databases are simply accessor plugins that allow the data belonging to a database to be read and fed into a BEAT Toolchain in the context of an Experiment. The accessors are called *views* in the BEAT jargon. Here is a UML diagram that explains the relationship of all components of a database successfully integrated into the platform:

![database-uml](https://gitlab.idiap.ch/uploads/biometric/beat.databases/0354a72407/database-uml.png)

## Components

A BEAT database is composed of three main parts:

1. A JSON file that declares the database protocols, its root folder location, sets, templates and outputs in terms of existing Dataformats;
2. A reStructuredText document that thoroughly describes the database;
3. A set of Python scripts that define the views (i.e., how to access the data), for each combination of protocol, set and output.

## Procedure

To create a new BEAT database, you must be able to test the JSON descriptor (i.e., the arrangement of protocols, views, sets and outputs) together with your Python views inside, potentially, a real toolchain that uses the database. Doing so may be a complicated procedure. This wiki page tries to simplify this process by providing instructions on how to build a new database from scratch.

### Step 1: Create a Bob database satellite package and a simple toolchain

The very first step is to make sure you can *use* the database locally, via a Bob database satellite package. A simple toolchain can be used to test if the Bob database satellite package is working as expected. Instructions on how to create a Bob database satellite package are [available here](https://pythonhosted.org/bob.extension/guide.html). You'll find numerous examples of database packages from that link.

### Step 2: Install a development platform on your machine

Once your Python code is ready, the next thing to do is to install a development platform on your machine. It will allow you to input the JSON description and insert the views at the right location for development. To do so, follow the [instructions for development setup at our beat.web package](https://gitlab.idiap.ch/biometric/beat.web/tree/master). Follow the procedure to the end, including the (optional) installation of existing contributions.

### Step 3: Add your database JSON and Views to beat.databases

Inside `beat.web`, you'll find a checkout of `beat.databases` in `src/beat.databases`. Go ahead and create the view files and JSON descriptors matching the existing structure. JSON descriptors are inserted into `<beat.databases-root>/beat/databases/prefix/advanced/databases`. Views are inserted into `<beat.databases-root>/beat/databases/<database-name>`. Add your database satellite package to the `setup.py` of `beat.databases`.

> Note: If your Bob satellite package for the database is **not** available on PyPI, you must also add a
> checkout/mr.developer line on `beat.web/buildout.cfg` so that the system can properly download and install it
> from its repository.

Re-run `./bin/buildout` at the `beat.web` level to make sure the new satellite package is installed and all is good for your first test.

To install the database into your development server, run `./bin/install_contributions -v advanced`. If all is correct, the program should just report the insertion of the database into the system. Fix any reported errors until the database can be inserted correctly into the platform.

### Step 4: Create a toolchain and experiment

With the database inserted into the system, launch `beat.web` this way:

```sh
$ ./bin/django runserver -v3
```

Then open your browser and point it to `http://127.0.0.1:8000`. Log in as user `user` with password `user`, then create a new toolchain using your database template.

Once a toolchain that can use your database is in place, add the algorithms for such a toolchain and, finally, create an experiment mixing the database protocol you want to test and the algorithms. Do **not** push "Go", but just "Queue" the experiment for the time being.

### Step 5: Run the experiment locally for interactive debugging

Once all is in place, you can quickly modify the views and algorithms so that the database works. The workflow is:

1. Test run: `./bin/beat --prefix=web_dynamic_data experiments run <user>/<toolchain>/<version>/<experiment-label>`
2. Modify the view/algorithms
3. Repeat from 1 until all works

### Step 6: Commit to `beat.databases`

Once you are happy with all you did, you may commit the changes into the `beat.databases` package. You should, *at least*, include the database JSON, a reStructuredText description and the Python views, but it is also recommended you include a baseline toolchain, experiment and algorithms so that others can test your setup during development.

> Note: If you have a database which is already in production at the BEAT platform, make sure that, when
> you modify your database, you create a new version.

# Available Databases

Until further notice, these are the Databases currently installed on the [BEAT platform at Idiap](https://beat-eu.org/platform/):

## Simple Face Recognition Databases

* `atnt`, protocol `idiap`:
+ Notes:
- Download link: [AT&T faces database](http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html)
- No identity intermixing between `train` and `dev` (`templates`/`probes`)
- Only training and validation (*no evaluation* set)
+ Set `train` (200 images, 20 identities)
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `image`: [beat/image_grayscale](https://www.beat-eu.org/platform/dataformats/beat/image_grayscale/), image in gray-scale
+ Set `templates` (100 images, 20 identities):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_grayscale](https://www.beat-eu.org/platform/dataformats/beat/image_grayscale/), image in gray-scale
+ Set `probes` (100 images, 20 identities - same identities as in `templates`, different samples):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_grayscale](https://www.beat-eu.org/platform/dataformats/beat/image_grayscale/), image in gray-scale

## Advanced Face Recognition Databases

* `atnt`, protocol `idiap_test_eyepos`:
+ Notes:
- Download link: [AT&T faces database](http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html)
- No identity intermixing between `train` and `dev_*`/`test_*`. `dev_*` and `test_*` are identical
- Training, development and evaluation sets (`dev` and `test` sets are identical)
- Use this protocol to quickly (or locally) evaluate algorithms for deployment with large databases on the platform
+ Set `train` (200 images, 20 identities)
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in pseudo-RGB (all 3 planes are the same)
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_templates` (100 images, 20 identities):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in pseudo-RGB (all 3 planes are the same)
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_probes` (100 images, 20 identities - same identities as in `dev_templates`, different samples):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in pseudo-RGB (all 3 planes are the same)
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_templates` (100 images, 20 identities, same as `dev_templates`):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in pseudo-RGB (all 3 planes are the same)
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_probes` (100 images, 20 identities - same identities as in `test_templates`, different samples):
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in pseudo-RGB (all 3 planes are the same)
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
* `banca`, protocols `P`, `G`, `Mc`, `Md`, `Ma`, `Ud`, `Ua`:
+ Notes:
- Download link and references: [BANCA database](http://www.ee.surrey.ac.uk/CVSSP/banca/) (English set)
- No identity intermixing between `train` and `dev_*` and `test_*`
- Distribution of data:
| Protocol | Training Samples | Development Probes | Development Templates | Evaluation Probes | Evaluation Templates |
|----------|------------------|--------------------|-----------------------|-------------------|----------------------|
| P | 300 | 2730 (26 x 105) | 26 (x5 = 130) | 2730 (26 x 105) | 26 (x5 = 130) |
| Mc | 300 | 910 (26 x 35) | 26 (x5 = 130) | 910 (26 x 35) | 26 (x5 = 130) |
+ Set `train`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
* `xm2vts`, protocols `lp1`, `lp2`, `darkened-lp1`, `darkened-lp2`:
+ Notes:
- Download link and references: [XM2VTS database](http://www.ee.surrey.ac.uk/CVSSP/xm2vtsdb/)
- Sets `train`, `dev_*` and `test_*`: **same clients** on all sets.
+ Set `train`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
* `mobio`, protocols `male`, `female`:
+ Notes:
- Download link and references: [MOBIO database](http://www.idiap.ch/dataset/mobio)
- No identity intermixing between `train` and `dev_*` and `test_*`
+ Set `train`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `dev_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_templates`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `template_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), template identity, unique for each template
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
+ Set `test_probes`:
- `file_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), sample identity, unique for each sample
- `client_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), client identity, unique for each client
- `probe_id`: [beat/uint64](https://www.beat-eu.org/platform/dataformats/beat/uint64/), probe identity, unique for each probe
- `template_ids`: [beat/array_1d_uint64](https://www.beat-eu.org/platform/dataformats/beat/array_1d_uint64/), list of client (model) identities to compare this sample against
- `image`: [beat/image_rgb](https://www.beat-eu.org/platform/dataformats/beat/image_rgb/), image in RGB
- `eye_centers`: [beat/eye_positions](https://www.beat-eu.org/platform/dataformats/beat/eye_positions/), eye centers
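
The `*_templates` and `*_probes` sets listed above can be read as a comparison plan: each probe names, through `template_ids`, the entries it should be scored against. The following is a minimal Python sketch of that relationship using plain dictionaries. It assumes that the values in `template_ids` refer to `template_id` values from the matching templates set; the sample records and the helper `comparison_pairs` are hypothetical and are not part of the BEAT API.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical, hand-made records that mirror the outputs listed above
# (images and eye positions are left out to keep the sketch short).
templates: List[Dict] = [
    {"file_id": 1, "client_id": 10, "template_id": 100},
    {"file_id": 2, "client_id": 11, "template_id": 101},
]
probes: List[Dict] = [
    {"file_id": 3, "client_id": 10, "probe_id": 200, "template_ids": [100, 101]},
    {"file_id": 4, "client_id": 11, "probe_id": 201, "template_ids": [101]},
]


def comparison_pairs(
    probes: List[Dict], templates: List[Dict]
) -> Iterator[Tuple[int, int, bool]]:
    """Yield (probe_id, template_id, is_genuine) for each requested comparison."""
    # Map each template to the client that owns it.
    owner = {t["template_id"]: t["client_id"] for t in templates}
    for probe in probes:
        for template_id in probe["template_ids"]:
            # A comparison is genuine when probe and template share a client.
            is_genuine = owner[template_id] == probe["client_id"]
            yield probe["probe_id"], template_id, is_genuine


pairs = list(comparison_pairs(probes, templates))
for probe_id, template_id, is_genuine in pairs:
    print(probe_id, template_id, "genuine" if is_genuine else "impostor")
```

Keeping the comparison list on the probe side means a protocol can restrict each probe to a subset of templates without changing the templates set itself.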