
Database interface

Tiago de Freitas Pereira requested to merge database-interface into dask-pipelines

Implemented a simple file-list database interface for VanillaBiometrics, based on CSV files.

A dataset read by CSVDatasetDevEval needs to have the following directory structure:

       my_dataset/
       my_dataset/my_protocol/
       my_dataset/my_protocol/train.csv
       my_dataset/my_protocol/dev_enroll.csv
       my_dataset/my_protocol/dev_probe.csv
       my_dataset/my_protocol/eval_enroll.csv
       my_dataset/my_protocol/eval_probe.csv
       ...

where each CSV file needs to have the following format:

       PATH,SUBJECT
       path_1,subject_1
       path_2,subject_2
       path_i,subject_j

This format allows metadata to be attached to each sample as extra columns, following the pattern below:

       PATH,SUBJECT,METADATA_1,METADATA_2,METADATA_k
       path_1,subject_1,A,B,C
       path_2,subject_2,A,B,1
       path_i,subject_j,2,3,4
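
To illustrate how such a file can be consumed, here is a minimal sketch that parses the CSV format above with the standard library only. The function name `read_samples` and the lower-casing of the header names are assumptions made for this example, not the interface implemented in this merge request.

```python
import csv
import io


def read_samples(csv_file):
    """Parse a PATH,SUBJECT[,METADATA_1,...] file into a list of dicts.

    Hypothetical helper: the name and the returned structure are
    illustrative only.
    """
    reader = csv.reader(csv_file)
    # Header names are lower-cased, so "PATH" becomes the key "path"
    header = [name.strip().lower() for name in next(reader)]
    if header[:2] != ["path", "subject"]:
        raise ValueError("The first two columns must be PATH and SUBJECT")
    # Every extra column is kept as a string-valued metadata entry;
    # empty lines are skipped
    return [dict(zip(header, row)) for row in reader if row]


content = io.StringIO(
    "PATH,SUBJECT,METADATA_1,METADATA_2\n"
    "path_1,subject_1,A,B\n"
    "path_2,subject_2,A,1\n"
)
samples = read_samples(content)
```

All values come back as strings, so a metadata column holding `1` is read as `"1"`. Any type conversion would have to be done by the caller.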

Other implementations are possible. For instance, a CSVDatasetCrossValidation could take a single CSV file and split it on the fly into training, enrollment and probing sets, and a CSVDatasetWithEyesAnnotation could handle eye annotations for face recognition pipelines.
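
As a rough idea of what an on-the-fly split could look like, the sketch below partitions (path, subject) pairs by subject. Everything in it is an assumption for illustration: the function name `split_samples`, its parameters, and the policy of using the first sample of each test subject for enrollment are not part of this merge request.

```python
import random
from collections import defaultdict


def split_samples(samples, train_fraction=0.5, n_enroll=1, seed=0):
    """Split (path, subject) pairs into train, enroll and probe lists.

    Hypothetical policy: subjects are shuffled with a fixed seed, a
    fraction of them goes to training, and for the remaining subjects
    the first `n_enroll` samples are used for enrollment and the rest
    for probing.
    """
    by_subject = defaultdict(list)
    for path, subject in samples:
        by_subject[subject].append(path)

    # Sort before shuffling so the split depends only on the seed
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_train = int(len(subjects) * train_fraction)
    train_subjects = subjects[:n_train]
    test_subjects = subjects[n_train:]

    train = [(p, s) for s in train_subjects for p in by_subject[s]]
    enroll = [(p, s) for s in test_subjects for p in by_subject[s][:n_enroll]]
    probe = [(p, s) for s in test_subjects for p in by_subject[s][n_enroll:]]
    return train, enroll, probe


# Four subjects with three samples each
samples = [(f"path_{s}_{i}", f"subject_{s}") for s in range(4) for i in range(3)]
train, enroll, probe = split_samples(samples)
```

Splitting by subject rather than by sample keeps the training subjects disjoint from the ones used for enrollment and probing.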

I still need to implement a mechanism that lets CSVDatasetDevEval take zip files as input. That way we can ship databases as simple zip files.
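
One possible way to read a protocol file straight out of an archive, without unpacking it, uses the standard `zipfile` module. This is only a sketch of the idea: the helper name `read_csv_from_zip` is hypothetical, and the mechanism that ends up in CSVDatasetDevEval may look different.

```python
import csv
import io
import zipfile


def read_csv_from_zip(archive, member):
    """Return the rows of one CSV member of a zip archive as dicts.

    Hypothetical helper; `archive` is a path or a binary file-like
    object, `member` is the path of the CSV file inside the archive.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # zf.open yields bytes, so wrap it for the text-based csv module
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny archive in memory so the sketch runs without files on disk
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "my_dataset/my_protocol/train.csv",
        "PATH,SUBJECT\npath_1,subject_1\npath_2,subject_2\n",
    )

rows = read_csv_from_zip(buffer, "my_dataset/my_protocol/train.csv")
```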

ping @ydayer @amohammadi

I'll merge this tomorrow. I need this to support the efforts on bob.bio.vein.

Edited by Tiago de Freitas Pereira
