bob.bio.base, merge request !200

Database interface

Merged. Tiago de Freitas Pereira requested to merge database-interface into dask-pipelines on Oct 06, 2020.

Implemented a simple file-list database interface for VanillaBiometrics, based on CSV files.

A dataset for CSVDatasetDevEval needs to have the following directory layout:

       my_dataset/
       my_dataset/my_protocol/
       my_dataset/my_protocol/train.csv
       my_dataset/my_protocol/dev_enroll.csv
       my_dataset/my_protocol/dev_probe.csv
       my_dataset/my_protocol/eval_enroll.csv
       my_dataset/my_protocol/eval_probe.csv
       ...

where each CSV file needs to have the following format:

       PATH,SUBJECT
       path_1,subject_1
       path_2,subject_2
       path_i,subject_j

This format allows the use of metadata by following the pattern below:

       PATH,SUBJECT,METADATA_1,METADATA_2,METADATA_k
       path_1,subject_1,A,B,C
       path_2,subject_2,A,B,1
       path_i,subject_j,2,3,4
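To make the format concrete, here is a minimal sketch of how such a file could be parsed. The `Sample` class and `read_samples` function are hypothetical names used only for illustration; they are not the classes added in this merge request. The sketch assumes the first two columns are always `PATH` and `SUBJECT` and treats every other column as free-form metadata.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List, TextIO


@dataclass
class Sample:
    """Hypothetical container for one CSV row (not the bob.bio.base class)."""

    path: str
    subject: str
    metadata: Dict[str, str] = field(default_factory=dict)


def read_samples(stream: TextIO) -> List[Sample]:
    """Parse a PATH,SUBJECT[,METADATA_...] CSV stream into Sample objects."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    # Assumption: the two mandatory columns come first, in this order.
    if header[:2] != ["PATH", "SUBJECT"]:
        raise ValueError("The first two columns must be PATH and SUBJECT")

    samples = []
    for row in reader:
        # Everything that is not PATH or SUBJECT is kept as metadata.
        metadata = {
            key: value
            for key, value in row.items()
            if key not in ("PATH", "SUBJECT")
        }
        samples.append(Sample(row["PATH"], row["SUBJECT"], metadata))
    return samples


example = io.StringIO(
    "PATH,SUBJECT,METADATA_1,METADATA_2\n"
    "path_1,subject_1,A,B\n"
    "path_2,subject_2,A,1\n"
)
samples = read_samples(example)
```

Metadata values stay as strings here; any type conversion would be up to the pipeline that consumes them.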

We can imagine other implementations of this. For instance, a CSVDatasetCrossValidation that, given a single CSV file, splits the data "on the fly" into sets for enrolling, probing and training. Or a CSVDatasetWithEyesAnnotation that handles annotations for face recognition pipelines.
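As a rough illustration of the cross-validation idea, the sketch below splits the rows of one CSV file by subject. It is not part of this merge request: `split_by_subject`, its parameters, and the rule "first sample of a subject enrolls, the rest probe" are all assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (path, subject)


def split_by_subject(
    rows: List[Row], train_fraction: float = 0.5, seed: int = 0
) -> Dict[str, List[Row]]:
    """Split rows into train/enroll/probe sets with disjoint train subjects."""
    subjects = sorted({subject for _, subject in rows})
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(subjects)

    n_train = int(len(subjects) * train_fraction)
    train_subjects = set(subjects[:n_train])

    splits: Dict[str, List[Row]] = {"train": [], "enroll": [], "probe": []}
    enrolled = set()
    for path, subject in rows:
        if subject in train_subjects:
            splits["train"].append((path, subject))
        elif subject not in enrolled:
            # Assumption: the first sample of a subject is used for enrollment.
            enrolled.add(subject)
            splits["enroll"].append((path, subject))
        else:
            splits["probe"].append((path, subject))
    return splits


rows = [(f"path_{s}_{i}", f"subject_{s}") for s in range(4) for i in range(3)]
splits = split_by_subject(rows, train_fraction=0.5, seed=0)
```

Splitting by subject rather than by sample keeps training identities out of the enrollment and probe sets, which is usually what a biometric evaluation wants.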

I still need to implement a mechanism that lets CSVDatasetDevEval take zip files as input. That way we can ship databases as simple zip files.
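One possible shape for that mechanism, sketched with the standard library only: read a CSV member straight out of the archive without unpacking it. `read_csv_from_zip` is a hypothetical helper, not something this merge request provides, and the member path in the demo is just the `train.csv` entry from the layout above.

```python
import csv
import io
import zipfile
from typing import Dict, List


def read_csv_from_zip(zip_source, member: str) -> List[Dict[str, str]]:
    """Read one CSV file stored inside a zip archive.

    `zip_source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # archive.open yields bytes; wrap it so csv can read text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny in-memory archive to demonstrate the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "my_dataset/my_protocol/train.csv",
        "PATH,SUBJECT\npath_1,subject_1\n",
    )

rows = read_csv_from_zip(buffer, "my_dataset/my_protocol/train.csv")
```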

ping @ydayer @amohammadi

I'll merge this tomorrow. I need this to support the efforts on bob.bio.vein.

Edited Oct 07, 2020 by Tiago de Freitas Pereira
Source branch: database-interface