
New database interface for PAD

Tiago de Freitas Pereira requested to merge new-db into master

Hi @amohammadi, @ydayer

Here is a proposal for a new DB interface for PAD. It follows the same guidelines used in bob.bio.base. The implemented features are:

  1. Uses CSV files instead of LST files, which makes it possible to ship metadata with each sample. It keeps the same file structure as before, so porting existing databases should be straightforward.
  2. `CSVPADDataset` can transparently read our current LST files (I've created a sample loader that handles this).
  3. `CSVPADDataset` can read files either from a directory structure or from a tarball.
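
To illustrate items 1 and 3, here is a minimal standard-library sketch of how one CSV file could be read the same way from a directory or from a tarball. This is not the code in this merge request: the helpers `open_text` and `read_rows` and the column names mentioned in the comments are hypothetical, chosen only for illustration.

```python
import csv
import io
import os
import tarfile


def open_text(root, relative_path):
    """Open `relative_path` as text, whether `root` is a directory or a tarball.

    Hypothetical helper, not part of bob.pad.base.
    """
    if os.path.isdir(root):
        return open(os.path.join(root, relative_path), newline="")
    # Otherwise treat `root` as a tar archive and read the member into memory.
    with tarfile.open(root) as tar:
        data = tar.extractfile(relative_path).read()
    return io.StringIO(data.decode("utf-8"), newline="")


def read_rows(root, relative_path):
    """Return every CSV row as a dict keyed by the header line.

    Any extra column (for example a hypothetical `attack_type`) is kept,
    which is how metadata can travel with each sample.
    """
    with open_text(root, relative_path) as f:
        return list(csv.DictReader(f))
```

With this sketch, `read_rows("data/csv_dataset", "protocol1/train.csv")` and `read_rows("data/csv_dataset.tar.gz", "protocol1/train.csv")` would return the same rows, provided the archive stores the file under that member name; the file name `protocol1/train.csv` is also only an assumption.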

Here is an example of how to use it, reading from a file structure and from a tarball:

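    import os

    import bob.io.base.test_utils

    # CSVPADDataset is the dataset class introduced in this merge request

    def run(path):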

        dataset = CSVPADDataset(path, "protocol1")

        # Train
        assert len(dataset.fit_samples()) == 5
        # 2 out of 5 are bonafides
        assert sum([s.is_bonafide for s in dataset.fit_samples()]) == 2

        # DEV
        assert len(dataset.predict_samples()) == 5
        # 2 out of 5 are bonafides
        assert sum([s.is_bonafide for s in dataset.predict_samples()]) == 2

        # EVAL
        assert len(dataset.predict_samples(group="eval")) == 7
        # 3 out of 7 are bonafides
        assert sum([s.is_bonafide for s in dataset.predict_samples(group="eval")]) == 3

    csv_example_dir = os.path.realpath(
        bob.io.base.test_utils.datafile(".", __name__, "data/csv_dataset")
    )

    csv_example_tarball = os.path.realpath(
        bob.io.base.test_utils.datafile(".", __name__, "data/csv_dataset.tar.gz")
    )

    run(csv_example_dir)
    run(csv_example_tarball)
