Cannot specify original directory and extension for most of the databases anymore
In previous database implementations, it was relatively straightforward to use the database interface to load pre-extracted features. In the new database interfaces this is no longer possible, for two reasons:
1. There is no simple way of providing the database interface with the directory to read the data from. For example, in an old interface (LFW), we can still set it: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/c7ee7213f83f62b1e36685290e1defd10fea2c20/bob/bio/face/database/lfw.py#L90, while this option no longer exists in newer interfaces: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/c7ee7213f83f62b1e36685290e1defd10fea2c20/bob/bio/face/database/scface.py#L30. Re-exposing this option to the user should be relatively straightforward, since the default value is used just a few lines below: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/c7ee7213f83f62b1e36685290e1defd10fea2c20/bob/bio/face/database/scface.py#L47.
2. The filename extension is empty by default: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/c7ee7213f83f62b1e36685290e1defd10fea2c20/bob/bio/face/database/scface.py#L50, and there is no way to change that in the constructor. But even if we exposed the extension similarly to the directory name, we would still be in trouble, since the default loader just appends the extension rather than **replacing** it: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/406f2da1faadacd4d4fe4e36e5a0010d78557513/bob/bio/base/database/csv_dataset.py#L145
The main issue with 2. is that someone has decided to ignore our old behavior of storing keys without the filename extension, and just added the original extension to the key. For example, running:

    import bob.bio.face
    db = bob.bio.face.database.SCFaceDatabase(protocol="far")
    samples = db.all_samples()
    print(samples[0].key)

will print filename.JPG instead of filename (without extension).
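To make the consequence concrete, here is a minimal sketch of what happens when a feature extension is appended to such a key. The `.hdf5` extension and the `append_extension` helper are made up for illustration; the helper only imitates the append behavior described above and is not copied from csv_dataset.py.

```python
def append_extension(key, extension):
    """Hypothetical stand-in for a loader that simply appends the extension."""
    return key + extension


# Key as it is currently stored, including the original image extension.
key_with_extension = "filename.JPG"
# Key as it used to be stored in the old interfaces.
key_without_extension = "filename"

# Assumed extension of the pre-extracted feature files.
feature_extension = ".hdf5"

# Appending only yields a usable file name if the key carries no extension.
print(append_extension(key_without_extension, feature_extension))
print(append_extension(key_with_extension, feature_extension))
```

With the old keys, the first call yields filename.hdf5; with the current keys, the second call yields filename.JPG.hdf5, which is not how pre-extracted features would typically be named.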
So, what I would propose is the following (the smallest set of changes; anything else would require re-creating all the CSV-based databases):
1. Expose the original_directory and original_extension parameters for all databases to the user, keeping their default values as they currently are.
2. Change the line in https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/406f2da1faadacd4d4fe4e36e5a0010d78557513/bob/bio/base/database/csv_dataset.py#L145 to replace the extension with the given one (if one is given) rather than appending it to the filename.
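As a rough sketch of the second point, the path construction could look something like the following. The function name `resolve_path` and its signature are hypothetical and not part of bob.bio.base; the only assumption taken from the above is that an empty extension (the current default) should leave the key untouched.

```python
import os


def resolve_path(original_directory, key, original_extension=""):
    """Hypothetical helper: build the file path for a sample key.

    If an extension is given, it replaces whatever extension the key
    already carries; if it is empty, the key is used unchanged.
    """
    if original_extension:
        # splitext only looks at the last path component, so dots in
        # sub-directory names are left alone.
        stem, _ = os.path.splitext(key)
        key = stem + original_extension
    return os.path.join(original_directory, key)


# Default behavior: the key (with its original extension) is kept as it is.
print(resolve_path("/path/to/images", "subdir/filename.JPG"))
# With an extension given: .JPG is replaced instead of being appended to.
print(resolve_path("/path/to/features", "subdir/filename.JPG", ".hdf5"))
```

Treating the empty default as "nothing given" would keep the current behavior for everyone who does not set the parameter, so existing CSV files would not need to change.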
Any objection?