Decoupling data and annotation loading of BioFile objects from database interfaces
As originally proposed in bob.db.base#22 (closed)
I think it would make sense to separate data loading and annotation loading from the
database object and put the functionality in
This effort would also be aligned with our idea of samples in bob.pipelines. Currently I have this:
    class BioFile(bob.db.base.File, _ReprMixin):
        """
        A simple base class that defines basic properties of File object for the
        use in verification experiments

        Attributes
        ----------
        client_id : str or int
            The id of the client this file belongs to.
            Its type depends on your implementation.
            If you use an SQL database, this should be an SQL type like Integer or
            String.
        path : object
            see :py:class:`bob.db.base.File` constructor
        file_id : object
            see :py:class:`bob.db.base.File` constructor
        original_directory : str or None
            The path to the original directory of the file
        original_extension : str or None
            The extension of the original files. This attribute is deprecated.
            Please try to include the extension in the ``path`` attribute
        annotation_directory : str or None
            The path to the directory of the annotations
        annotation_extension : str or None
            The extension of annotation files. Default is ``.json``
        annotation_type : str or None
            The type of the annotation file,
            see :any:`bob.db.base.annotations.read_annotation_file`.
            Default is ``json``.
        """

        def __init__(
            self,
            client_id,
            path,
            file_id=None,
            original_directory=None,
            original_extension=None,
            annotation_directory=None,
            annotation_extension=None,
            annotation_type=None,
            **kwargs,
        ):
            super(BioFile, self).__init__(path, file_id, **kwargs)

            # just copy the information
            self.client_id = client_id
            """The id of the client, to which this file belongs to."""
            self.original_directory = original_directory
            self.original_extension = original_extension
            self.annotation_directory = annotation_directory
            self.annotation_extension = annotation_extension or ".json"
            self.annotation_type = annotation_type or "json"

        def load(self):
            """Loads the data at the specified location and using the given extension.
            Override it if you need to load differently.

            Returns
            -------
            object
                The loaded data (normally :py:class:`numpy.ndarray`).
            """
            # get the path
            path = self.make_path(
                self.original_directory or "", self.original_extension or ""
            )
            return bob.io.base.load(path)

        @property
        def annotations(self):
            path = self.make_path(
                self.annotation_directory or "", self.annotation_extension or ""
            )
            return read_annotation_file(path, annotation_type=self.annotation_type)
This would require refactoring our high-level db interfaces (we will not touch the low-level db interfaces).
What do you think?
Of course, the load and annotations methods can be overridden per db.
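To make the per-db override concrete, here is a rough sketch of what I have in mind. It does not use bob at all: `SketchFile`, `TextDbFile` and `SketchDatabase` are made-up stand-ins (not existing classes), raw bytes stand in for `bob.io.base.load`, and plain `json.load` stands in for `read_annotation_file`. The point is only that the database hands out configured file objects and each file loads its own data and annotations.

```python
import json
import os
import tempfile


class SketchFile:
    """Hypothetical stand-in for BioFile: the file object loads itself."""

    def __init__(self, client_id, path, original_directory=None,
                 original_extension=None, annotation_directory=None,
                 annotation_extension=None):
        self.client_id = client_id
        self.path = path
        self.original_directory = original_directory
        self.original_extension = original_extension
        self.annotation_directory = annotation_directory
        self.annotation_extension = annotation_extension or ".json"

    def make_path(self, directory, extension):
        # Assumed behaviour for this sketch: directory + path + extension.
        return os.path.join(directory or "", self.path + (extension or ""))

    def load(self):
        # Stand-in for bob.io.base.load: return the raw bytes of the file.
        path = self.make_path(self.original_directory, self.original_extension)
        with open(path, "rb") as f:
            return f.read()

    @property
    def annotations(self):
        # Stand-in for read_annotation_file: parse a JSON annotation file.
        path = self.make_path(self.annotation_directory, self.annotation_extension)
        with open(path) as f:
            return json.load(f)


class TextDbFile(SketchFile):
    """Hypothetical per-database subclass overriding both hooks."""

    def load(self):
        # This made-up database stores text, so decode the bytes.
        return super().load().decode("utf-8")

    @property
    def annotations(self):
        # This made-up database wants (x, y) tuples instead of lists.
        return {key: tuple(value) for key, value in super().annotations.items()}


class SketchDatabase:
    """Hypothetical high-level interface: it only configures file objects."""

    def __init__(self, original_directory, annotation_directory):
        self.original_directory = original_directory
        self.annotation_directory = annotation_directory

    def objects(self):
        return [
            TextDbFile(
                client_id=1,
                path=os.path.join("s1", "sample"),
                original_directory=self.original_directory,
                original_extension=".txt",
                annotation_directory=self.annotation_directory,
            )
        ]


def demo():
    """Write one sample and its annotations to a temp dir, then load both."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        annot_dir = os.path.join(root, "annotations")
        os.makedirs(os.path.join(data_dir, "s1"))
        os.makedirs(os.path.join(annot_dir, "s1"))
        with open(os.path.join(data_dir, "s1", "sample.txt"), "w") as f:
            f.write("hello")
        with open(os.path.join(annot_dir, "s1", "sample.json"), "w") as f:
            json.dump({"reye": [10, 20], "leye": [30, 20]}, f)

        database = SketchDatabase(data_dir, annot_dir)
        sample = database.objects()[0]
        # No database object is needed from here on.
        return sample.load(), sample.annotations
```

Once the file object has been created, nothing here calls back into the database, which is what would let these objects be used as samples later on.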