New place to put extractors
Created a place to hold the code for extractors based on architectures defined in `bob.learn.pytorch.architectures`.
@ageorge @onikisins @ssarfjoo: If you implement a new extractor, it should be placed here (don't worry about what has already been done in `bob.ip.pytorch_extractor`; I'll take care of that when I have some more time).
Thanks
Hey @heusch,
I would suggest making the package structure more generic. As one can conclude from the names, it is currently limited to extractors applied to images. But:
- Input can be anything, not just images. So the `image` folder is unnecessary from my point of view, unless you really want to separate extractors for different types of samples.
- We have three general blocks in Bob pipelines: `preprocessor`, `extractor` and `algorithm`. Any of these can be PyTorch based. It would be good to have folders for those as well.
Thanks!
In my view, having separate folders for image and audio is a suitable architecture. Handling variable-length input is important for audio, but is not critical for images. In addition, these extractors (image, audio) can barely be used interchangeably. Currently, I have implemented the audio embedding extractor as a `preprocessor` block (as in this block we have access to the raw data). Should we call them extractors at a higher level, or have separate folders for `preprocessor` and `extractor`?
@ssarfjoo, OK to have `image`, `audio`, etc. folders if you believe it will simplify understanding.
For `preprocessor`, `extractor`, `algorithm` I would suggest we stick to Bob terminology and have different directories, because those have different parent classes. For example, I already have an MLP `algorithm`.
Edited by Olegs NIKISINS
That's okay for me. @heusch what do you think about it?
Hi guys,
To address the two points mentioned by @onikisins :
- I would also prefer to have separate folders for image / speech / {insert_whatever_here}. As Saeed said, I think it would help categorize stuff
- Agreed on Bob terminology, but this is relevant only if you use your extractor and/or algorithm within the bob.bio or bob.pad framework (i.e. it does not apply when, for instance, pretraining a model to be used later). Also, I think that preprocessors should not be placed here, but that's my opinion... It's hard to say whether preprocessors are "data" specific (i.e. for preprocessing a face) or "architecture" specific (i.e. a particular architecture expects a specific format as input)...
As far as I have investigated, for audio datasets the `preprocessor` is currently "data" specific: its input is the sampling rate and the raw speech data, which are loaded from the Database interface. If we remove the `preprocessor` from `bob.learn.pytorch`, what would the alternative architecture be? E.g., is it possible to have access to the raw data at the `extractor` level without making redundant temporary data? In the current `preprocessor` architecture in Bob, the raw speech is copied to the output of the `preprocessor`, which is not an optimized implementation.
Edited by Saeed SARFJOO
> Is it possible to have access to raw data at the extractor level without making redundant temp data?
Yes it is: just provide the preprocessed directory and set the skip_preprocessing flag to `True` in either the command line or the config file when running `verify.py` or `spoof.py`. Does that answer your question?
I think that your preprocessor should be located either in `bob.learn.pytorch`, `bob.bio.spear` or `bob.pad.voice`, or even in your project directory... This is a tricky question, since I actually don't know what is considered generic enough to be embedded in `bob.bio.spear` or `bob.pad.voice`...
As this
`preprocessor` is relevant to the `extractor`, it is better for it to be in `bob.learn.pytorch`. So we must have a preprocessor folder here too. Is it true to say that, in this case, the `read_data` function of the `preprocessor` must read the raw data from the Database instead of reading a `bob.io.base.HDF5File` from the preprocessed directory? And usually the `write_data` function shouldn't do anything.
Guys, if you want my opinion, do not use
`bob.bio.base` or `bob.pad.base` classes to implement your extractors/algorithms etc. Write them in their own classes in a way that you can easily use them as a preprocessor or an extractor. For an example, see: https://gitlab.idiap.ch/bob/bob.ip.tensorflow_extractor/blob/d491c9833eff2e368aba03ffacb81aee089f8658/bob/ip/tensorflow_extractor/FaceNet.py#L47
To use this class as a bob.bio.base extractor::

    from bob.bio.base.extractor import Extractor
    from bob.ip.tensorflow_extractor import FaceNet

    # Combine the framework-independent FaceNet class with the Bob Extractor base class
    class FaceNetExtractor(FaceNet, Extractor):
        pass

    extractor = FaceNetExtractor()

This way, you can use them as a preprocessor, an extractor, or maybe an algorithm, depending on your preference. The FaceNet above can also be a preprocessor using our CallablePreprocessor
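The same idea can be sketched without Bob installed. In the sketch below, everything is an assumption made for illustration: `Extractor` is a minimal stand-in for `bob.bio.base.extractor.Extractor`, `CallableWrapper` is a stand-in for a wrapper in the spirit of CallablePreprocessor, and `MeanEmbedding` is a hypothetical model class that knows nothing about the framework. It only shows how one plain callable class can be reused in both roles.

```python
class Extractor:
    """Stand-in for bob.bio.base.extractor.Extractor (assumption: simplified)."""

    def __init__(self, requires_training=False):
        self.requires_training = requires_training


class MeanEmbedding:
    """Hypothetical framework-independent model: a plain callable class."""

    def __init__(self, n_features=2):
        self.n_features = n_features

    def __call__(self, data):
        values = [float(v) for v in data]
        if len(values) < self.n_features:
            raise ValueError("input is shorter than the number of features")
        size = len(values) // self.n_features
        # Average n_features consecutive, equally sized chunks;
        # trailing leftover values are ignored.
        return [
            sum(values[i * size:(i + 1) * size]) / size
            for i in range(self.n_features)
        ]


class MeanEmbeddingExtractor(MeanEmbedding, Extractor):
    """Behaviour comes from MeanEmbedding, the framework type from Extractor."""

    def __init__(self, n_features=2):
        MeanEmbedding.__init__(self, n_features=n_features)
        Extractor.__init__(self, requires_training=False)


class CallableWrapper:
    """Stand-in for a callable-based preprocessor wrapper (assumption)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, data, annotations=None):
        # Annotations are accepted for interface compatibility but unused here.
        return self.func(data)


model = MeanEmbedding(n_features=2)
extractor = MeanEmbeddingExtractor(n_features=2)
preprocessor = CallableWrapper(model)

sample = range(8)
features = extractor(sample)
```

Because the model class has no framework dependency, choosing whether it runs as a preprocessor or as an extractor becomes a one-line decision at configuration time rather than a property of the implementation.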
added 1 commit
- c5ebec5d - added bob.bio.base in both requirements and conda recipe (needed for extractors)
@ssarfjoo I'm going to merge this, so feel free to check out master once it is merged and start your branch from there (or check out whatever you may use/need).
Thanks
mentioned in commit 9c130f65