Adding tags to transformers to differentiate between annotators, preprocessors, and extractors.
I think that, while it was a good idea to use scikit-learn's transformer API for our classes, there are still some differences between the transformers that we implement. I suggest adding tags to our transformers (https://scikit-learn.org/stable/developers/develop.html#estimator-tags) so that we can differentiate them programmatically. For example, we could have:
```python
from sklearn.base import BaseEstimator

class Preprocessor(BaseEstimator):
    def _more_tags(self):
        return {'bob_transformer': 'preprocessor'}
```
that would allow:
```python
preprocessor = wrap(["sample"], preprocessor)
```
to implicitly mean:
```python
transform_extra_arguments = (("annotations", "annotations"),)
preprocessor = wrap(["sample"], preprocessor, transform_extra_arguments=transform_extra_arguments)
```
Or wrapping an annotator would imply `sample.annotations = annotator(sample.data)` instead of the usual `sample.data = transformer(sample.data)`.
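To make this concrete, here is a minimal sketch of how a tag-aware `wrap` could pick its defaults. The helper `wrap_with_tag_defaults` and the `output_attribute` routing are assumptions for illustration, not the existing `bob.pipelines` API; the only scikit-learn behaviour relied on is that `_get_tags()` merges `_more_tags()` across the class hierarchy (in the scikit-learn versions that document `_more_tags`).

```python
from sklearn.base import BaseEstimator

def wrap_with_tag_defaults(transformer, transform_extra_arguments=None):
    """Sketch: derive wrapping defaults from the proposed 'bob_transformer' tag."""
    # _get_tags() collects _more_tags() from the class hierarchy in the
    # scikit-learn versions that document _more_tags (see the link above).
    tag = transformer._get_tags().get("bob_transformer")

    if tag == "preprocessor" and transform_extra_arguments is None:
        # Preprocessors implicitly receive the sample's annotations.
        transform_extra_arguments = (("annotations", "annotations"),)

    # Hypothetical routing: annotators would write their output to
    # sample.annotations instead of the usual sample.data.
    output_attribute = "annotations" if tag == "annotator" else "data"

    return transform_extra_arguments, output_attribute
```

With the `Preprocessor` defined above, `wrap_with_tag_defaults(Preprocessor())` would return `((('annotations', 'annotations'),), 'data')`, i.e. the extra arguments are filled in without the caller spelling them out.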
What do you think? Does it make sense?