Closed SampleBatch design issues

    Hi,

    Although SampleBatch brings convenience and efficiency, it forces us to develop transformers that are compatible with it.

    Imagine the simple transformer below:

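    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin

    class FakeTransformer(TransformerMixin, BaseEstimator):
        def fit(self, X, y=None):
            # Nothing to learn: the transformer is stateless.
            return self

        def transform(self, X):
            # Relies on X supporting element-wise arithmetic.
            return X + 1

        def _more_tags(self):
            return {"stateless": True, "requires_fit": False}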

    I can easily use it with NumPy arrays as input:

        transformer = FakeTransformer()
        X = np.zeros(shape=(3, 160, 160))
        transformed_X = transformer.transform(X)

    However, I run into problems once I wrap it to operate on samples:

        sample = Sample(X)
        transformer_sample = wrap(["sample"], transformer)
        my_beautiful_sample = [s.data for s in transformer_sample.transform([sample])]
        # THIS DOESN'T WORK

    With this wrap, the input X of FakeTransformer.transform is a SampleBatch rather than a NumPy array, so X + 1 fails.
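The failure can be reproduced without the library at all. The sketch below uses a hypothetical stand-in, `FakeSampleBatch`, which is not the real SampleBatch; it only assumes that the batch is a sequence-like view over the data of each sample and that it defines no arithmetic operators.

```python
import numpy as np


class FakeSampleBatch:
    """Hypothetical stand-in for SampleBatch (not the real class).

    Assumption: the batch exposes the data of each sample through
    __len__ and __getitem__ and defines no arithmetic operators.
    """

    def __init__(self, arrays):
        self._arrays = list(arrays)

    def __len__(self):
        return len(self._arrays)

    def __getitem__(self, index):
        return self._arrays[index]


batch = FakeSampleBatch([np.zeros((160, 160)) for _ in range(3)])

# This is what FakeTransformer.transform effectively attempts.
try:
    batch + 1
    add_failed = False
except TypeError:
    add_failed = True

# An explicit conversion recovers an ordinary array.
as_array = np.asarray(list(batch))
```

Under that assumption the addition raises `TypeError`, while the explicit conversion yields a regular array that supports it.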

    I can work around this inside my own transformer by converting the input first:

        def transform(self, X):
            X = np.asarray(X)
            return X + 1

    However, this is a blocker if we want to use estimators developed by people outside our circle, since we cannot add that conversion to their code.
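For estimators that cannot be edited, one possible stopgap is to do the conversion in a thin adapter around them. This is only a sketch: `AsArrayAdapter` and `ThirdPartyTransformer` are hypothetical names introduced here, and a plain nested list stands in for any input that converts with `np.asarray` but does not support `X + 1` itself. Whether such an adapter composes with `wrap` has not been checked.

```python
import numpy as np


class AsArrayAdapter:
    """Hypothetical adapter: converts X to an ndarray before delegating."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        self.estimator.fit(np.asarray(X), y)
        return self

    def transform(self, X):
        return self.estimator.transform(np.asarray(X))


class ThirdPartyTransformer:
    """Stand-in for an estimator whose source we cannot modify."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Assumes X already supports element-wise arithmetic.
        return X + 1


# A nested list cannot be used with "+ 1" directly, but it can be
# converted to an array, much like the np.asarray(X) workaround above.
data = [[0.0, 0.0], [1.0, 1.0]]

adapted = AsArrayAdapter(ThirdPartyTransformer())
result = adapted.fit(data).transform(data)
```

The drawback is the one raised in this issue: every external estimator needs this extra layer, which is exactly the coupling that the wrapper was supposed to hide.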

    Do you think it is sensible to have X wrapped as a SampleBatch once SampleTransform is used? It breaks encapsulation.

    Thanks
