Inefficiencies in sample-based pipelines
I am raising a generic issue that I believe we will face moving forward.
The biggest issue that I have found with our sample-based approach arises when samples have to be concatenated into one big array for processing steps such as .fit methods.
The reason is that we handle samples individually, even though they might have come from one bigger array in the first place.
Let me demonstrate this with an example:
sample_stacking_issue.html
or sample_stacking_issue.ipynb
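For readers who cannot open the notebook, here is a minimal sketch of the kind of problem described above. It is not taken from the notebook: the `Sample` class, the array shape, and the use of `np.stack` are assumptions made purely for illustration, and the real sample class and stacking code may differ.

```python
import numpy as np


class Sample:
    """Hypothetical stand-in for a pipeline sample: one data item plus a key."""

    def __init__(self, data, key):
        self.data = data
        self.key = key


# One large array, e.g. features that were loaded together (shape is arbitrary).
big_array = np.random.default_rng(0).normal(size=(1000, 64))

# The sample-based approach wraps each row individually.
# Each row here is still a view into big_array, so no copy has happened yet.
samples = [Sample(row, key=i) for i, row in enumerate(big_array)]

# A .fit step needs one 2D array, so the rows are stacked back together.
# np.stack allocates a new array and copies every row, even though the
# same data already existed contiguously in big_array.
stacked = np.stack([s.data for s in samples])

same_values = np.array_equal(stacked, big_array)
shares_memory = np.shares_memory(stacked, big_array)
print(same_values, shares_memory)
```

In this sketch the stacked array holds the same values as the original but does not share its memory, so the data is duplicated and the per-sample loop adds overhead that grows with the number of samples.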
Edited by Amir MOHAMMADI