Add sequential and parallel processors, pre-processors, and extractors

added 1 commit

b8efca23 - Add sequential and parallel processors, pre-processors, and extractors

Maybe @tiago.pereira and @onikisins are interested in this.

@mguenther could you please review this?

@amohammadi Thanks for implementing this. I have made some comments inline. I think you should check any try ... except block that you have implemented. I believe, none of them should be necessary, and some of them might even be dangerous (you might lose some parameters). If there is an error, please let the Exception propagate, as in all cases they indicate a semantic error.

On the other hand, neither of your processors deal with the I/O of the preprocessed data or extracted features. For the parallel versions, you might want to write them in a single HDF5 file using a sub_group for each processor. For the sequential versions, you most probably want to use the last processor to handle the I/O.

While I can see the implementation of the Parallel{Preprocessor,Extractor}, I don't see the use of it. As mentioned before, there is no Algorithm that can make use of the features extracted by a parallel extractor. I also see major issues in the processing chain of parallel processors. When you have a parallel preprocessor with three preprocessors, do you also need a parallel extractor with three extractors? In your current implementation, each extractor works on the same preprocessed data... How do you connect the output of the parallel preprocessor to an extractor? How do you connect the output of the parallel extractor to an algorithm?

I can see that you did not implement a parallel or sequential algorithm, and I understand that such an algorithm would be a custom solution. I am fine with keeping the parallel implementations, but you should make sure (e.g., in the documentation) that these require custom implementations of the Algorithm, and that you either require the same number of parallel preprocessors and extractors, or that the extractor must also be a custom implementation. However, I would definitely prefer if you could re-think the parallel implementations and maybe delete them.

added 1 commit

7c2bc523 - Implement hdf5 io for extractors

Compare with previous version

added 1 commit

8b031ce9 - remove extra paranthesis

Compare with previous version

@mguenther I have addressed most of your comments but you misunderstood the parallel idea here. The parallel ones process data independently and then the result is merged as one larger feature. Think of it like I have an extractor that calculates the mean and another one that calculates the median. Now I want to construct larger features which contain both the mean and the median of the data. This is where the parallel paradigms come to play.

It is not like you have to use parallel preprocessors and extractors at the same time.

Add sequential and parallel processors, pre-processors, and extractors

Merged by (May 27, 2025 9:52pm UTC)

Activity

Add sequential and parallel processors, pre-processors, and extractors

Merge request reports

Merged by (May 27, 2025 9:52pm UTC)

Activity