Skip to content
Snippets Groups Projects

Add sequential and parallel processors, pre-processors, and extractors

Merged Amir MOHAMMADI requested to merge processors into master

Fixes #95 (closed)

Merge request reports

Checking pipeline status.

Approval is optional

Merged by avatar (May 27, 2025 9:52pm UTC)

Loading

Pipeline #13461 passed

Pipeline passed for 8d7a645c on master

Test coverage 88.00% (1.00%) from 6 jobs

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Manuel Günther
  • Manuel Günther
  • @amohammadi Thanks for implementing this. I have made some comments inline. I think you should check any try ... except block that you have implemented. I believe, none of them should be necessary, and some of them might even be dangerous (you might lose some parameters). If there is an error, please let the Exception propagate, as in all cases they indicate a semantic error.

    On the other hand, neither of your processors deal with the I/O of the preprocessed data or extracted features. For the parallel versions, you might want to write them in a single HDF5 file using a sub_group for each processor. For the sequential versions, you most probably want to use the last processor to handle the I/O.

    While I can see the implementation of the Parallel{Preprocessor,Extractor}, I don't see the use of it. As mentioned before, there is no Algorithm that can make use of the features extracted by a parallel extractor. I also see major issues in the processing chain of parallel processors. When you have a parallel preprocessor with three preprocessors, do you also need a parallel extractor with three extractors? In your current implementation, each extractor works on the same preprocessed data... How do you connect the output of the parallel preprocessor to an extractor? How do you connect the output of the parallel extractor to an algorithm?

    I can see that you did not implement a parallel or sequential algorithm, and I understand that such an algorithm would be a custom solution. I am fine with keeping the parallel implementations, but you should make sure (e.g., in the documentation) that these require custom implementations of the Algorithm, and that you either require the same number of parallel preprocessors and extractors, or that the extractor must also be a custom implementation. However, I would definitely prefer if you could re-think the parallel implementations and maybe delete them.

    Edited by Manuel Günther
  • Amir MOHAMMADI added 1 commit

    added 1 commit

    • 7c2bc523 - Implement hdf5 io for extractors

    Compare with previous version

  • Amir MOHAMMADI added 1 commit

    added 1 commit

    Compare with previous version

  • @mguenther I have addressed most of your comments but you misunderstood the parallel idea here. The parallel ones process data independently and then the result is merged as one larger feature. Think of it like I have an extractor that calculates the mean and another one that calculates the median. Now I want to construct larger features which contain both the mean and the median of the data. This is where the parallel paradigms come to play.

    It is not like you have to use parallel preprocessors and extractors at the same time.

    Edited by Amir MOHAMMADI
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading