Remix datamodules should keep test sets from different databases separated
Keeping the test sets separate would make it possible to assess performance on each sub-part of the test set individually. The partial results can still be combined into one single number if need be.
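To illustrate the idea, here is a minimal sketch in plain Python. It is not the project's datamodule API: the database names (`db_a`, `db_b`), the `(prediction, label)` layout, and both helper functions are hypothetical, and accuracy is used only as an example metric. It shows per-database results being reported separately and then pooled into one number, weighted by the number of samples in each test set.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one entry per source database, each holding
# (prediction, label) pairs for that database's test set.
Results = Dict[str, List[Tuple[int, int]]]


def per_database_accuracy(results: Results) -> Dict[str, float]:
    """Accuracy computed independently for each database's test set."""
    return {
        name: sum(pred == label for pred, label in pairs) / len(pairs)
        for name, pairs in results.items()
    }


def combined_accuracy(results: Results) -> float:
    """Single accuracy over all test samples pooled together.

    Pooling is equivalent to a sample-weighted average of the
    per-database accuracies.
    """
    total = sum(len(pairs) for pairs in results.values())
    correct = sum(
        pred == label for pairs in results.values() for pred, label in pairs
    )
    return correct / total


# Made-up example data, for illustration only.
results: Results = {
    "db_a": [(1, 1), (0, 1), (1, 1), (0, 0)],
    "db_b": [(1, 0), (1, 1)],
}

per_db = per_database_accuracy(results)
overall = combined_accuracy(results)
print(per_db)
print(overall)
```

Note that pooling needs either the per-sample results or the sample counts of each test set. A plain mean of the per-database scores would weight small and large databases equally, which may or may not be what is wanted.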