k-fold metrics are currently untested

We should add tests for the k-fold metrics and check that their output is still compatible with the "macro" averages computed by scikit-learn (if that comparison makes sense!).
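
A rough sketch of what such a test could look like, assuming the k-fold metric boils down to a macro-averaged F1 computed per fold and then averaged over folds. The names `macro_f1`, `kfold_macro_f1` and `check_against_sklearn` are hypothetical stand-ins, not this project's API. The only scikit-learn call used is `sklearn.metrics.f1_score(..., average="macro")`.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (hypothetical stand-in for our metric)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def kfold_macro_f1(folds):
    """Return (per-fold scores, mean over folds) for (y_true, y_pred) pairs."""
    per_fold = [macro_f1(y_true, y_pred) for y_true, y_pred in folds]
    return per_fold, sum(per_fold) / len(per_fold)


def check_against_sklearn(folds, tol=1e-9):
    """Compare each fold with scikit-learn; return None if it is not installed."""
    try:
        from sklearn.metrics import f1_score
    except ImportError:
        return None
    per_fold, _ = kfold_macro_f1(folds)
    return all(
        abs(ours - f1_score(y_true, y_pred, average="macro")) < tol
        for ours, (y_true, y_pred) in zip(per_fold, folds)
    )


# Toy data: two folds, three classes (made up for illustration only).
FOLDS = [
    ([0, 1, 1, 2], [0, 1, 2, 2]),
    ([0, 0, 1, 2], [0, 1, 1, 2]),
]
```

One thing the test would need to settle: the mean of per-fold macro scores is generally not equal to the macro score computed on the pooled predictions of all folds, so the comparison with scikit-learn probably only makes sense fold by fold, as above.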
