Allowing eval DIR curve to use the dev thresholds
-
mentioned in commit 70e39528
-
Nice discussion :-)
Oh, you are right. Attached is a script that plots exactly this behaviour (different score distributions for the dev and eval sets) and the respective DIR curves (one using the thresholds of the dev set and one using the thresholds of the eval set): toy.py, my_nice_report.pdf
But this issue is tricky. Let's say that you want to choose your threshold operating point and, as a good researcher, you use the dev set for that. You also want to do an analysis under different FARs. In order to evaluate whether a certain threshold is good for the eval set, you would need to analyse two metrics: the DIR(\tau) and the False Alarm Rate(\tau). Observe that we can't make this decision using evaluate.py.
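To make the two metrics concrete, here is a minimal pure-Python sketch with made-up scores. It is not what evaluate.py or toy.py does; the function names (threshold_at_far, far, dir_rank1), the data layout, and the convention that a probe is accepted only when its score is strictly above \tau are all assumptions for illustration.

```python
import math


def threshold_at_far(unknown_probes, target_far):
    """A threshold whose FAR on `unknown_probes` does not exceed `target_far`."""
    # Best (highest) gallery score of each unknown probe, highest first.
    best = sorted((max(scores) for scores in unknown_probes), reverse=True)
    allowed = math.floor(target_far * len(best))  # false alarms we tolerate
    if allowed >= len(best):
        return float("-inf")  # every probe may be accepted
    # Only scores strictly above tau are accepted, so at most `allowed` pass.
    return best[allowed]


def far(unknown_probes, tau):
    """Fraction of unknown probes whose best gallery score exceeds tau."""
    alarms = sum(1 for scores in unknown_probes if max(scores) > tau)
    return alarms / len(unknown_probes)


def dir_rank1(known_probes, tau):
    """Fraction of known probes whose correct score exceeds tau and ranks first."""
    hits = sum(1 for pos, negs in known_probes if pos > tau and pos > max(negs))
    return hits / len(known_probes)


# Made-up toy scores (assumption): each unknown probe is a list of gallery
# scores, each known probe is (score of correct identity, other scores).
dev_unknown = [[0.10, 0.05], [0.20, 0.10], [0.30, 0.15], [0.40, 0.20]]
eval_unknown = [[0.20, 0.10], [0.35, 0.30], [0.50, 0.20], [0.60, 0.40]]
eval_known = [
    (0.90, [0.10, 0.05]),
    (0.50, [0.60, 0.20]),
    (0.45, [0.10, 0.30]),
    (0.70, [0.20, 0.10]),
]

target = 0.25
tau_dev = threshold_at_far(dev_unknown, target)    # chosen on the dev set
tau_eval = threshold_at_far(eval_unknown, target)  # chosen on the eval set

print("dev threshold :", tau_dev, far(eval_unknown, tau_dev), dir_rank1(eval_known, tau_dev))
print("eval threshold:", tau_eval, far(eval_unknown, tau_eval), dir_rank1(eval_known, tau_eval))
```

With shifted score distributions, the dev threshold can give a FAR on the eval set that is far from the target, so the DIR value read off at that threshold is only meaningful together with the FAR actually obtained on the eval set.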
Do you think it is a good idea to have some sort of customized DIR plot with two x-axes (https://stackoverflow.com/questions/28112217/matplotlib-second-x-axis-with-transformed-values)? The one at the bottom would show the FAR of the dev set (the one used to estimate the thresholds) and the one at the top would show the corresponding FAR of the eval set (computed with the dev thresholds).
Thanks
-
Thanks for showing empirically that my thoughts were correct :-D
I can see that you want to use the Lausanne protocol (with dev and eval sets) in the right way. I am not sure how to do that. For regular ROC, DET and CMC plots, there is no such thing as a second x-axis either. That's why we didn't plot the ROC curve on the eval set in our ICML paper (see http://publications.idiap.ch/index.php/publications/show/3666), and I don't think plotting ROC/DET/CMC/DIR curves on the eval set makes a whole lot of sense. Anyways, these plots are created by default, i.e., in case someone is of a different opinion than me :-)
The only plot that makes use of the Lausanne protocol in the right way is the EPC curve, which plots the HTER of the eval set for several thresholds computed on the dev set. If you want, you can invent such a plot for open-set experiments, write a paper about it, publish it, and incorporate the plot into evaluate.py later. For now, I think there is no need/reason/incentive to have such a plot inside this PR.
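For readers who have not seen an EPC, the idea can be sketched in a few lines of pure Python. This is only an illustration of the usual definition (for each weight alpha, pick the threshold that minimizes alpha * FAR + (1 - alpha) * FRR on the dev set, then report the HTER on the eval set at that threshold). It covers the verification case, not the open-set one; the function names and the toy scores are assumptions and are not taken from evaluate.py.

```python
def far_frr(negatives, positives, tau):
    """FAR and FRR when scores >= tau are accepted."""
    far = sum(s >= tau for s in negatives) / len(negatives)
    frr = sum(s < tau for s in positives) / len(positives)
    return far, frr


def min_weighted_error_threshold(negatives, positives, alpha):
    """Threshold minimizing alpha * FAR + (1 - alpha) * FRR on the given scores."""
    # Every observed score is a candidate, plus one value that rejects everything.
    candidates = sorted(set(negatives) | set(positives))
    candidates.append(candidates[-1] + 1.0)

    def cost(tau):
        far, frr = far_frr(negatives, positives, tau)
        return alpha * far + (1 - alpha) * frr

    return min(candidates, key=cost)  # on ties, the lowest threshold wins


def epc(dev_neg, dev_pos, eval_neg, eval_pos, alphas):
    """List of (alpha, HTER on eval) with thresholds chosen on dev."""
    points = []
    for alpha in alphas:
        tau = min_weighted_error_threshold(dev_neg, dev_pos, alpha)
        far, frr = far_frr(eval_neg, eval_pos, tau)
        points.append((alpha, (far + frr) / 2))
    return points


# Made-up toy scores (assumption), only to exercise the functions.
dev_neg, dev_pos = [0.1, 0.2, 0.3, 0.6], [0.4, 0.7, 0.8, 0.9]
eval_neg, eval_pos = [0.2, 0.3, 0.5, 0.7], [0.3, 0.6, 0.8, 0.9]

points = epc(dev_neg, dev_pos, eval_neg, eval_pos, alphas=[0.0, 0.5, 1.0])
for alpha, hter in points:
    print(alpha, hter)
```

An open-set analogue would have to decide what replaces FRR and HTER (for example, some combination of DIR and FAR), which is exactly the open question raised above.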