eer_threshold doesn't behave as expected

Created by: siebenkopf

Hi there,

I found that the eer_threshold function (and likewise the min_hter_threshold function) does not behave as expected when the scores are highly unbalanced. In particular, when I use a special low-valued error indicator such as -10000, the returned threshold is useless. Consider the following example:

import bob.measure

positives = [1.]*100 + [-10000.]
negatives = [-1.]*100 + [-10000.]

# threshold chosen by the library
threshold = bob.measure.eer_threshold(negatives, positives)
print(threshold, bob.measure.farfrr(negatives, positives, threshold))
# threshold chosen by hand
print(0, bob.measure.farfrr(negatives, positives, 0))

The outputs of the two print commands are:

-4999.5 (0.9900990099009901, 0.009900990099009901)
0 (0.0, 0.009900990099009901)

So the second threshold, 0, which I picked by hand, is far more suitable than the one returned by eer_threshold: both give the same FRR of about 1%, but the FAR drops from about 99% to 0% (min_hter_threshold behaves exactly the same way as eer_threshold).
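
To check these rates without bob installed, here is a minimal pure-Python sketch of the two error rates. It assumes the convention that the outputs above suggest: a negative score at or above the threshold counts as a false acceptance, and a positive score below the threshold counts as a false rejection. The name far_frr is made up for this sketch and is not part of bob.measure.

```python
def far_frr(negatives, positives, threshold):
    """Hypothetical helper, not bob.measure.farfrr itself.

    Assumed convention: a score >= threshold is accepted.
    """
    # accepted negatives are false acceptances
    far = sum(1 for s in negatives if s >= threshold) / len(negatives)
    # rejected positives are false rejections
    frr = sum(1 for s in positives if s < threshold) / len(positives)
    return far, frr


positives = [1.] * 100 + [-10000.]
negatives = [-1.] * 100 + [-10000.]

print(-4999.5, far_frr(negatives, positives, -4999.5))
print(0, far_frr(negatives, positives, 0))
```

Under that assumption, the threshold -4999.5 accepts all 100 negatives scored -1 (FAR = 100/101), while 0 accepts none of them. In both cases only the single -10000 positive is rejected (FRR = 1/101).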

I would suggest that we rethink our implementation of the threshold functions. I am pretty sure that smarter ways exist (I implemented some during my PhD...)
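
As one possible direction, here is a minimal sketch under stated assumptions. It is not what bob.measure does, and not necessarily one of the approaches mentioned above. It tries one candidate threshold in every gap between neighbouring distinct scores and keeps the one with the smallest difference between FAR and FRR. The name eer_threshold_sketch and the accept-if-score->=-threshold convention are assumptions made for this example.

```python
def eer_threshold_sketch(negatives, positives):
    """Hypothetical alternative, not the bob.measure implementation.

    Returns the candidate threshold with the smallest |FAR - FRR|.
    """
    scores = sorted(set(negatives) | set(positives))
    # one candidate below all scores, one in each gap, one above all scores
    candidates = [scores[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(scores, scores[1:])]
    candidates.append(scores[-1] + 1.0)

    def gap(t):
        # assumed convention: a score >= t is accepted
        far = sum(s >= t for s in negatives) / len(negatives)
        frr = sum(s < t for s in positives) / len(positives)
        return abs(far - frr)

    # ties are resolved in favour of the lowest candidate
    return min(candidates, key=gap)


positives = [1.] * 100 + [-10000.]
negatives = [-1.] * 100 + [-10000.]

print(eer_threshold_sketch(negatives, positives))  # prints 0.0
```

For the example scores the candidates are -10001, -5000.5, 0 and 2, and only 0 separates the two bulk score groups, so a single outlier no longer drags the threshold away. The sketch rescans all scores for every candidate, so it is meant to illustrate the idea, not to be efficient.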

Cheers, Manuel