bob.measure issues
https://gitlab.idiap.ch/bob/bob.measure/-/issues
2018-02-19T18:37:26Z

https://gitlab.idiap.ch/bob/bob.measure/-/issues/27
FAR and FRR thresholds are computed even when there is no data support
2018-02-19T18:37:26Z
Manuel Günther
siebenkopf@googlemail.com

I have lately come across a situation where FAR (and FRR) thresholds were computed, although they should not have been.
Imagine the negative score distribution `[0.5, 0.6, 0.7, 0.8, 0.9, 1., 1., 1., 1., 1.]`. A threshold should now be computed for `FAR=0.1`. Our current implementation of `bob.measure.far_threshold` will return the threshold `1`. However, this threshold does not give us a false acceptance rate of `0.1`, but of `0.5`. In fact, there is no (data-driven) threshold that would provide a false acceptance rate of `0.1`.
A similar issue arises when the number of data points is not sufficient for a given threshold to be computed.
With only 10 data points, the achievable false acceptance rates are multiples of `0.1`, so you cannot provide a (data-driven) threshold for `FAR=0.05`, while our current implementation happily provides one.
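To make the example above concrete, here is a minimal plain-NumPy sketch that lists every false acceptance rate that the 10 negative scores can actually produce. The helper `false_acceptance_rate` is hypothetical and not part of `bob.measure`; it assumes that a score is accepted when it is greater than or equal to the threshold.

```python
import numpy

def false_acceptance_rate(negatives, threshold):
    """Fraction of negative scores accepted, assuming accept means score >= threshold."""
    negatives = numpy.asarray(negatives, dtype=float)
    return float(numpy.mean(negatives >= threshold))

negatives = [0.5, 0.6, 0.7, 0.8, 0.9, 1., 1., 1., 1., 1.]

# Candidate thresholds: every distinct score, plus one value just above the maximum
candidates = sorted(set(negatives)) + [numpy.nextafter(max(negatives), numpy.inf)]

# Map each candidate threshold to the FAR it produces on this data
achievable = {t: false_acceptance_rate(negatives, t) for t in candidates}

for t, far in achievable.items():
    print("threshold %.17g -> FAR %.1f" % (t, far))
```

Under that assumption, the threshold `1` yields a FAR of `0.5`, and the next possible value is `0` (everything rejected); neither `0.1` nor `0.05` appears among the achievable rates.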
There are two possible solutions for this issue.
First, we can simply return a threshold that is *just slightly higher* than the largest negative (or slightly lower than the smallest positive when computing the FRR threshold). This will indeed provide a solution, but it is not justified by the data and might be arbitrarily wrong, for example when applied to other test data.
Instead, we should just return `NaN`, since we really cannot compute a justified threshold for the requested FAR or FRR values.

May 2017 Hackathon
Amir MOHAMMADI

https://gitlab.idiap.ch/bob/bob.measure/-/issues/26
ROC and DET plots are calculated incorrectly sometimes
2018-01-16T16:56:34Z
Amir MOHAMMADI

Following the discussion here: https://groups.google.com/forum/#!topic/bob-devel/EIp1nvw5-vQ
Looks like we have a corner case where the scores have a very large peak in their distribution:
![hist_data](/uploads/ee0071522224130f9fa7be7b8508f2e4/hist_data.png)
The scores are also available: [fusion_all_200_datatset.npy](/uploads/fa88359dfdb4d66cdbc54f3e4a677149/fusion_all_200_datatset.npy)
To load them:
```python
>>> import numpy
>>> scores = numpy.load('fusion_all_200_datatset.npy')
>>> positives = scores[0]
>>> negatives = scores[1]
>>> # The negatives are mostly 0
>>> sum(negatives < 9e-16)
51911
>>> sum(negatives < 9e-17)
51525
>>> sum(negatives < 9e-18)
51029
>>> sum(negatives < 9e-20)
49675
>>> sum(negatives < 9e-22)
47543
>>> sum(negatives < 9e-30)
27487
>>> sum(negatives < 9e-60)
0
```
@tiago.pereira may be able to provide a smaller set of scores to debug this.
When I calculate the EER, I get `2.7%`:
```python
>>> import numpy
>>> import bob.measure
>>> from matplotlib import pyplot
>>> scores = numpy.load('fusion_all_200_datatset.npy')
>>> positives = scores[0]
>>> negatives = scores[1]
>>> threshold = bob.measure.eer_threshold(negatives, positives)
>>> FAR, FRR = bob.measure.farfrr(negatives, positives, threshold)
>>> FAR, FRR
(0.02762483029114497, 0.027626084163186636)
>>> negatives.mean(), negatives.std(), positives.mean(), positives.std()
(3.0290521114232657e-06, 0.00067550514259906739, 0.20795613959959214, 0.32512541269459205)
>>> 100*(FAR+FRR)/2
2.76254572271658
>>> # npoints (the number of points on each curve) is not defined in this report
>>> bob.measure.plot.roc(negatives, positives, npoints)
[<matplotlib.lines.Line2D object at 0x7f472a8bed10>]
>>> det_lines = bob.measure.plot.det(negatives, positives, npoints, color=(0,0,0), linestyle='-', label='test')
>>> bob.measure.plot.det_axis([0.1, 80, 0.1, 80])
[-3.090232246772911, 0.8416212348748217, -3.090232246772911, 0.8416212348748217]
>>> # plot the EER point (FAR on the x axis, FRR on the y axis) on the DET curve
>>> pyplot.plot(bob.measure.ppndf(FAR), bob.measure.ppndf(FRR), 'ro')
[<matplotlib.lines.Line2D object at 0x7f472a4dc510>]
```
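The EER above can also be cross-checked without `bob.measure`. The following is only a sketch under stated assumptions: `far_frr` and `brute_force_eer` are hypothetical helpers, scores are assumed to be accepted when they are greater than or equal to the threshold, and the toy arrays are made up for illustration (they are not the scores from the attached file). The scan is quadratic in the number of scores, so it is only practical on a subsample of a large score set.

```python
import numpy

def far_frr(negatives, positives, threshold):
    """FAR: share of negatives >= threshold; FRR: share of positives < threshold."""
    far = float(numpy.mean(numpy.asarray(negatives) >= threshold))
    frr = float(numpy.mean(numpy.asarray(positives) < threshold))
    return far, frr

def brute_force_eer(negatives, positives):
    """Scan every observed score as a threshold; return the one where FAR and FRR are closest."""
    candidates = numpy.unique(numpy.concatenate([negatives, positives]))
    rates = numpy.array([far_frr(negatives, positives, t) for t in candidates])
    best = numpy.argmin(numpy.abs(rates[:, 0] - rates[:, 1]))
    return float(candidates[best]), float(rates[best, 0]), float(rates[best, 1])

# Toy data for illustration only
negatives = numpy.array([0.0, 0.0, 0.1, 0.2, 0.6])
positives = numpy.array([0.3, 0.5, 0.7, 0.8, 0.9])

threshold, far, frr = brute_force_eer(negatives, positives)
print(threshold, far, frr)
```

If a curve drawn from the same scores crosses the FAR = FRR diagonal far away from the rates found this way, the discrepancy most likely comes from how the curve points are sampled rather than from the scores themselves.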
But when I plot the ROC and DET curves, I get curves with EER of `6%` or more than `20%`:
DET CURVE with around `6%` EER:
![wrong_det](/uploads/6c8c93c1c994f0c256ca1b90087a1484/wrong_det.png)
ROC CURVE with more than `20%` EER:
![wrong_roc](/uploads/4730c633a9c7a6ae0ef0d56227e79e41/wrong_roc.png)

May 2017 Hackathon
Amir MOHAMMADI