FAR and FRR thresholds are computed even when there is no data support
I recently came across a situation where FAR (and FRR) thresholds were computed although they should not have been.
Imagine the negative score distribution [0.5, 0.6, 0.7, 0.8, 0.9, 1., 1., 1., 1., 1.], and suppose a threshold is to be computed for FAR=0.1. Our current implementation of bob.measure.far_threshold will return the threshold 1. However, this threshold does not give us a false acceptance rate of 0.1, but of 0.5, because five of the ten negatives are equal to 1. In fact, there is no (data-driven) threshold that would provide a false acceptance rate of 0.1: any threshold above 1 accepts no negatives at all.
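To make the numbers above concrete, here is a minimal sketch in plain Python. The helper `far` is hypothetical (it is not part of bob.measure), and it assumes that a negative counts as falsely accepted when its score is greater than or equal to the threshold.

```python
def far(negatives, threshold):
    """Fraction of negative scores accepted, assuming accept means score >= threshold."""
    return sum(score >= threshold for score in negatives) / len(negatives)


negatives = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]

# Five of the ten negatives are tied at 1.0, so a threshold of 1.0 accepts half of them.
far_at_one = far(negatives, 1.0)

# Any threshold above the largest negative accepts none of them.
far_above_one = far(negatives, 1.0 + 1e-9)

print(far_at_one, far_above_one)
```

Under that assumption, the FAR jumps directly from 0.5 to 0 at the tie, so no threshold lands on 0.1.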
A similar issue arises when the number of data points is not sufficient for a threshold to be computed for the requested value.
From only 10 data points, you cannot provide a (data-driven) threshold for FAR=0.05, because the smallest non-zero false acceptance rate that 10 negatives can produce is 0.1. Our current implementation happily provides one anyway.
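The resolution limit can be illustrated by enumerating every FAR that a given set of negatives can actually produce. This is only a sketch: `attainable_fars` is a hypothetical helper, and it again assumes that scores greater than or equal to the threshold are accepted.

```python
def attainable_fars(negatives):
    """All FAR values reachable by some threshold, assuming accept means score >= threshold."""
    n = len(negatives)
    # Only the observed scores can change the count of accepted negatives.
    rates = {sum(s >= t for s in negatives) / n for t in set(negatives)}
    # A threshold above every negative always yields a FAR of 0.
    rates.add(0.0)
    return sorted(rates)


# Ten distinct negative scores: 0.0, 0.1, ..., 0.9
ten_negatives = [i / 10 for i in range(10)]
rates = attainable_fars(ten_negatives)
smallest_nonzero = min(r for r in rates if r > 0)

print(rates)
print(smallest_nonzero)
```

With ten distinct scores the attainable rates are the multiples of 0.1, so a request for FAR=0.05 falls between two of them; ties in the scores only make the set coarser.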
There are two possible solutions for this issue.
First, we can simply return a threshold that is just slightly higher than the largest negative (or slightly lower than the smallest positive when computing the FRR threshold). This will indeed provide a solution, but such a threshold is not justified by any data point and might be arbitrarily wrong, e.g., when applied to other test data.
Instead, we should just return NaN, since we really cannot compute a justified threshold for the requested FAR or FRR values.
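A rough sketch of what the NaN-returning behaviour could look like is given below. It is not the bob.measure implementation: the function name `far_threshold_or_nan` is hypothetical, only observed negative scores are considered as candidate thresholds, the requested FAR is treated as an upper bound, and a score is assumed to be accepted when it is greater than or equal to the threshold.

```python
import math


def far_threshold_or_nan(negatives, far_value):
    """Lowest observed score whose FAR does not exceed far_value, or NaN if none exists.

    Assumptions (not taken from bob.measure): accept means score >= threshold,
    and only observed negative scores are valid, data-driven thresholds.
    """
    n = len(negatives)
    # Walk the candidates upwards; the FAR can only decrease as the threshold grows.
    for threshold in sorted(set(negatives)):
        rate = sum(s >= threshold for s in negatives) / n
        if rate <= far_value:
            return threshold
    # No observed score supports the requested rate.
    return math.nan


tied = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]
distinct = [i / 10 for i in range(10)]

print(far_threshold_or_nan(tied, 0.1))       # not supported by the data
print(far_threshold_or_nan(tied, 0.5))       # supported by the tie at 1.0
print(far_threshold_or_nan(distinct, 0.05))  # too few data points
print(far_threshold_or_nan(distinct, 0.1))   # supported by the largest score
```

Callers would then need to check the result with `math.isnan` before using it, which makes the missing data support explicit instead of hiding it behind an arbitrary threshold.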