bob.measure

Closed
Open
Opened Mar 30, 2017 by Manuel Günther (@mguenther)

FAR and FRR thresholds are computed even when there is no data support

I recently came across a situation where FAR (and FRR) thresholds were computed although they should not have been. Consider the negative score distribution [0.5, 0.6, 0.7, 0.8, 0.9, 1., 1., 1., 1., 1.], and suppose a threshold is requested for FAR=0.1. Our current implementation of bob.measure.far_threshold returns the threshold 1. However, this threshold does not yield a false acceptance rate of 0.1, but of 0.5, since the five negatives with score 1 are all accepted. In fact, there is no (data-driven) threshold that would yield a false acceptance rate of 0.1.
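The example can be checked with a few lines of plain Python. This is only a sketch and does not call bob.measure; the helper `far` is hypothetical, and it assumes that a negative counts as a false accept when its score is at or above the threshold.

```python
negatives = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]


def far(negatives, threshold):
    """Fraction of negatives accepted, assuming 'accept' means score >= threshold."""
    return sum(score >= threshold for score in negatives) / len(negatives)


# FAR obtained when each distinct negative score is used as the threshold.
achievable = {t: far(negatives, t) for t in sorted(set(negatives))}

# Any threshold above the largest negative accepts nothing.
far_above_max = far(negatives, max(negatives) + 1e-6)

for threshold, rate in achievable.items():
    print(f"threshold={threshold}: FAR={rate}")
print(f"threshold just above {max(negatives)}: FAR={far_above_max}")
```

Under this convention the FAR drops from 0.5 at threshold 1 directly to 0 for any higher threshold, so 0.1 is never reached.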

A similar issue arises when the number of data points is insufficient to support the requested rate. With only 10 data points, the FAR can only change in steps of 0.1, so you cannot provide a (data-driven) threshold for FAR=0.05, yet our current implementation happily provides one.
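To make the resolution argument concrete, the sketch below enumerates every FAR that a set of negatives can actually produce. The helper `achievable_fars` is hypothetical, integer scores stand in for real ones, and the same "score >= threshold is accepted" convention is assumed. Exact fractions are used to avoid floating-point comparisons.

```python
from fractions import Fraction


def achievable_fars(negatives):
    """All FAR values reachable with thresholds placed at observed scores,
    plus 0 for a threshold above the largest score."""
    n = len(negatives)
    rates = {
        Fraction(sum(score >= t for score in negatives), n)
        for t in set(negatives)
    }
    rates.add(Fraction(0))  # threshold above the maximum accepts nothing
    return sorted(rates)


ten_distinct = list(range(10))  # ten distinct negative scores
rates = achievable_fars(ten_distinct)

print([str(r) for r in rates])
print(Fraction(1, 20) in rates)  # is FAR=0.05 achievable?
```

With ten distinct negatives the smallest non-zero FAR is 1/10, so a request for 0.05 falls between two achievable values.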

There are two possible solutions for this issue. First, we can simply return a threshold that is slightly higher than the largest negative (or slightly lower than the smallest positive when computing the FRR threshold). This does provide a threshold, but it is not justified by any data point and might be arbitrarily wrong, e.g., when applied to other test data.

Instead, we should just return NaN, since we really cannot compute a justified threshold for the requested FAR or FRR values.
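One possible reading of this proposal is sketched below. It is not the bob.measure implementation: the function name `far_threshold_or_nan` is hypothetical, thresholds are restricted to observed negative scores, and the rule "return the lowest such threshold whose FAR does not exceed the requested value" is an assumption made for illustration.

```python
import math


def far_threshold_or_nan(negatives, far_value):
    """Return the lowest observed score whose FAR is <= far_value, else NaN.

    Assumes a negative is accepted when its score >= threshold.
    """
    n = len(negatives)
    if n == 0:
        return math.nan
    # Ascending thresholds give non-increasing FAR, so the first hit is the lowest.
    for threshold in sorted(set(negatives)):
        rate = sum(score >= threshold for score in negatives) / n
        if rate <= far_value:
            return threshold
    # No observed score supports the requested rate.
    return math.nan


negatives = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]
print(far_threshold_or_nan(negatives, 0.5))   # supported by the data
print(far_threshold_or_nan(negatives, 0.1))   # not supported
```

Callers would then need to check the result with `math.isnan` before using it, which makes the missing data support explicit instead of silently returning a misleading threshold.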

Milestone: May 2017 Hackathon
Labels: bug
Reference: bob/bob.measure#27