.. vim: set fileencoding=utf-8 :
.. Andre Anjos
.. Tue 15 Oct 17:41:52 2013

.. testsetup:: *

  import numpy
  positives = numpy.random.normal(1,1,100)
  negatives = numpy.random.normal(-1,1,100)
  import matplotlib
  if not hasattr(matplotlib, 'backends'):
    matplotlib.use('pdf') #non-interactive avoids exception on display
  import bob.measure

============
 User Guide
============

Methods in the :py:mod:`bob.measure` module can help you to quickly and easily
evaluate error for multi-class or binary classification problems. If you are
not yet familiar with aspects of performance evaluation, we recommend the
following papers and book chapters for an overview of some of the implemented
methods.

* Bengio, S., Keller, M., Mariéthoz, J. (2004). `The Expected Performance
  Curve`_. International Conference on Machine Learning ICML Workshop on ROC
  Analysis in Machine Learning, 136(1), 1963–1966.
* Martin, A., Doddington, G., Kamm, T., Ordowski, M., & Przybocki, M. (1997).
  `The DET curve in assessment of detection task performance`_. Fifth European
  Conference on Speech Communication and Technology (pp. 1895-1898).
* Li, S., Jain, A.K. (2005), Handbook of Face Recognition, Chapter 14,
  Springer


Overview
--------

A classifier is subject to two types of errors: either the real access/signal
is rejected (false rejection) or an impostor attack/a false access is accepted
(false acceptance).
A possible way to measure the detection performance is to use the Half Total
Error Rate (HTER), which combines the False Rejection Rate (FRR) and the False
Acceptance Rate (FAR) and is defined in the following formula:

.. math::

  HTER(\tau, \mathcal{D}) = \frac{FAR(\tau, \mathcal{D}) + FRR(\tau, \mathcal{D})}{2} \quad \textrm{[\%]}

where :math:`\mathcal{D}` denotes the dataset used. Since both the FAR and the
FRR depend on the threshold :math:`\tau`, they are strongly related to each
other: increasing the FAR will reduce the FRR and vice-versa. For this reason,
results are often presented using either a Receiver Operating Characteristic
(ROC) or a Detection-Error Tradeoff (DET) plot. These two plots basically
present the FAR versus the FRR for different values of the threshold. Another
widely used measure to summarise the performance of a system is the Equal Error
Rate (EER), defined as the point along the ROC or DET curve where the FAR
equals the FRR.

However, it was noted by Bengio et al. (2004) that ROC and DET curves may be
misleading when comparing systems. Hence, the so-called Expected Performance
Curve (EPC) was proposed and consists of an unbiased estimate of the reachable
performance of a system at various operating points. Indeed, in real-world
scenarios, the threshold :math:`\tau` has to be set a priori: this is
typically done using a development set (also called cross-validation set).
Nevertheless, the optimal threshold can be different depending on the relative
importance given to the FAR and the FRR. Hence, in the EPC framework, the cost
:math:`\beta \in [0;1]` is defined as the trade-off between the FAR and FRR.
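
To make the FAR, FRR and HTER definitions above concrete, here is a minimal
NumPy-only sketch. The helper names ``far_frr`` and ``hter`` are hypothetical
and are not part of :py:mod:`bob.measure`. The sketch assumes that a score is
accepted when it is greater than or equal to the threshold, and it returns
fractions in [0, 1] rather than percentages.

```python
import numpy as np


def far_frr(negatives, positives, threshold):
    """Return (FAR, FRR) as fractions in [0, 1] for one threshold.

    Assumption: a score is accepted when score >= threshold. A real
    library may treat ties differently.
    """
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    far = float(np.mean(negatives >= threshold))  # impostors wrongly accepted
    frr = float(np.mean(positives < threshold))   # genuine accesses wrongly rejected
    return far, frr


def hter(negatives, positives, threshold):
    """Half Total Error Rate: the plain average of FAR and FRR."""
    far, frr = far_frr(negatives, positives, threshold)
    return (far + frr) / 2.0


# Toy scores, made up for illustration only.
negatives = [-2.0, -1.0, 0.5, -0.5]
positives = [1.0, 2.0, -0.2, 0.8]

far_low, frr_low = far_frr(negatives, positives, 0.0)
far_high, frr_high = far_frr(negatives, positives, 1.5)
hter_low = hter(negatives, positives, 0.0)
```

Raising the threshold from 0.0 to 1.5 in this toy example lowers the FAR and
raises the FRR, which is the trade-off described above.
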
The optimal threshold :math:`\tau^*` is then computed using different values of
:math:`\beta`, corresponding to different operating points:

.. math::

  \tau^{*} = \arg\!\min_{\tau} \quad \beta \cdot \textrm{FAR}(\tau, \mathcal{D}_{d}) + (1-\beta) \cdot \textrm{FRR}(\tau, \mathcal{D}_{d})

where :math:`\mathcal{D}_{d}` denotes the development set and should be
completely separate from the evaluation set :math:`\mathcal{D}`.

Performance for different values of :math:`\beta` is then computed on the test
set :math:`\mathcal{D}_{t}` using the previously derived threshold. Note that
setting :math:`\beta` to 0.5 yields the Half Total Error Rate (HTER) as
defined in the first equation.

.. note::

  Most of the methods available in this module require as input a set of 2
  :py:class:`numpy.ndarray` objects that contain the scores obtained by the
  classification system to be evaluated, without specific order. Most of the
  methods are defined to deal with two-class problems. Therefore, in this
  setting, and throughout this manual, we have defined that the **negatives**
  represent the impostor attacks or false class accesses (that is, when a
  sample of class A is given to the classifier of another class, such as class
  B). The second set, referred to as the **positives**, represents the true
  class accesses or signal response of the classifier.
  The vectors are called this way because the procedures implemented in this
  module expect the scores of **negatives** to be statistically distributed
  to the left of the signal scores (the **positives**). If that is not the
  case, one should either invert the input to the methods or multiply all
  scores available by -1, in order to have them inverted.

  The input to create these two vectors is generated by experiments conducted
  by the user and normally sits in files that may need some parsing before
  these vectors can be extracted.

  In the remainder of this section we assume you have successfully parsed and
  loaded your scores in two 1D float64 vectors and are ready to evaluate the
  performance of the classifier.

Verification
------------

To count the number of correctly classified positives and negatives you can use
the following techniques:

.. doctest::

  >>> # negatives, positives = parse_my_scores(...) # write parser if not provided!
  >>> T = 0.0 #Threshold: later we explain how one can calculate these
  >>> correct_negatives = bob.measure.correctly_classified_negatives(negatives, T)
  >>> FAR = 1 - (float(correct_negatives.sum())/negatives.size)
  >>> correct_positives = bob.measure.correctly_classified_positives(positives, T)
  >>> FRR = 1 - (float(correct_positives.sum())/positives.size)

We do provide a method to calculate the FAR and FRR in a single shot:

.. doctest::

  >>> FAR, FRR = bob.measure.farfrr(negatives, positives, T)

The threshold ``T`` is normally calculated by looking at the distribution of
negatives and positives in a development (or validation) set, selecting a
threshold that matches a certain criterion and applying this derived threshold
to the test (or evaluation) set. This technique gives a better overview of the
generalization of a method. We implement different techniques for the
calculation of the threshold:

* Threshold for the EER

  .. doctest::

    >>> T = bob.measure.eer_threshold(negatives, positives)

* Threshold for the minimum HTER

  .. doctest::

    >>> T = bob.measure.min_hter_threshold(negatives, positives)

* Threshold for the minimum weighted error rate (MWER) given a certain cost
  :math:`\beta`.

  .. doctest:: python

     >>> cost = 0.3 #or "beta"
     >>> T = bob.measure.min_weighted_error_rate_threshold(negatives, positives, cost)

  .. note::

     Setting ``cost`` to 0.5 is equivalent to using
     :py:func:`bob.measure.min_hter_threshold`.

.. note::

  Many functions in ``bob.measure`` have an ``is_sorted`` parameter, which
  defaults to ``False`` throughout. However, these functions need sorted
  ``positive`` and/or ``negative`` scores. If the scores are not sorted in
  ascending order, they will be copied internally -- twice! To avoid these
  copies, you might want to sort the scores in ascending order yourself, e.g.,
  by:

  .. doctest:: python

     >>> negatives.sort()
     >>> positives.sort()
     >>> t = bob.measure.min_weighted_error_rate_threshold(negatives, positives, cost, is_sorted = True)
     >>> assert T == t

Identification
--------------

For identification, the Recognition Rate is one of the standard measures. To
compute recognition rates, you can use the
:py:func:`bob.measure.recognition_rate` function. This function expects a
relatively complex data structure, which is the same as for the CMC_ below.
For each probe item, the scores for negative and positive comparisons are
computed, and collected for all probe items:

.. doctest::

  >>> rr_scores = []
  >>> for probe in range(10):
  ...   pos = numpy.random.normal(1, 1, 1)
  ...   neg = numpy.random.normal(0, 1, 19)
  ...   rr_scores.append((neg, pos))
  >>> rr = bob.measure.recognition_rate(rr_scores, rank=1)

For open set identification, according to Li and Jain (2005) there are two
different error measures defined. The first measure is the
:py:func:`bob.measure.detection_identification_rate`, which counts the number
of correctly classified in-gallery probe items. The second measure is the
:py:func:`bob.measure.false_alarm_rate`, which counts how often an
out-of-gallery probe item was incorrectly accepted.
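
To illustrate the two open-set measures just described, the following
NumPy-only sketch computes both at rank 1 from a list of
``(negatives, positives)`` pairs. The function name ``open_set_rates`` is
hypothetical. The acceptance rule (``score >= threshold``) and the handling of
ties are assumptions of this sketch and may not match what
:py:mod:`bob.measure` does internally.

```python
import numpy as np


def open_set_rates(scores, threshold):
    """Return (detection & identification rate, false alarm rate) at rank 1.

    ``scores`` is a list of (negatives, positives) pairs, one per probe.
    Probes whose positives are None or empty count as out-of-gallery.
    Assumptions: a score is accepted when score >= threshold, and a probe
    is identified at rank 1 when its best positive is >= every negative.
    """
    in_gallery = correct = out_gallery = alarms = 0
    for neg, pos in scores:
        neg = np.asarray(neg, dtype=float)
        if pos is None or len(pos) == 0:
            # Out-of-gallery probe: any accepted score is a false alarm.
            out_gallery += 1
            if neg.size and neg.max() >= threshold:
                alarms += 1
        else:
            # In-gallery probe: must be accepted AND ranked first.
            in_gallery += 1
            best = float(np.max(pos))
            if best >= threshold and (neg.size == 0 or best >= neg.max()):
                correct += 1
    if in_gallery == 0 or out_gallery == 0:
        raise ValueError("need both in-gallery and out-of-gallery probes")
    return correct / in_gallery, alarms / out_gallery


# Toy scores, made up for illustration only.
scores = [
    ([0.1, 0.3], [0.9]),  # accepted and ranked first: correct
    ([0.8, 0.2], [0.6]),  # accepted, but a negative scores higher
    ([0.1, 0.2], [0.4]),  # best positive is below the threshold
    ([0.1, 0.7], None),   # out-of-gallery, accepted: false alarm
    ([0.0, 0.2], None),   # out-of-gallery, correctly rejected
]
di_rate, fa_rate = open_set_rates(scores, threshold=0.5)
```
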
Both rates can be computed using the same data structure, with one exception.
Both functions require that at least one probe item exists which has no
corresponding gallery item, i.e., where the positives are empty or ``None``:

(continued from above...)

.. doctest::

  >>> for probe in range(10):
  ...   pos = None
  ...   neg = numpy.random.normal(-2, 1, 10)
  ...   rr_scores.append((neg, pos))
  >>> dir = bob.measure.detection_identification_rate(rr_scores, threshold = 0, rank=1)
  >>> far = bob.measure.false_alarm_rate(rr_scores, threshold = 0)

Confidence interval
-------------------

A confidence interval for parameter :math:`x` consists of a lower estimate
:math:`L`, and an upper estimate :math:`U`, such that the probability of the
true value being within the interval estimate is equal to :math:`\alpha`. For
example, a 95% confidence interval (i.e. :math:`\alpha = 0.95`) for a
parameter :math:`x` is given by :math:`[L, U]` such that
:math:`Prob(x \in [L,U]) = 95\%`. The smaller the test size, the wider the
confidence interval will be, and the greater :math:`\alpha`, the wider the
confidence interval will be.

`The Clopper-Pearson interval`_, a common method for calculating confidence
intervals, is a function of the number of successes, the number of trials and
the confidence value :math:`\alpha`. It is available as
:py:func:`bob.measure.utils.confidence_for_indicator_variable`. It is based on
the cumulative probabilities of the binomial distribution. This method is
quite conservative, meaning that the true coverage rate of a 95%
Clopper–Pearson interval may be well above 95%.

Plotting
--------

An image is worth 1000 words, they say. You can combine the capabilities of
`Matplotlib`_ with |project| to plot a number of curves.
However, you must have that package installed. In this section we describe a
few recipes.

ROC
===

The Receiver Operating Characteristic (ROC) curve is one of the oldest plots in
town. To plot an ROC curve, in possession of your **negatives** and
**positives**, just do something along the lines of:

.. doctest::

  >>> from matplotlib import pyplot
  >>> # we assume you have your negatives and positives already split
  >>> npoints = 100
  >>> bob.measure.plot.roc(negatives, positives, npoints, color=(0,0,0), linestyle='-', label='test') # doctest: +SKIP
  >>> pyplot.xlabel('FAR (%)') # doctest: +SKIP
  >>> pyplot.ylabel('FRR (%)') # doctest: +SKIP
  >>> pyplot.grid(True)
  >>> pyplot.show() # doctest: +SKIP

You should see an image like the following one:

.. plot::

  import numpy
  numpy.random.seed(42)
  import bob.measure
  from matplotlib import pyplot

  positives = numpy.random.normal(1,1,100)
  negatives = numpy.random.normal(-1,1,100)
  npoints = 100
  bob.measure.plot.roc(negatives, positives, npoints, color=(0,0,0), linestyle='-', label='test')
  pyplot.grid(True)
  pyplot.xlabel('FAR (%)')
  pyplot.ylabel('FRR (%)')
  pyplot.title('ROC')

As can be observed, plotting methods live in the namespace
:py:mod:`bob.measure.plot`. They work like the
:py:func:`matplotlib.pyplot.plot` itself, except that instead of receiving the
x and y point coordinates as parameters, they receive the two
:py:class:`numpy.ndarray` arrays with negatives and positives, as well as an
indication of the number of points the curve must contain.

As in the :py:func:`matplotlib.pyplot.plot` command, you can pass optional
parameters for the line as shown in the example to set up its color, shape and
even the label. For an overview of the keywords accepted, please refer to the
`Matplotlib`_ documentation. Other plot properties such as the plot title,
axis labels, grids and legends should be controlled directly using the
relevant `Matplotlib`_ controls.

DET
===

A DET curve can be drawn using commands similar to the ones for the ROC curve:

.. doctest::

  >>> from matplotlib import pyplot
  >>> # we assume you have your negatives and positives already split
  >>> npoints = 100
  >>> bob.measure.plot.det(negatives, positives, npoints, color=(0,0,0), linestyle='-', label='test') # doctest: +SKIP
  >>> bob.measure.plot.det_axis([0.01, 40, 0.01, 40]) # doctest: +SKIP
  >>> pyplot.xlabel('FAR (%)') # doctest: +SKIP
  >>> pyplot.ylabel('FRR (%)') # doctest: +SKIP
  >>> pyplot.grid(True)
  >>> pyplot.show() # doctest: +SKIP

This will produce an image like the following one:

.. plot::

  import numpy
  numpy.random.seed(42)
  import bob.measure
  from matplotlib import pyplot

  positives = numpy.random.normal(1,1,100)
  negatives = numpy.random.normal(-1,1,100)

  npoints = 100
  bob.measure.plot.det(negatives, positives, npoints, color=(0,0,0), linestyle='-', label='test')
  bob.measure.plot.det_axis([0.1, 80, 0.1, 80])
  pyplot.grid(True)
  pyplot.xlabel('FAR (%)')
  pyplot.ylabel('FRR (%)')
  pyplot.title('DET')

.. note::

  If you wish to reset axis zooming, you must use the Gaussian scale rather
  than the visual marks shown on the plot, which are just there for display
  purposes. The real axis scale is based on the
  :py:func:`bob.measure.ppndf` method.
  For example, if you wish to set the x and y axis to display data between 1%
  and 40%, here is the recipe:

  .. doctest::

    >>> #AFTER you plot the DET curve, just set the axis in this way:
    >>> pyplot.axis([bob.measure.ppndf(k/100.0) for k in (1, 40, 1, 40)]) # doctest: +SKIP

  We provide a convenient way for you to do the above in this module. So,
  optionally, you may use the ``bob.measure.plot.det_axis`` method like this:

  .. doctest::

    >>> bob.measure.plot.det_axis([1, 40, 1, 40]) # doctest: +SKIP

EPC
===

Drawing an EPC requires that both the development set negatives and positives
are provided alongside the test (or evaluation) set ones. Because of this the
API is slightly modified:

.. doctest::

  >>> bob.measure.plot.epc(dev_neg, dev_pos, test_neg, test_pos, npoints, color=(0,0,0), linestyle='-') # doctest: +SKIP
  >>> pyplot.show() # doctest: +SKIP

This will produce an image like the following one:

.. plot::

  import numpy
  numpy.random.seed(42)
  import bob.measure
  from matplotlib import pyplot

  dev_pos = numpy.random.normal(1,1,100)
  dev_neg = numpy.random.normal(-1,1,100)
  test_pos = numpy.random.normal(0.9,1,100)
  test_neg = numpy.random.normal(-1.1,1,100)
  npoints = 100
  bob.measure.plot.epc(dev_neg, dev_pos, test_neg, test_pos, npoints, color=(0,0,0), linestyle='-')
  pyplot.grid(True)
  pyplot.title('EPC')


CMC
===

The Cumulative Match Characteristics (CMC) curve estimates the probability that
the correct model is in the *N* models with the highest similarity to a given
probe. A CMC curve can be plotted using the :py:func:`bob.measure.plot.cmc`
function. The CMC can be calculated from a relatively complex data structure,
which defines a pair of positive and negative scores **per probe**:

.. plot::

  import numpy
  numpy.random.seed(42)
  import bob.measure
  from matplotlib import pyplot

  cmc_scores = []
  for probe in range(10):
    positives = numpy.random.normal(1, 1, 1)
    negatives = numpy.random.normal(0, 1, 19)
    cmc_scores.append((negatives, positives))
  bob.measure.plot.cmc(cmc_scores, logx=False)
  pyplot.grid(True)
  pyplot.title('CMC')
  pyplot.xlabel('Rank')
  pyplot.xticks([1,5,10,20])
  pyplot.xlim([1,20])
  pyplot.ylim([0,100])
  pyplot.ylabel('Probability of Recognition (%)')

Usually, there is only a single positive score per probe, but this is not a
fixed restriction.


Detection & Identification Curve
================================

The detection & identification curve is designed to evaluate open set
identification tasks. It can be plotted using the
:py:func:`bob.measure.plot.detection_identification_curve` function, but it
requires at least one open-set probe, i.e., where no corresponding positive
score exists, for which the FAR values are computed. Here, we plot the
detection and identification curve for rank 1, so that the recognition rate for
FAR=1 will be identical to the rank one :py:func:`bob.measure.recognition_rate`
obtained in the CMC plot above.

.. plot::

  import numpy
  numpy.random.seed(42)
  import bob.measure
  from matplotlib import pyplot

  cmc_scores = []
  for probe in range(1000):
    positives = numpy.random.normal(1, 1, 1)
    negatives = numpy.random.normal(0, 1, 19)
    cmc_scores.append((negatives, positives))
  for probe in range(1000):
    negatives = numpy.random.normal(-1, 1, 10)
    cmc_scores.append((negatives, None))

  bob.measure.plot.detection_identification_curve(cmc_scores, rank=1, logx=True)
  pyplot.xlabel('False Alarm Rate')
  pyplot.xlim([0.0001, 1])
  pyplot.ylabel('Detection & Identification Rate (%)')
  pyplot.ylim([0,1])


Fine-tuning
===========

The methods inside :py:mod:`bob.measure.plot` are only provided as a
`Matplotlib`_ wrapper to equivalent methods in :py:mod:`bob.measure` that can
only calculate the points without doing any plotting. You may prefer to tweak
the plotting or even use a different plotting system such as gnuplot. Have a
look at the implementations at :py:mod:`bob.measure.plot` to understand how to
use the |project| methods to compute the curves and interlace that in the way
that best suits you.

.. include:: links.rst

.. Place your references here:

.. _The Expected Performance Curve: http://publications.idiap.ch/downloads/reports/2005/bengio_2005_icml.pdf
.. _The DET curve in assessment of detection task performance: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.117.4489&rep=rep1&type=pdf
.. _The Clopper-Pearson interval: https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Clopper-Pearson_interval
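
As a rough illustration of the fine-tuning idea above, namely computing curve
points separately from plotting them, here is a NumPy-only sketch that
produces FAR/FRR pairs for a ROC curve. The function ``roc_points`` is
hypothetical and is not the :py:mod:`bob.measure` implementation. In
particular, the evenly spaced threshold grid and the ``score >= threshold``
acceptance rule are assumptions of this sketch.

```python
import numpy as np


def roc_points(negatives, positives, n_points):
    """Return a (2, n_points) array: row 0 is FAR, row 1 is FRR.

    Assumptions: thresholds are evenly spaced between the smallest and
    largest score, and a score is accepted when score >= threshold.
    """
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    lo = min(negatives.min(), positives.min())
    hi = max(negatives.max(), positives.max())
    thresholds = np.linspace(lo, hi, n_points)
    far = np.array([np.mean(negatives >= t) for t in thresholds])
    frr = np.array([np.mean(positives < t) for t in thresholds])
    return np.vstack([far, frr])


# Synthetic scores, drawn the same way as in the plotting recipes.
rng = np.random.default_rng(42)
positives = rng.normal(1, 1, 100)
negatives = rng.normal(-1, 1, 100)

curve = roc_points(negatives, positives, 100)
# curve[0] and curve[1] can now be passed to any plotting system.
```

The resulting array can be handed to ``matplotlib.pyplot.plot`` or written to
a text file for another plotting tool.
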