Commit 9e8c5e3a authored by André Anjos

Merge branch 'opt_test' into 'master'

Make use of test set on compute_perf optional; Update docs

This MR addresses two issues:

* Users who don't have a test set can now also use `bob_compute_perf`. In this case, only a reduced set of statistics and plots is produced
* Fixes to the user guide, mainly concerning how to call the scripts after the command-line simplifications introduced with docopt

See merge request !22
parents ac41e56a e5324be0
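To illustrate the change, a minimal sketch driving the script's ``main()``
directly, the same way the unit tests below do (score file names are
placeholders):

.. code-block:: py

   from bob.measure.script.compute_perf import main

   # as before: development and test scores give the full statistics,
   # the test columns and the EPC curve
   main(['dev-scores-4col.txt', 'test-scores-4col.txt', '--output=curves.pdf'])

   # new: development scores only; test columns and the EPC plot are skipped
   main(['dev-scores-4col.txt', '--output=curves.pdf'])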
@@ -6,20 +6,21 @@
 1. Computes the threshold using either EER or min. HTER criteria on
    development set scores
-2. Applies the above threshold on test set scores to compute the HTER
+2. Applies the above threshold on test set scores to compute the HTER, if a
+   test-score set is provided
 3. Reports error rates on the console
 4. Plots ROC, EPC, DET curves and score distributions to a multi-page PDF
    file (unless --no-plot is passed)

-Usage: %(prog)s [-v...] [options] <dev-scores> <test-scores>
+Usage: %(prog)s [-v...] [options] <dev-scores> [<test-scores>]
        %(prog)s --help
        %(prog)s --version

 Arguments:
   <dev-scores>   Path to the file containing the development scores
-  <test-scores>  Path to the file containing the test scores
+  <test-scores>  (optional) Path to the file containing the test scores.

 Options:
@@ -53,15 +54,15 @@ import os
 import sys
 import numpy
-import logging
+import bob.core

-__logging_format__='[%(levelname)s] %(message)s'
-logging.basicConfig(format=__logging_format__)
-logger = logging.getLogger('bob')
+logger = bob.core.log.setup("bob.measure")

-def print_crit(dev_neg, dev_pos, test_neg, test_pos, crit):
+def print_crit(crit, dev_scores, test_scores=None):
   """Prints a single output line that contains all info for a given criterion"""

+  dev_neg, dev_pos = dev_scores
+
   if crit == 'EER':
     from .. import eer_threshold
     thres = eer_threshold(dev_neg, dev_pos)
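The hunk above replaces the hand-rolled ``logging`` setup with
``bob.core.log``, which centralizes formatting and verbosity handling. A rough
sketch of the pattern, using only the two calls that appear in this diff:

.. code-block:: py

   import bob.core

   # setup() returns a pre-configured logger for the named module
   logger = bob.core.log.setup("bob.measure")

   # the counted -v flags are later mapped onto a logging level
   # (higher values enable INFO/DEBUG output)
   bob.core.log.set_verbosity_level(logger, 2)
   logger.debug("only shown at high verbosity")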
@@ -73,46 +74,71 @@ def print_crit(dev_neg, dev_pos, test_neg, test_pos, crit):
   dev_far, dev_frr = farfrr(dev_neg, dev_pos, thres)
   dev_hter = (dev_far + dev_frr)/2.0

-  test_far, test_frr = farfrr(test_neg, test_pos, thres)
-  test_hter = (test_far + test_frr)/2.0
-
   print("[Min. criterion: %s] Threshold on Development set: %e" % (crit, thres))

   dev_ni = dev_neg.shape[0] #number of impostors
   dev_fa = int(round(dev_far*dev_ni)) #number of false accepts
   dev_nc = dev_pos.shape[0] #number of clients
   dev_fr = int(round(dev_frr*dev_nc)) #number of false rejects
-  test_ni = test_neg.shape[0] #number of impostors
-  test_fa = int(round(test_far*test_ni)) #number of false accepts
-  test_nc = test_pos.shape[0] #number of clients
-  test_fr = int(round(test_frr*test_nc)) #number of false rejects
   dev_far_str = "%.3f%% (%d/%d)" % (100*dev_far, dev_fa, dev_ni)
-  test_far_str = "%.3f%% (%d/%d)" % (100*test_far, test_fa, test_ni)
   dev_frr_str = "%.3f%% (%d/%d)" % (100*dev_frr, dev_fr, dev_nc)
-  test_frr_str = "%.3f%% (%d/%d)" % (100*test_frr, test_fr, test_nc)
   dev_max_len = max(len(dev_far_str), len(dev_frr_str))
-  test_max_len = max(len(test_far_str), len(test_frr_str))

   def fmt(s, space):
     return ('%' + ('%d' % space) + 's') % s

-  print("       | %s | %s" % (fmt("Development", -1*dev_max_len),
-    fmt("Test", -1*test_max_len)))
-  print("-------+-%s-+-%s" % (dev_max_len*"-", (2+test_max_len)*"-"))
-  print("  FAR  | %s | %s" % (fmt(dev_far_str, dev_max_len), fmt(test_far_str,
-    test_max_len)))
-  print("  FRR  | %s | %s" % (fmt(dev_frr_str, dev_max_len), fmt(test_frr_str,
-    test_max_len)))
-  dev_hter_str = "%.3f%%" % (100*dev_hter)
-  test_hter_str = "%.3f%%" % (100*test_hter)
-  print("  HTER | %s | %s" % (fmt(dev_hter_str, -1*dev_max_len),
-    fmt(test_hter_str, -1*test_max_len)))
+  if test_scores is None:
+
+    # prints only dev performance rates
+    print("       | %s" % fmt("Development", -1*dev_max_len))
+    print("-------+-%s" % (dev_max_len*"-"))
+    print("  FAR  | %s" % fmt(dev_far_str, dev_max_len))
+    print("  FRR  | %s" % fmt(dev_frr_str, dev_max_len))
+    dev_hter_str = "%.3f%%" % (100*dev_hter)
+    print("  HTER | %s" % fmt(dev_hter_str, -1*dev_max_len))
+
+  else:
+
+    # computes statistics for the test set based on the threshold a priori
+    test_neg, test_pos = test_scores
+    test_far, test_frr = farfrr(test_neg, test_pos, thres)
+    test_hter = (test_far + test_frr)/2.0
+    test_ni = test_neg.shape[0] #number of impostors
+    test_fa = int(round(test_far*test_ni)) #number of false accepts
+    test_nc = test_pos.shape[0] #number of clients
+    test_fr = int(round(test_frr*test_nc)) #number of false rejects
+    test_far_str = "%.3f%% (%d/%d)" % (100*test_far, test_fa, test_ni)
+    test_frr_str = "%.3f%% (%d/%d)" % (100*test_frr, test_fr, test_nc)
+    test_max_len = max(len(test_far_str), len(test_frr_str))
+
+    # prints both dev and test performance rates
+    print("       | %s | %s" % (fmt("Development", -1*dev_max_len),
+      fmt("Test", -1*test_max_len)))
+    print("-------+-%s-+-%s" % (dev_max_len*"-", (2+test_max_len)*"-"))
+    print("  FAR  | %s | %s" % (fmt(dev_far_str, dev_max_len),
+      fmt(test_far_str, test_max_len)))
+    print("  FRR  | %s | %s" % (fmt(dev_frr_str, dev_max_len),
+      fmt(test_frr_str, test_max_len)))
+    dev_hter_str = "%.3f%%" % (100*dev_hter)
+    test_hter_str = "%.3f%%" % (100*test_hter)
+    print("  HTER | %s | %s" % (fmt(dev_hter_str, -1*dev_max_len),
+      fmt(test_hter_str, -1*test_max_len)))
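To make the bookkeeping above concrete, here is a toy computation of the
quantities ``print_crit`` reports, using the same ``bob.measure`` functions
the script imports (all score values are invented):

.. code-block:: py

   import numpy
   from bob.measure import eer_threshold, farfrr

   neg = numpy.array([-2.0, -1.5, -1.0, 0.1])  # impostor (negative) scores
   pos = numpy.array([-0.3, 0.5, 0.9, 1.4])    # genuine (positive) scores

   thres = eer_threshold(neg, pos)     # threshold fixed on the dev set
   far, frr = farfrr(neg, pos, thres)  # error rates at that threshold
   hter = (far + frr) / 2.0            # same formula as in the diff

   fa = int(round(far * neg.shape[0])) # number of false accepts
   fr = int(round(frr * pos.shape[0])) # number of false rejects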
-def plots(dev_neg, dev_pos, test_neg, test_pos, crit, points, filename):
+def plots(crit, points, filename, dev_scores, test_scores=None):
   """Saves ROC, DET and EPC curves on the file pointed out by filename."""

+  dev_neg, dev_pos = dev_scores
+  if test_scores is not None:
+    test_neg, test_pos = test_scores
+  else:
+    test_neg, test_pos = None, None
+
   from .. import plot
   import matplotlib
@@ -124,41 +150,54 @@ def plots(dev_neg, dev_pos, test_neg, test_pos, crit, points, filename):

   # ROC
   fig = mpl.figure()
-  plot.roc(dev_neg, dev_pos, points, color=(0.3,0.3,0.3),
-      linestyle='--', dashes=(6,2), label='development')
-  plot.roc(test_neg, test_pos, points, color=(0,0,0),
-      linestyle='-', label='test')
+
+  if test_scores is not None:
+    plot.roc(dev_neg, dev_pos, points, color=(0.3,0.3,0.3),
+        linestyle='--', dashes=(6,2), label='development')
+    plot.roc(test_neg, test_pos, points, color=(0,0,0),
+        linestyle='-', label='test')
+  else:
+    plot.roc(dev_neg, dev_pos, points, color=(0,0,0),
+        linestyle='-', label='development')
+
   mpl.axis([0,40,0,40])
   mpl.title("ROC Curve")
   mpl.xlabel('FAR (%)')
   mpl.ylabel('FRR (%)')
   mpl.grid(True, color=(0.3,0.3,0.3))
-  mpl.legend()
+  if test_scores is not None: mpl.legend()
   pp.savefig(fig)

   # DET
   fig = mpl.figure()
-  plot.det(dev_neg, dev_pos, points, color=(0.3,0.3,0.3),
-      linestyle='--', dashes=(6,2), label='development')
-  plot.det(test_neg, test_pos, points, color=(0,0,0),
-      linestyle='-', label='test')
+
+  if test_scores is not None:
+    plot.det(dev_neg, dev_pos, points, color=(0.3,0.3,0.3),
+        linestyle='--', dashes=(6,2), label='development')
+    plot.det(test_neg, test_pos, points, color=(0,0,0),
+        linestyle='-', label='test')
+  else:
+    plot.det(dev_neg, dev_pos, points, color=(0,0,0),
+        linestyle='-', label='development')
+
   plot.det_axis([0.01, 40, 0.01, 40])
   mpl.title("DET Curve")
   mpl.xlabel('FAR (%)')
   mpl.ylabel('FRR (%)')
   mpl.grid(True, color=(0.3,0.3,0.3))
-  mpl.legend()
+  if test_scores is not None: mpl.legend()
   pp.savefig(fig)

-  # EPC
-  fig = mpl.figure()
-  plot.epc(dev_neg, dev_pos, test_neg, test_pos, points,
-      color=(0,0,0), linestyle='-')
-  mpl.title('EPC Curve')
-  mpl.xlabel('Cost')
-  mpl.ylabel('Min. HTER (%)')
-  mpl.grid(True, color=(0.3,0.3,0.3))
-  pp.savefig(fig)
+  # EPC - requires test set
+  if test_scores is not None:
+    fig = mpl.figure()
+    plot.epc(dev_neg, dev_pos, test_neg, test_pos, points,
+        color=(0,0,0), linestyle='-')
+    mpl.title('EPC Curve')
+    mpl.xlabel('Cost')
+    mpl.ylabel('Min. HTER (%)')
+    mpl.grid(True, color=(0.3,0.3,0.3))
+    pp.savefig(fig)

   # Distribution for dev and test scores on the same page
   if crit == 'EER':
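The ``plot`` helpers used above come from ``bob.measure.plot`` and draw onto
the current matplotlib figure; each finished figure is then appended to the
multi-page PDF through ``pp``. A standalone sketch with synthetic scores (all
data invented; the explicit ``'pdf'`` backend is an assumption for headless
use):

.. code-block:: py

   import numpy
   import matplotlib
   matplotlib.use('pdf')  # headless backend, assumed here
   import matplotlib.pyplot as mpl
   from bob.measure import plot

   neg = numpy.random.normal(-1.0, 1.0, 500)  # synthetic impostor scores
   pos = numpy.random.normal(+1.0, 1.0, 500)  # synthetic genuine scores

   fig = mpl.figure()
   plot.roc(neg, pos, 100, color=(0,0,0), linestyle='-', label='development')
   mpl.axis([0,40,0,40])
   mpl.legend()
   fig.savefig('roc.pdf')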
@@ -168,9 +207,15 @@ def plots(dev_neg, dev_pos, test_neg, test_pos, crit, points, filename):
     from .. import min_hter_threshold
     thres = min_hter_threshold(dev_neg, dev_pos)

-  mpl.subplot(2,1,1)
+  fig = mpl.figure()
+
+  if test_scores is not None:
+    mpl.subplot(2,1,1)
+    all_scores = numpy.hstack((dev_neg, test_neg, dev_pos, test_pos))
+  else:
+    all_scores = numpy.hstack((dev_neg, dev_pos))
+
   nbins=20
-  all_scores = numpy.hstack((dev_neg, test_neg, dev_pos, test_pos))
   score_range = all_scores.min(), all_scores.max()
   mpl.hist(dev_neg, label='Impostors', normed=True, color='red', alpha=0.5,
       bins=nbins)
@@ -179,25 +224,33 @@ def plots(dev_neg, dev_pos, test_neg, test_pos, crit, points, filename):
   mpl.xlim(*score_range)
   _, _, ymax, ymin = mpl.axis()
   mpl.vlines(thres, ymin, ymax, color='black', label='EER', linestyle='dashed')
-  mpl.ylabel('Dev. Scores (normalized)')
-  ax = mpl.gca()
-  ax.axes.get_xaxis().set_ticklabels([])
-  mpl.legend(loc='upper center', ncol=3, bbox_to_anchor=(0.5, -0.01),
-      fontsize=10)
+
+  if test_scores is not None:
+    ax = mpl.gca()
+    ax.axes.get_xaxis().set_ticklabels([])
+    mpl.legend(loc='upper center', ncol=3, bbox_to_anchor=(0.5, -0.01),
+        fontsize=10)
+    mpl.ylabel('Dev. Scores (normalized)')
+  else:
+    mpl.ylabel('Normalized Count')
+    mpl.legend(loc='best', fancybox=True, framealpha=0.5)
+
   mpl.title('Score Distributions')
   mpl.grid(True, alpha=0.5)

-  mpl.subplot(2,1,2)
-  mpl.hist(test_neg, label='Impostors', normed=True, color='red', alpha=0.5,
-      bins=nbins)
-  mpl.hist(test_pos, label='Genuine', normed=True, color='blue', alpha=0.5,
-      bins=nbins)
-  mpl.ylabel('Test Scores (normalized)')
-  mpl.xlabel('Score value')
-  mpl.xlim(*score_range)
-  _, _, ymax, ymin = mpl.axis()
-  mpl.vlines(thres, ymin, ymax, color='black', label='EER', linestyle='dashed')
-  mpl.grid(True, alpha=0.5)
+  if test_scores is not None:
+    mpl.subplot(2,1,2)
+    mpl.hist(test_neg, label='Impostors', normed=True, color='red', alpha=0.5,
+        bins=nbins)
+    mpl.hist(test_pos, label='Genuine', normed=True, color='blue', alpha=0.5,
+        bins=nbins)
+    mpl.ylabel('Test Scores (normalized)')
+    mpl.xlabel('Score value')
+    mpl.xlim(*score_range)
+    _, _, ymax, ymin = mpl.axis()
+    mpl.vlines(thres, ymin, ymax, color='black', label='EER',
+        linestyle='dashed')
+    mpl.grid(True, alpha=0.5)

   pp.savefig(fig)
   pp.close()
@@ -225,8 +278,8 @@ def main(user_input=None):
   )

   # Sets-up logging
-  if args['--verbose'] == 1: logging.getLogger().setLevel(logging.INFO)
-  elif args['--verbose'] >= 2: logging.getLogger().setLevel(logging.DEBUG)
+  verbosity = int(args['--verbose'])
+  bob.core.log.set_verbosity_level(logger, verbosity)

   # Checks number of points option
   try:
@@ -240,15 +293,18 @@ def main(user_input=None):
       'than zero')

   from ..load import load_score, get_negatives_positives
-  dev_neg, dev_pos = get_negatives_positives(load_score(args['<dev-scores>']))
-  test_neg, test_pos = get_negatives_positives(load_score(args['<test-scores>']))
+  dev_scores = get_negatives_positives(load_score(args['<dev-scores>']))
+
+  if args['<test-scores>'] is not None:
+    test_scores = get_negatives_positives(load_score(args['<test-scores>']))
+  else:
+    test_scores = None

-  print_crit(dev_neg, dev_pos, test_neg, test_pos, 'EER')
-  print_crit(dev_neg, dev_pos, test_neg, test_pos, 'Min. HTER')
+  print_crit('EER', dev_scores, test_scores)
+  print_crit('Min. HTER', dev_scores, test_scores)
+
   if not args['--no-plot']:
-    plots(dev_neg, dev_pos, test_neg, test_pos, 'EER', args['--points'],
-        args['--output'])
+    plots('EER', args['--points'], args['--output'], dev_scores, test_scores)
     print("[Plots] Performance curves => '%s'" % args['--output'])

   return 0
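The optional positional argument works because docopt maps an omitted
``[<test-scores>]`` to ``None``, which ``main()`` then forwards as
``test_scores=None`` to both helpers. A quick standalone check (usage string
abridged):

.. code-block:: py

   from docopt import docopt

   usage = """Usage: prog [-v...] <dev-scores> [<test-scores>]"""

   args = docopt(usage, argv=['dev.txt'])
   assert args['<test-scores>'] is None          # omitted -> None

   args = docopt(usage, argv=['dev.txt', 'test.txt'])
   assert args['<test-scores>'] == 'test.txt'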
@@ -46,6 +46,22 @@ def test_compute_perf():
   nose.tools.eq_(main(cmdline), 0)

+def test_compute_perf_only_dev():
+
+  # sanity checks
+  assert os.path.exists(DEV_SCORES)
+
+  tmp_output = tempfile.NamedTemporaryFile(prefix=__name__, suffix='.pdf')
+
+  cmdline = [
+      DEV_SCORES,
+      '--output=' + tmp_output.name,
+      ]
+
+  from .script.compute_perf import main
+  nose.tools.eq_(main(cmdline), 0)
+
 def test_eval_threshold():

   # sanity checks
...
@@ -64,8 +64,8 @@ scenarios, the threshold :math:`\tau` has to be set a priori: this is typically
 done using a development set (also called cross-validation set). Nevertheless,
 the optimal threshold can be different depending on the relative importance
 given to the FAR and the FRR. Hence, in the EPC framework, the cost
-:math:`\beta \in [0;1]` is defined as the trade-off between the FAR and FRR. The
-optimal threshold :math:`\tau^*` is then computed using different values of
+:math:`\beta \in [0;1]` is defined as the trade-off between the FAR and FRR.
+The optimal threshold :math:`\tau^*` is then computed using different values of
 :math:`\beta`, corresponding to different operating points:

 .. math::
@@ -455,7 +455,7 @@ Error Rate) on a set, after setting up |project|, just do:

 .. code-block:: sh

-  $ bob_eval_threshold.py --scores=development-scores-4col.txt
+  $ bob_eval_threshold.py development-scores-4col.txt
   Threshold: -0.004787956164
   FAR : 6.731% (35/520)
   FRR : 6.667% (26/390)
@@ -470,18 +470,18 @@ application ``bob_apply_threshold.py``:

 .. code-block:: sh

-  $ bob_apply_threshold.py --scores=test-scores-4col.txt --threshold=-0.0047879
+  $ bob_apply_threshold.py -0.0047879 test-scores-4col.txt
   FAR : 2.115% (11/520)
   FRR : 7.179% (28/390)
   HTER: 4.647%

 In this case, only the error figures are presented. You can conduct the
-evaluation and plotting of development and test set data using our combined
+evaluation and plotting of development (and test set data) using our combined
 ``bob_compute_perf.py`` script. You pass both sets and it does the rest:

 .. code-block:: sh

-  $ bob_compute_perf.py --devel=development-scores-4col.txt --test=test-scores-4col.txt
+  $ bob_compute_perf.py development-scores-4col.txt test-scores-4col.txt
   [Min. criterion: EER] Threshold on Development set: -4.787956e-03

        | Development     | Test
 -------+-----------------+------------------
@@ -499,8 +499,8 @@ evaluation and plotting of development and test set data using our combined
 Inside that script we evaluate 2 different thresholds based on the EER and the
 minimum HTER on the development set and apply the output to the test set. As
 can be seen from the toy-example above, the system generalizes reasonably well.
-A single PDF file is generated containing an EPC as well as ROC and DET plots of such a
-system.
+A single PDF file is generated containing an EPC as well as ROC and DET plots
+of such a system.

 Use the ``--help`` option on the above-cited scripts to find out about more
 options.
@@ -509,71 +509,105 @@ options.

Score file conversion
---------------------

Sometimes, it is required to export the score files generated by Bob to a
different format, e.g., to be able to generate a plot comparing Bob's systems
with other systems. In this package, we provide source code to convert between
different types of score files.

Bob to OpenBR
=============

One of the supported formats is the matrix format that the National Institute
of Standards and Technology (NIST) uses, and which is supported by OpenBR_.
The scores are stored in two binary matrices, where the first matrix (usually
with a ``.mtx`` filename extension) contains the raw scores, while a second
mask matrix (extension ``.mask``) encodes which scores are positives and which
are negatives.

To convert from Bob's four column or five column score file to a pair of these
matrices, you can use the :py:func:`bob.measure.openbr.write_matrix` function.
In the simplest case, this function takes a score file
``'five-column-score-file'`` and writes the pair ``'openbr.mtx',
'openbr.mask'`` of OpenBR compatible files:

.. code-block:: py

   >>> bob.measure.openbr.write_matrix('five-column-score-file', 'openbr.mtx', 'openbr.mask', score_file_format='5column')
In this way, the score file will be parsed and the matrices will be written in
the same order that is obtained from the score file.

For most applications, this should be sufficient, but as the identity
information is lost in the matrix files, no deeper analysis is possible when
just using the matrices. To enforce an order of the models and probes inside
the matrices, you can use the ``model_names`` and ``probe_names`` parameters
of :py:func:`bob.measure.openbr.write_matrix`, as sketched after this list:

* The ``probe_names`` parameter lists the ``path`` elements stored in the
  score files, which are the fourth column in a ``5column`` file, and the
  third column in a ``4column`` file, see
  :py:func:`bob.measure.load.five_column` and
  :py:func:`bob.measure.load.four_column`.

* The ``model_names`` parameter is a bit more complicated. In a ``5column``
  format score file, the model names are defined by the second column of that
  file, see :py:func:`bob.measure.load.five_column`. In a ``4column`` format
  score file, the model information is not contained, only the client
  information of the model. Hence, for the ``4column`` format,
  ``model_names`` actually lists the client ids found in the first column,
  see :py:func:`bob.measure.load.four_column`.
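For example, a hedged sketch of enforcing an explicit order (all model and
probe identifiers below are invented; they must match the entries actually
present in the score file):

.. code-block:: py

   import bob.measure.openbr

   bob.measure.openbr.write_matrix(
       'five-column-score-file', 'openbr.mtx', 'openbr.mask',
       model_names=['model_A', 'model_B'],          # order of the models
       probe_names=['s1/probe_01', 's2/probe_02'],  # order of the probes
       score_file_format='5column')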
.. warning::

   The model information is lost, but required to write the matrix files. In
   the ``4column`` format, we use client ids instead of the model
   information. Hence, when several models exist per client, this function
   will not work as expected.

Additionally, there are fields in the matrix files which define the gallery
and probe list files that were used to generate the matrix. These file names
can be selected with the ``gallery_file_name`` and ``probe_file_name`` keyword
parameters of :py:func:`bob.measure.openbr.write_matrix`.

Finally, OpenBR defines a specific ``'search'`` score file format, which is
designed to be used to compute CMC curves. The score matrix contains a
descendingly sorted and possibly truncated list of scores, i.e., for each
probe, a sorted list of all scores for the models is generated. To generate
this special score file format, you can specify the ``search`` parameter. It
specifies the number of highest scores per probe that should be kept. If the
``search`` parameter is set to a negative value, all scores will be kept. If
the ``search`` parameter is higher than the actual number of models, ``NaN``
scores will be appended, and the corresponding mask values will be set to
``0`` (i.e., to be ignored).
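A sketch of producing the ``'search'`` format described above, keeping the
five best scores per probe (file names invented):

.. code-block:: py

   import bob.measure.openbr

   # search=5: keep only the 5 highest scores per probe; a negative value
   # would keep all scores
   bob.measure.openbr.write_matrix(
       'five-column-score-file', 'search.mtx', 'search.mask',
       score_file_format='5column', search=5)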
OpenBR to Bob
=============

On the other hand, you might also want to generate a Bob-compatible (four or
five column) score file based on a pair of OpenBR matrix and mask files. This
is possible by using the :py:func:`bob.measure.openbr.write_score_file`
function. At its most basic, it takes the given pair of matrix and mask files,
as well as the desired output score file:

.. code-block:: py

   >>> bob.measure.openbr.write_score_file('openbr.mtx', 'openbr.mask', 'four-column-score-file')

This score file is sufficient to compute a CMC curve (see `CMC`_), however it
does not contain relevant client ids or paths for models and probes.
Particularly, it assumes that each client has exactly one associated model.
To add or correct this information, you can use additional parameters of
:py:func:`bob.measure.openbr.write_score_file`. Client ids of models and
probes can be added using the ``models_ids`` and ``probes_ids`` keyword
arguments. The length of these lists must be identical to the number of
models and probes as given in the matrix files, **and they must be in the
same order as used to compute the OpenBR matrix**. This implies that the same
same-client and different-client pairs as indicated by the OpenBR mask will
be generated, which is checked inside the function.

To add model and probe path information, use the ``model_names`` and
``probe_names`` parameters, which need to have the same size and order as
``models_ids`` and ``probes_ids``. This information is simply stored in the
score file, and no further check is applied. A combined sketch follows below.
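Putting these parameters together, a hedged sketch (all ids and paths are
invented, and each list length must match the corresponding matrix dimension
and ordering):

.. code-block:: py

   import bob.measure.openbr

   bob.measure.openbr.write_score_file(
       'openbr.mtx', 'openbr.mask', 'five-column-score-file',
       models_ids=['1', '2'],                 # one client id per model
       probes_ids=['1', '1', '2'],            # one client id per probe
       model_names=['s1/model', 's2/model'],  # stored only for '5column'
       probe_names=['s1/p1', 's1/p2', 's2/p1'],
       score_file_format='5column')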
.. note:: The ``model_names`` parameter is used only when writing score files
   in ``score_file_format='5column'``; in the ``'4column'`` format, this
   parameter is ignored.
...