Commit 29448290 authored by Tiago de Freitas Pereira

Allowing eval DIR curve to use the dev thresholds

parent 1ec23556
Pipeline #11840 passed with stages
in 16 minutes and 48 seconds
@@ -193,7 +193,7 @@ def _plot_cmc(cmcs, colors, labels, title, fontsize=10, position=None):
   return figure
-def _plot_dir(cmc_scores, far_values, rank, colors, labels, title, fontsize=10, position=None):
+def _plot_dir(cmc_scores, far_values, rank, colors, labels, title, fontsize=10, position=None, thresholds=None):
   if position is None: position = 'lower right'
   # open new page for current plot
   figure = pyplot.figure()
@@ -206,7 +206,8 @@ def _plot_dir(cmc_scores, far_values, rank, colors, labels, title, fontsize=10,
     raise ValueError("There need to be at least one pair with only negative scores")
   # compute thresholds based on FAR values
-  thresholds = [bob.measure.far_threshold(negatives, [], v, True) for v in far_values]
+  if thresholds is None:
+    thresholds = [bob.measure.far_threshold(negatives, [], v, True) for v in far_values]
   # compute detection and identification rate based on the thresholds for
   # the given rank
@@ -223,7 +224,7 @@ def _plot_dir(cmc_scores, far_values, rank, colors, labels, title, fontsize=10,
   pyplot.legend(loc=position, prop = {'size':fontsize})
   pyplot.title(title)
-  return figure
+  return figure, thresholds
 def _plot_epc(scores_dev, scores_eval, colors, labels, title, fontsize=10, position=None):
@@ -452,9 +453,10 @@ def main(command_line_parameters=None):
       # create a multi-page PDF for the DIR curve
       pdf = PdfPages(args.dir)
       # create a separate figure for dev and eval
-      pdf.savefig(_plot_dir(cmcs_dev, fars, args.rank, colors, args.legends, args.title[0] if args.title is not None else "DIR curve for development set", args.legend_font_size, args.legend_position), bbox_inches='tight')
+      figure, thresholds = _plot_dir(cmcs_dev, fars, args.rank, colors, args.legends, args.title[0] if args.title is not None else "DIR curve for development set", args.legend_font_size, args.legend_position)
+      pdf.savefig(figure, bbox_inches='tight')
       if args.eval_files:
-        pdf.savefig(_plot_dir(cmcs_eval, fars, args.rank, colors, args.legends, args.title[1] if args.title is not None else "DIR curve for evaluation set", args.legend_font_size, args.legend_position), bbox_inches='tight')
+        pdf.savefig(_plot_dir(cmcs_eval, fars, args.rank, colors, args.legends, args.title[1] if args.title is not None else "DIR curve for evaluation set", args.legend_font_size, args.legend_position, thresholds=thresholds)[0], bbox_inches='tight')
       pdf.close()
     except RuntimeError as e:
       raise RuntimeError("During plotting of DIR curves, the following exception occured:\n%s" % e)
  • mentioned in commit 70e39528

  • Nice discussion :-)

    Oh, you are right. Attached is a script that plots exactly this behaviour (different score distributions for the dev and eval sets) and the respective DIR curves (one using the thresholds of the dev set and one using the thresholds of the eval set). Attachments: toy.py, my_nice_report.pdf

    But this issue is tricky. Let's say that you want to choose your operating threshold and, as a good researcher, you use the dev set for that. You also want to do an analysis under different FARs. To evaluate whether a given threshold is good for the eval set, you would need to analyse two metrics: the DIR(\tau) and the False Alarm Rate(\tau).

    Observe that we can't make this decision using evaluate.py.

    Do you think it is a good idea to have some sort of customized DIR plot with two x-axes (https://stackoverflow.com/questions/28112217/matplotlib-second-x-axis-with-transformed-values)? The one at the bottom would show the FAR of the dev set (the one used to estimate the thresholds), and the one at the top would show the FAR of the eval set (computed with the dev thresholds).

    Thanks
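
To make the two metrics concrete, here is a minimal pure-Python sketch of fixing a threshold on the dev set and then reading both DIR(\tau) and the false alarm rate on the eval set. It is not the bob.measure implementation: the helper names (`false_alarm_rate`, `far_threshold`, `dir_at`), the score layout, the tie handling and the toy scores are all assumptions made for illustration.

```python
import math

def false_alarm_rate(unknown_scores, threshold):
    """Fraction of unknown probes whose best score reaches the threshold."""
    return sum(s >= threshold for s in unknown_scores) / len(unknown_scores)

def far_threshold(unknown_scores, target_far):
    """Lowest observed score whose false alarm rate is at most target_far."""
    for candidate in sorted(set(unknown_scores)):
        if false_alarm_rate(unknown_scores, candidate) <= target_far:
            return candidate
    return math.inf  # no observed score is strict enough, so accept nothing

def dir_at(known_probes, threshold, rank=1):
    """Detection and identification rate for (negatives, positive) probes."""
    hits = 0
    for negatives, positive in known_probes:
        # Pessimistic tie handling (an assumption): ties count against the probe.
        probe_rank = 1 + sum(n >= positive for n in negatives)
        if probe_rank <= rank and positive >= threshold:
            hits += 1
    return hits / len(known_probes)

# Toy data (hypothetical): the eval scores are shifted upwards relative to dev.
dev_unknown = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
eval_unknown = [0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.1]
eval_known = [([0.2, 0.4], 1.5), ([0.9, 0.3], 1.2), ([2.5, 0.1], 2.0), ([0.5, 0.6], 0.8)]

tau_dev = far_threshold(dev_unknown, 0.1)    # threshold chosen on dev
tau_eval = far_threshold(eval_unknown, 0.1)  # threshold re-estimated on eval

print("dev threshold:  DIR =", dir_at(eval_known, tau_dev),
      "FAR on eval =", false_alarm_rate(eval_unknown, tau_dev))
print("eval threshold: DIR =", dir_at(eval_known, tau_eval),
      "FAR on eval =", false_alarm_rate(eval_unknown, tau_eval))
```

With shifted score distributions, the dev threshold no longer delivers the requested FAR on the eval set, which is why DIR(\tau) alone is not enough to judge the operating point.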

  • Thanks for showing empirically that my thoughts were correct :-D

    I can see that you want to use the Lausanne protocol (with dev and eval sets) in the right way. I am not sure how to do that. For regular ROC, DET and CMC plots, there is no such thing as a second x-axis either. That's why we didn't plot the ROC curve on the eval set in our ICML paper (see http://publications.idiap.ch/index.php/publications/show/3666), and I don't think plotting ROC/DET/CMC/DIR curves on the eval set makes a whole lot of sense. Anyways, these plots are created by default, i.e., in case someone is of a different opinion than me :-)

    The only plot that makes use of the Lausanne protocol in the right way is the EPC curve, which plots the HTER of the eval set based on several thresholds based on the dev set. If you want, you can invent such a plot for open-set experiments, write a paper about it, publish it, and incorporate the plot into evaluate.py later. For now, I think there is no need/reason/incentive to have such a plot inside this PR.
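
For readers unfamiliar with the EPC idea, the following is a rough pure-Python sketch of it, not the code behind `_plot_epc`: for each weight `alpha`, a threshold is picked on the dev set by minimising a weighted sum of FAR and FRR, and the HTER is then reported on the eval set at that threshold. The function names, the candidate-threshold search and the toy scores are assumptions for illustration only.

```python
def far_frr(negatives, positives, threshold):
    """False acceptance and false rejection rates at a given threshold."""
    far = sum(n >= threshold for n in negatives) / len(negatives)
    frr = sum(p < threshold for p in positives) / len(positives)
    return far, frr

def min_weighted_error_threshold(negatives, positives, alpha):
    """Observed score minimising alpha * FAR + (1 - alpha) * FRR."""
    def cost(threshold):
        far, frr = far_frr(negatives, positives, threshold)
        return alpha * far + (1 - alpha) * frr
    # Only observed scores are tried as candidate thresholds (a simplification).
    return min(sorted(set(negatives) | set(positives)), key=cost)

def epc(dev_neg, dev_pos, eval_neg, eval_pos, alphas):
    """List of (alpha, HTER on eval) with thresholds chosen on dev."""
    points = []
    for alpha in alphas:
        threshold = min_weighted_error_threshold(dev_neg, dev_pos, alpha)
        far, frr = far_frr(eval_neg, eval_pos, threshold)
        points.append((alpha, (far + frr) / 2))
    return points

# Toy scores (hypothetical): dev is perfectly separable, eval is not.
dev_neg, dev_pos = [0.1, 0.2, 0.3, 0.4], [0.6, 0.7, 0.8, 0.9]
eval_neg, eval_pos = [0.2, 0.3, 0.5, 0.7], [0.5, 0.8, 0.9, 1.0]

for alpha, hter in epc(dev_neg, dev_pos, eval_neg, eval_pos, [0.25, 0.5, 0.75]):
    print("alpha = %.2f  HTER(eval) = %.3f" % (alpha, hter))
```

The x-axis of such a curve is the weight `alpha` rather than an error rate measured on the eval set, so no eval-set information leaks into the choice of threshold.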

  • Hahaha, yes, there is no reason for that so far.

    I will merge it. Thanks for the discussion.
