Commit 5aff2862 authored by André Anjos

[doc] Reset curves after changes to plotting strategy

parent 8a419db4
Pipeline #39738 passed
25 additions and 2 deletions
@@ -104,3 +104,7 @@
.. [SANDLER-2018] *M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen*,
   **MobileNetV2: Inverted Residuals and Linear Bottlenecks**, 2018.
   https://arxiv.org/abs/1801.04381
+
+.. [DAVIS-2006] *J. Davis and M. Goadrich*, **The relationship between
+   Precision-Recall and ROC curves**. 23rd International Conference on Machine
+   Learning (ICML’06), 2006. https://doi.org/10.1145/1143844.1143874
@@ -82,7 +82,25 @@ Next, you will find the PR plots showing confidence intervals, for the various
models explored, on a per dataset arrangement. All curves correspond to test
set performances. Single performance figures (F1-micro scores) correspond to
their average value across all test set images, for a fixed threshold set to
-``0.5``.
+``0.5``, and using 1000 points for curve calculation.
+
+.. tip:: **Curve Interpretation**
+
+   PR curves behave differently from traditional ROC curves (using Specificity
+   versus Sensitivity) with respect to their overall shape. See [DAVIS-2006]_
+   for details on the relationship between PR and ROC curves. For example, PR
+   curves are not guaranteed to be monotonically increasing or decreasing with
+   the scanned thresholds (e.g. see M2U-Net on the STARE dataset).
+
+   Each evaluated threshold, for each combination of trained model and
+   dataset, is represented by a point on the corresponding curve. Points are
+   linearly interpolated to create a line. For each evaluated threshold and
+   every trained model and dataset, we assume that the standard deviations of
+   the precision and recall estimates are good proxies for the uncertainty
+   around that point. We therefore plot a transparent ellipse centered on each
+   evaluated point, in which the width corresponds to twice the recall
+   standard deviation and the height to twice the precision standard
+   deviation.
+
.. list-table::
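
The ellipse construction described in the tip can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the plots in this commit: the function names (``pr_point_with_spread``, ``ellipse_spec``, ``plot_pr_with_ellipses``), the use of the population standard deviation across images, and the choice of matplotlib for drawing are all hypothetical.

```python
import statistics


def pr_point_with_spread(per_image_precision, per_image_recall):
    """Summarise ONE threshold from per-image precision and recall values.

    Returns (mean recall, mean precision, recall std, precision std).
    Assumption: the spread is the population standard deviation over images.
    """
    return (
        statistics.mean(per_image_recall),
        statistics.mean(per_image_precision),
        statistics.pstdev(per_image_recall),
        statistics.pstdev(per_image_precision),
    )


def ellipse_spec(point):
    """Ellipse centred on the point, with width equal to twice the recall
    standard deviation and height equal to twice the precision one."""
    recall, precision, recall_std, precision_std = point
    return {
        "xy": (recall, precision),
        "width": 2 * recall_std,
        "height": 2 * precision_std,
    }


def plot_pr_with_ellipses(points, label="model", alpha=0.3):
    """Draw the linearly interpolated curve plus one transparent ellipse
    per evaluated threshold. `points` is a list of tuples as returned by
    `pr_point_with_spread`, one per threshold."""
    # Imported lazily so the helpers above work without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib.patches import Ellipse

    fig, ax = plt.subplots()
    # ax.plot joins consecutive points with straight segments.
    ax.plot([p[0] for p in points], [p[1] for p in points], label=label)
    for p in points:
        ax.add_patch(Ellipse(**ellipse_spec(p), alpha=alpha))
    ax.set_xlabel("Recall")
    ax.set_ylabel("Precision")
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.legend()
    return fig
```

With one ``pr_point_with_spread`` result per scanned threshold, ``plot_pr_with_ellipses(points).savefig("pr.png")`` would write a figure of this kind; the output file name is only an example.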
@@ -133,4 +151,5 @@ Remarks
models show consistently less variability than the second annotator.
Unfortunately, this cannot be conclusive.
.. include:: ../../links.rst
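
As a small illustration of the remark in the curve interpretation tip that PR curves need not be monotonic with the scanned thresholds, the sketch below computes precision and recall on a toy set of scores. The data, the function names and the convention of reporting ``1.0`` when a denominator is zero are assumptions made for the example only.

```python
def pr_curve(scores, labels, thresholds):
    """Return a list of (threshold, precision, recall) tuples.

    A sample is predicted positive when its score is >= threshold.
    Assumption: precision (or recall) is reported as 1.0 when undefined.
    """
    curve = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 1.0
        curve.append((t, precision, recall))
    return curve


def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Toy data: the highest-scoring sample is a false positive.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [0, 1, 1, 0, 1]
curve = pr_curve(scores, labels, [0.1, 0.3, 0.5, 0.75, 0.85])
precisions = [p for _, p, _ in curve]
recalls = [r for _, _, r in curve]
```

On this toy input recall only decreases as the threshold rises, while precision goes down, up, and down again, which is the kind of shape the tip warns about.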
Binary files changed (no preview; sizes shown before → after):
doc/results/xtest/driu-chasedb1.png: 160 KiB → 156 KiB
doc/results/xtest/driu-drive.png: 179 KiB → 176 KiB
doc/results/xtest/driu-hrf.png: 177 KiB → 174 KiB
doc/results/xtest/driu-iostar-vessel.png: 156 KiB → 156 KiB
doc/results/xtest/driu-stare.png: 174 KiB → 164 KiB
@@ -94,7 +94,7 @@ cross-tests explored, on a per cross-tested model arrangement. All curves
correspond to test set performances. Single performance figures (F1-micro
scores) correspond to their average value across all test set images, for a fixed
threshold set *a priori* on the training set of the dataset used for creating the
-model.
+model, and using 100 points for curve calculation.
.. list-table::
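
One possible reading of "a fixed threshold set *a priori* on the training set" is sketched below: scan an evenly spaced grid of thresholds on the training images, keep the one with the highest mean F1, and reuse it unchanged on the test images. The selection criterion, the function names and the zero-denominator convention are assumptions; the source does not state how the threshold is chosen.

```python
def threshold_grid(points):
    """Evenly spaced thresholds in [0, 1], both endpoints included."""
    return [i / (points - 1) for i in range(points)]


def f1_score(probabilities, labels, threshold):
    """Pixel-wise F1 for one image at one threshold.

    Assumption: F1 is reported as 1.0 when there are no positives at all.
    """
    tp = fp = fn = 0
    for p, y in zip(probabilities, labels):
        predicted = p >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif not predicted and y:
            fn += 1
    return 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0


def mean_f1(images, threshold):
    """Average F1 over a list of (probabilities, labels) pairs."""
    scores = [f1_score(p, y, threshold) for p, y in images]
    return sum(scores) / len(scores)


def pick_threshold(train_images, points=100):
    """Hypothetical criterion: the grid threshold with the best mean F1 on
    the training images (the first one wins in case of ties)."""
    return max(threshold_grid(points), key=lambda t: mean_f1(train_images, t))


# Toy data: each image is (pixel probabilities, binary ground truth).
train = [([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])]
test = [([0.7, 0.2], [1, 0])]
threshold = pick_threshold(train, points=100)
test_f1 = mean_f1(test, threshold)
```

The ``points`` argument plays the role of the "100 points for curve calculation" mentioned in the text; a finer grid only changes how closely the selected threshold can approach the best one.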
Binary files changed (no preview; sizes shown before → after):
doc/results/xtest/m2unet-chasedb1.png: 165 KiB → 165 KiB
doc/results/xtest/m2unet-drive.png: 170 KiB → 172 KiB
doc/results/xtest/m2unet-hrf.png: 186 KiB → 181 KiB