diff --git a/doc/references.rst b/doc/references.rst index 6b942813843865e8d8e9cc87d15952c3a3ad3af4..c6c0447fb1e111806ba1dab5e1338041c7530f66 100644 --- a/doc/references.rst +++ b/doc/references.rst @@ -104,3 +104,7 @@ .. [SANDLER-2018] *M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C.h Chen*, **MobileNetV2: Inverted Residuals and Linear Bottlenecks**, 2018. https://arxiv.org/abs/1801.04381 + +.. [DAVIS-2006] *J. Davis and M. Goadrich*, **The relationship between + Precision-Recall and ROC curves**. 23rd International Conference on Machine + Learning (ICML’06), 2006. https://doi.org/10.1145/1143844.1143874 diff --git a/doc/results/baselines/index.rst b/doc/results/baselines/index.rst index 65652a48c235cb7f801047a4debc241cc6f07526..76a9d3b2e303b0a27ad8f03af6a2c40c29960388 100644 --- a/doc/results/baselines/index.rst +++ b/doc/results/baselines/index.rst @@ -82,7 +82,25 @@ Next, you will find the PR plots showing confidence intervals, for the various models explored, on a per dataset arrangement. All curves correspond to test set performances. Single performance figures (F1-micro scores) correspond to its average value across all test set images, for a fixed threshold set to -``0.5``. +``0.5``, and using 1000 points for curve calculation. + +.. tip:: **Curve Interpretation** + + PR curves behave differently than traditional ROC curves (using Specificity + versus Sensitivity) with respect to the overall shape. You may have a look + at [DAVIS-2006]_ for details on the relationship between PR and ROC curves. + For example, PR curves are not guaranteed to be monotonically increasing or + decreasing with the scanned thresholds (e.g. see M2U-Net on STARE dataset). + + Each evaluated threshold in a combination of trained models and datasets is + represented by a point in each curve. Points are linearly interpolated to + create a line.
For each evaluated threshold and every trained model and + dataset, we assume that the standard deviations of the precision and recall + estimates are good proxies for the uncertainty around that point. We + therefore plot a transparent ellipse centered on each evaluated point, whose + width corresponds to twice the recall standard deviation and whose + height corresponds to twice the precision standard deviation. + .. list-table:: @@ -133,4 +151,5 @@ Remarks models show consistently less variability than the second annotator. Unfortunately, this cannot be conclusive. + .. include:: ../../links.rst diff --git a/doc/results/xtest/driu-chasedb1.pdf b/doc/results/xtest/driu-chasedb1.pdf index bb28aafb8479d2f07336a7dfdcdd16fa148f5117..fde498eaa0c30e0b348324953adbbeb621ead281 100644 Binary files a/doc/results/xtest/driu-chasedb1.pdf and b/doc/results/xtest/driu-chasedb1.pdf differ diff --git a/doc/results/xtest/driu-chasedb1.png b/doc/results/xtest/driu-chasedb1.png index be26e9f8e8b140aaca692c417892abb515a180f4..c921aa43cbe27ff4c576210aab1d27b5857d481b 100644 Binary files a/doc/results/xtest/driu-chasedb1.png and b/doc/results/xtest/driu-chasedb1.png differ diff --git a/doc/results/xtest/driu-drive.pdf b/doc/results/xtest/driu-drive.pdf index 1f9fc10b1ffa612f6f889666598620f1007ef03f..8b50e32801e9d7d3201daed77f907a7aab8416fc 100644 Binary files a/doc/results/xtest/driu-drive.pdf and b/doc/results/xtest/driu-drive.pdf differ diff --git a/doc/results/xtest/driu-drive.png b/doc/results/xtest/driu-drive.png index fba68683bf3a43a19a0a99c2f2f63d3f5d219473..091466ee12a0e4b1639e599565b11662807a4442 100644 Binary files a/doc/results/xtest/driu-drive.png and b/doc/results/xtest/driu-drive.png differ diff --git a/doc/results/xtest/driu-hrf.pdf b/doc/results/xtest/driu-hrf.pdf index 01d78c98e041d4db603bce19ad549811b3cc3a81..efc48a36390dc65cb31b5499dc1b0b82db451065 100644 Binary files a/doc/results/xtest/driu-hrf.pdf and b/doc/results/xtest/driu-hrf.pdf differ diff --git
a/doc/results/xtest/driu-hrf.png b/doc/results/xtest/driu-hrf.png index 0cbd94c9f9c9ffa71f5f513cd8f997070bc979ee..eebe231b707c126e5e05eba396c1f64b780a6edd 100644 Binary files a/doc/results/xtest/driu-hrf.png and b/doc/results/xtest/driu-hrf.png differ diff --git a/doc/results/xtest/driu-iostar-vessel.pdf b/doc/results/xtest/driu-iostar-vessel.pdf index db822d0f5c53d1579b8c71f3e339eefe2519e582..c4a6fab362dc5dae7cc6462f5c3028241764d7bd 100644 Binary files a/doc/results/xtest/driu-iostar-vessel.pdf and b/doc/results/xtest/driu-iostar-vessel.pdf differ diff --git a/doc/results/xtest/driu-iostar-vessel.png b/doc/results/xtest/driu-iostar-vessel.png index 5842c71646109aecb0456cb20f99783b3b3bda0d..adbfb7352febb35115ee6c558c1a2422015eb721 100644 Binary files a/doc/results/xtest/driu-iostar-vessel.png and b/doc/results/xtest/driu-iostar-vessel.png differ diff --git a/doc/results/xtest/driu-stare.pdf b/doc/results/xtest/driu-stare.pdf index f44d12dafca83b3e2d3af52a48244ad1bd43365c..16bdde942b5f3ad1607e5d4fa7b3ff4c77977e8b 100644 Binary files a/doc/results/xtest/driu-stare.pdf and b/doc/results/xtest/driu-stare.pdf differ diff --git a/doc/results/xtest/driu-stare.png b/doc/results/xtest/driu-stare.png index 6573b820be3e2402e732c45d1c106f0abb103aef..de273d697636439d80d00aba7d407f58f44f9e08 100644 Binary files a/doc/results/xtest/driu-stare.png and b/doc/results/xtest/driu-stare.png differ diff --git a/doc/results/xtest/index.rst b/doc/results/xtest/index.rst index e65c9e9997e8dca62e0a6fcfd0bea45bae11906d..73ebafffe2e3ccead8f89569e918d4ae5ffb127b 100644 --- a/doc/results/xtest/index.rst +++ b/doc/results/xtest/index.rst @@ -94,7 +94,7 @@ cross-tests explored, on a per cross-tested model arrangement. All curves correspond to test set performances. Single performance figures (F1-micro scores) correspond to its average value across all test set images, for a fixed threshold set *a priori* on the training set of dataset used for creating the -model. 
+model, and using 100 points for curve calculation. .. list-table:: diff --git a/doc/results/xtest/m2unet-chasedb1.pdf b/doc/results/xtest/m2unet-chasedb1.pdf index 22368ff4c89b9968a63a2c937ba1945f1dc881ec..efc263a243293184af63d492afa23931b40102b9 100644 Binary files a/doc/results/xtest/m2unet-chasedb1.pdf and b/doc/results/xtest/m2unet-chasedb1.pdf differ diff --git a/doc/results/xtest/m2unet-chasedb1.png b/doc/results/xtest/m2unet-chasedb1.png index f7fbaffad64fd42012f2394e412b7f4183ba2f05..98edc0b110663ad6d13c138763ddc9656ec737be 100644 Binary files a/doc/results/xtest/m2unet-chasedb1.png and b/doc/results/xtest/m2unet-chasedb1.png differ diff --git a/doc/results/xtest/m2unet-drive.pdf b/doc/results/xtest/m2unet-drive.pdf index e8090cecb4c178646e0a0bb51da5b3a89f0b1548..c672c163666917083984d1a14836c22184faa3ff 100644 Binary files a/doc/results/xtest/m2unet-drive.pdf and b/doc/results/xtest/m2unet-drive.pdf differ diff --git a/doc/results/xtest/m2unet-drive.png b/doc/results/xtest/m2unet-drive.png index 0b628ddfab5d6dc678d5f665f4e4eb3c7edec1fd..68d882506091f5e4cdf50e8f11c6f1720bab8000 100644 Binary files a/doc/results/xtest/m2unet-drive.png and b/doc/results/xtest/m2unet-drive.png differ diff --git a/doc/results/xtest/m2unet-hrf.pdf b/doc/results/xtest/m2unet-hrf.pdf index 73d400cf279864de7fb3d372824606c2cdba79aa..226f3af219390f6ce0863df66c1575f79b4cb998 100644 Binary files a/doc/results/xtest/m2unet-hrf.pdf and b/doc/results/xtest/m2unet-hrf.pdf differ diff --git a/doc/results/xtest/m2unet-hrf.png b/doc/results/xtest/m2unet-hrf.png index ab4bcb45f2fa74bd6fa3b6a575723d71160a5c32..19959b3c5aae6e639ae8199343549c4d2f1dc114 100644 Binary files a/doc/results/xtest/m2unet-hrf.png and b/doc/results/xtest/m2unet-hrf.png differ diff --git a/doc/results/xtest/m2unet-iostar-vessel.pdf b/doc/results/xtest/m2unet-iostar-vessel.pdf index 6a59bc6ea23d7d17d54d08db8c8735b54445aae0..4592dc35ba1f668d8b3f5fdb9451b22b2f0dba9b 100644 Binary files 
a/doc/results/xtest/m2unet-iostar-vessel.pdf and b/doc/results/xtest/m2unet-iostar-vessel.pdf differ diff --git a/doc/results/xtest/m2unet-iostar-vessel.png b/doc/results/xtest/m2unet-iostar-vessel.png index df9cc400f92f759f752161788e28a31826c17c94..58aa8a5457a482908789b1b04ae82ff249a503e1 100644 Binary files a/doc/results/xtest/m2unet-iostar-vessel.png and b/doc/results/xtest/m2unet-iostar-vessel.png differ diff --git a/doc/results/xtest/m2unet-stare.pdf b/doc/results/xtest/m2unet-stare.pdf index 127f8d2abbc8aafd33fcbdcd4cc9613bea15b3b0..13f57fc75575561cebf62ace972749f268530307 100644 Binary files a/doc/results/xtest/m2unet-stare.pdf and b/doc/results/xtest/m2unet-stare.pdf differ diff --git a/doc/results/xtest/m2unet-stare.png b/doc/results/xtest/m2unet-stare.png index e80cd25d1bce4604f62358d7d01bdfb1d4f67c6d..f4ec5207403426c73d003bea7e27b341a2787f3f 100644 Binary files a/doc/results/xtest/m2unet-stare.png and b/doc/results/xtest/m2unet-stare.png differ
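
The tip added to ``doc/results/baselines/index.rst`` describes how each uncertainty ellipse on the PR plots is sized. A minimal sketch of that rule follows, under stated assumptions: the function name ``ellipse_for_threshold`` and the sample values are hypothetical, the ellipse is assumed to be centered on the mean recall and mean precision across test images, and the sample standard deviation is used because the patch does not say which estimator the project applies. This is an illustration, not the project's plotting code.

```python
from statistics import mean, stdev


def ellipse_for_threshold(precisions, recalls):
    """Return (center, width, height) of the ellipse for one threshold.

    precisions, recalls: per-image values measured at a single threshold
    (hypothetical inputs; the project may organize its data differently).
    """
    # Assumption: the point sits at the per-image averages,
    # with recall on the x axis and precision on the y axis.
    center = (mean(recalls), mean(precisions))
    # Width is twice the recall standard deviation,
    # height is twice the precision standard deviation.
    width = 2 * stdev(recalls)
    height = 2 * stdev(precisions)
    return center, width, height


# Made-up per-image measurements for one threshold.
precisions = [0.80, 0.82, 0.78, 0.84]
recalls = [0.70, 0.75, 0.65, 0.70]
center, width, height = ellipse_for_threshold(precisions, recalls)
```

The returned triple matches the ``xy``, ``width`` and ``height`` arguments of ``matplotlib.patches.Ellipse``, so a plotting routine could draw one semi-transparent patch per evaluated threshold.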