Commit 3420c751 authored by André Anjos

[doc] Update baseline results

parent a2c1b973
Inference
---------
You may use one of your trained models (or :ref:`one of ours
<bob.ip.binseg.results.baselines>` to run inference on existing datasets or
your own dataset. In inference (or prediction) mode, we input data, the
trained model, and output HDF5 files containing the prediction outputs for
every input image. Each HDF5 file contains a single object with a
2-dimensional matrix of floating point numbers indicating the vessel
probability (``[0.0,1.0]``) for each pixel in the input image.
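The HDF5 files can be read with e.g. ``h5py``; the sketch below skips the file
I/O and shows only the thresholding step on a synthetic matrix standing in for
a loaded prediction (all names here are illustrative, not part of this
package):

```python
import numpy as np

# Synthetic stand-in for one prediction matrix loaded from an HDF5
# output file (e.g. via h5py): 2-D vessel probabilities in [0.0, 1.0].
rng = np.random.default_rng(0)
probabilities = rng.random((8, 8))

# Binarize at a chosen threshold (0.5 here) to obtain a vessel map.
vessel_map = probabilities >= 0.5

# Fraction of pixels classified as vessel at this threshold.
vessel_fraction = float(vessel_map.mean())
```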
Inference on an existing dataset
--------------------------------
Replace ``<model>`` and ``<dataset>`` with the appropriate :ref:`configuration
files <bob.ip.binseg.configs>`. Replace ``<path/to/model.pth>`` with a path
leading to the pre-trained model, or a URL pointing to a pre-trained model
(e.g. :ref:`one of ours <bob.ip.binseg.results.baselines>`).
Inference on a custom dataset
-----------------------------
.. Pretrained models
.. _baselines_driu_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/driu-drive-1947d9fa.pth
.. _baselines_hed_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/hed-drive-c8b86082.pth
.. _baselines_m2unet_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/m2unet-drive-ce4c7a53.pth
.. _baselines_unet_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/unet-drive-0ac99e2e.pth
.. _baselines_driu_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/driu-stare-79dec93a.pth
.. _baselines_hed_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/hed-stare-fcdb7671.pth
.. _baselines_m2unet_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/m2unet-stare-952778c2.pth
.. _baselines_unet_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/unet-stare-49b6a6d0.pth
.. _baselines_driu_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/driu-chasedb1-e7cf53c3.pth
.. _baselines_hed_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/hed-chasedb1-55ec6d34.pth
.. _baselines_m2unet_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/m2unet-chasedb1-0becbf29.pth
.. _baselines_unet_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/unet-chasedb1-be41b5a5.pth
.. _baselines_driu_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/driu-hrf-c9e6a889.pth
.. _baselines_hed_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/hed-hrf-3f4ab1c4.pth
.. _baselines_m2unet_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/m2unet-hrf-2c3f2485.pth
.. _baselines_unet_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/unet-hrf-9a559821.pth
.. _baselines_driu_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/driu-iostar-vessel-ef8cc27b.pth
.. _baselines_hed_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/hed-iostar-vessel-37cfaee1.pth
.. _baselines_m2unet_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/m2unet-iostar-vessel-223b61ef.pth
.. _baselines_unet_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/baselines/unet-iostar-vessel-86c78e87.pth
.. _covd_driu_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/driu/drive/model.pth
.. _covd_hed_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/hed/drive/model.pth
.. _covd_m2unet_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/m2unet/drive/model.pth
.. _covd_unet_drive: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/unet/drive/model.pth
.. _covd_driu_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/driu/stare/model.pth
.. _covd_hed_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/hed/stare/model.pth
.. _covd_m2unet_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/m2unet/stare/model.pth
.. _covd_unet_stare: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/unet/stare/model.pth
.. _covd_driu_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/driu/chasedb1/model.pth
.. _covd_hed_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/hed/chasedb1/model.pth
.. _covd_m2unet_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/m2unet/chasedb1/model.pth
.. _covd_unet_chase: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/unet/chasedb1/model.pth
.. _covd_driu_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/driu/hrf/model.pth
.. _covd_hed_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/hed/hrf/model.pth
.. _covd_m2unet_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/m2unet/hrf/model.pth
.. _covd_unet_hrf: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/unet/hrf/model.pth
.. _covd_driu_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/driu/iostar-vessel/model.pth
.. _covd_hed_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/hed/iostar-vessel/model.pth
.. _covd_m2unet_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/m2unet/iostar-vessel/model.pth
.. _covd_unet_iostar: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/covd/unet/iostar-vessel/model.pth
.. DRIVE
.. _driu_drive.pth: https://www.idiap.ch/software/bob/data/bob/bob.ip.binseg/master/DRIU_DRIVE.pth
.. -*- coding: utf-8 -*-
.. _bob.ip.binseg.models:
===================
Pretrained Models
===================
We offer the following pre-trained models, allowing inference and
reproduction of our published scores. Due to storage limitations, we only
provide weights for a subset of all evaluated models.
.. list-table::
* - **Datasets / Models**
- :py:mod:`driu <bob.ip.binseg.configs.models.driu>`
- :py:mod:`m2unet <bob.ip.binseg.configs.models.m2unet>`
* - :py:mod:`drive <bob.ip.binseg.configs.datasets.drive.default>`
- driu_drive.pth_
- m2unet_drive.pth_
* - :py:mod:`drive-covd <bob.ip.binseg.configs.datasets.drive.covd>`
-
- m2unet_covd-drive.pth_
* - :py:mod:`drive-ssl <bob.ip.binseg.configs.datasets.drive.ssl>`
-
- m2unet_covd-drive_ssl.pth_
* - :py:mod:`stare <bob.ip.binseg.configs.datasets.stare.ah>`
- driu_stare.pth_
- m2unet_stare.pth_
* - :py:mod:`stare-covd <bob.ip.binseg.configs.datasets.stare.covd>`
-
- m2unet_covd-stare.pth_
* - :py:mod:`stare-ssl <bob.ip.binseg.configs.datasets.stare.ssl>`
-
- m2unet_covd-stare_ssl.pth_
* - :py:mod:`chasedb1 <bob.ip.binseg.configs.datasets.chasedb1.first_annotator>`
- driu_chasedb1.pth_
- m2unet_chasedb1.pth_
* - :py:mod:`chasedb1-covd <bob.ip.binseg.configs.datasets.chasedb1.covd>`
-
- m2unet_covd-chasedb1.pth_
* - :py:mod:`chasedb1-ssl <bob.ip.binseg.configs.datasets.chasedb1.ssl>`
-
- m2unet_covd-chasedb1_ssl.pth_
* - :py:mod:`iostar-vessel <bob.ip.binseg.configs.datasets.iostar.vessel>`
- driu_iostar.pth_
- m2unet_iostar.pth_
* - :py:mod:`iostar-vessel-covd <bob.ip.binseg.configs.datasets.iostar.covd>`
-
- m2unet_covd-iostar.pth_
* - :py:mod:`iostar-vessel-ssl <bob.ip.binseg.configs.datasets.iostar.ssl>`
-
- m2unet_covd-iostar_ssl.pth_
* - :py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.default>`
- driu_hrf.pth_
- m2unet_hrf.pth_
* - :py:mod:`hrf-covd <bob.ip.binseg.configs.datasets.hrf.covd>`
-
- m2unet_covd-hrf.pth_
* - :py:mod:`hrf-ssl <bob.ip.binseg.configs.datasets.hrf.ssl>`
-
- m2unet_covd-hrf_ssl.pth_
.. include:: links.rst
.. updated binary figures: doc/results/baselines/chasedb1.png, drive.png and
   hrf.png; added: doc/results/baselines/hrf-fullres.png
F1 Scores (micro-level)
-----------------------
* Benchmark results for models: DRIU, HED, M2U-Net and U-Net.
* Models are trained and tested on the same dataset (**numbers in bold**
indicate approximate number of parameters per model). Models are trained for
a fixed number of 1000 epochs, with a learning rate of 0.001 until epoch 900
and then 0.0001 until the end of the training.
* During the training session, an unaugmented copy of the training set is used
as validation set. We keep checkpoints for the best performing networks
based on such validation set. The best performing network during training is
used for evaluation.
* Database and model resource configuration links (table top row and left
column) point to the originating configuration files used to obtain these
results.
* You can cross check the analysis numbers provided in this table by
downloading this software package, the raw data, and running ``bob binseg
analyze`` providing the model URL as ``--weight`` parameter.
* For comparison purposes, we provide "second-annotator" performances on the
same test set, where available.
* :ref:`Our baseline script <bob.ip.binseg.baseline-script>` was used to
generate the results displayed here.
* HRF models were trained using half the full resolution (1168x1648).
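The tabulated F1 scores are computed at the pixel (micro) level, per test
image, then summarized as a mean and an across-image standard deviation (the
parenthesized figure). A minimal sketch on synthetic data (``f1_micro`` is an
illustrative helper, not this package's implementation):

```python
import numpy as np

def f1_micro(pred, truth):
    # Pixel-level (micro) F1 between a binary prediction and ground truth.
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return 2 * tp / (2 * tp + fp + fn)

# Synthetic test set: one F1 per image, then mean and standard
# deviation across images.
rng = np.random.default_rng(1)
scores = []
for _ in range(4):
    truth = rng.random((16, 16)) < 0.15            # sparse "vessel" pixels
    pred = truth ^ (rng.random((16, 16)) < 0.05)   # prediction with errors
    scores.append(f1_micro(pred, truth))

mean_f1, std_f1 = float(np.mean(scores)), float(np.std(scores))
```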
.. list-table::
:header-rows: 2
* -
-
- :py:mod:`driu <bob.ip.binseg.configs.models.driu>`
- :py:mod:`hed <bob.ip.binseg.configs.models.hed>`
- :py:mod:`m2unet <bob.ip.binseg.configs.models.m2unet>`
- :py:mod:`unet <bob.ip.binseg.configs.models.unet>`
* - Dataset
- 2nd. Annot.
- 15M
- 14.7M
- 0.55M
- 25.8M
* - :py:mod:`drive <bob.ip.binseg.configs.datasets.drive.default>`
- 0.788 (0.021)
- `0.821 (0.014) <baselines_driu_drive_>`_
- `0.813 (0.016) <baselines_hed_drive_>`_
- `0.802 (0.014) <baselines_m2unet_drive_>`_
- `0.825 (0.015) <baselines_unet_drive_>`_
* - :py:mod:`stare <bob.ip.binseg.configs.datasets.stare.ah>`
- 0.759 (0.028)
- `0.828 (0.039) <baselines_driu_stare_>`_
- `0.815 (0.047) <baselines_hed_stare_>`_
- `0.818 (0.035) <baselines_m2unet_stare_>`_
- `0.828 (0.050) <baselines_unet_stare_>`_
* - :py:mod:`chasedb1 <bob.ip.binseg.configs.datasets.chasedb1.first_annotator>`
- 0.768 (0.023)
- `0.812 (0.018) <baselines_driu_chase_>`_
- `0.806 (0.020) <baselines_hed_chase_>`_
- `0.798 (0.018) <baselines_m2unet_chase_>`_
- `0.807 (0.017) <baselines_unet_chase_>`_
* - :py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.default>` (1168x1648)
-
- `0.808 (0.038) <baselines_driu_hrf_>`_
- `0.803 (0.040) <baselines_hed_hrf_>`_
- `0.796 (0.048) <baselines_m2unet_hrf_>`_
- `0.811 (0.039) <baselines_unet_hrf_>`_
* - :py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.default>` (2336x3296)
-
- `0.722 (0.073) <baselines_driu_hrf_>`_
- `0.703 (0.090) <baselines_hed_hrf_>`_
- `0.713 (0.143) <baselines_m2unet_hrf_>`_
- `0.756 (0.051) <baselines_unet_hrf_>`_
* - :py:mod:`iostar-vessel <bob.ip.binseg.configs.datasets.iostar.vessel>`
-
- `0.825 (0.020) <baselines_driu_iostar_>`_
- `0.827 (0.020) <baselines_hed_iostar_>`_
- `0.820 (0.018) <baselines_m2unet_iostar_>`_
- `0.818 (0.020) <baselines_unet_iostar_>`_
Precision-Recall (PR) Curves
----------------------------
versus Sensitivity) with respect to the overall shape. You may have a look
at [DAVIS-2006]_ for details on the relationship between PR and ROC curves.
For example, PR curves are not guaranteed to be monotonically increasing or
decreasing with the scanned thresholds.
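The threshold-scanning step behind each curve can be sketched as follows
(synthetic data, illustrative names only; not this package's implementation).
Note how recall is necessarily non-increasing with the threshold, while
precision is not:

```python
import numpy as np

def pr_points(probabilities, truth, thresholds):
    # One (precision, recall) point per scanned threshold.
    points = []
    for t in thresholds:
        pred = probabilities >= t
        tp = np.logical_and(pred, truth).sum()
        fp = np.logical_and(pred, ~truth).sum()
        fn = np.logical_and(~pred, truth).sum()
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((float(precision), float(recall)))
    return points

rng = np.random.default_rng(2)
truth = rng.random((32, 32)) < 0.2
# Noisy probabilities correlated with the ground truth.
probabilities = np.clip(truth * 0.6 + rng.random((32, 32)) * 0.4, 0.0, 1.0)
curve = pr_points(probabilities, truth, np.linspace(0.0, 1.0, 11))
```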
Each evaluated threshold in a combination of trained models and datasets is
represented by a point in each curve. Points are linearly interpolated to
- .. figure:: hrf.png
:align: center
:scale: 50%
:alt: Model comparisons for hrf datasets (matching training resolution: 1168x1648)
:py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.default>` (1168x1648): PR curve and F1 scores at T=0.5 (:download:`pdf <hrf.pdf>`)
* - .. figure:: iostar-vessel.png
:align: center
:scale: 50%
:alt: Model comparisons for iostar-vessel datasets
:py:mod:`iostar-vessel <bob.ip.binseg.configs.datasets.iostar.vessel>`: PR curve and F1 scores at T=0.5 (:download:`pdf <iostar-vessel.pdf>`)
-
- .. figure:: hrf-fullres.png
:align: center
:scale: 50%
:alt: Model comparisons for hrf datasets (double training resolution: 2336x3296)
:py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.default>` (2336x3296): PR curve and F1 scores at T=0.5 (:download:`pdf <hrf-fullres.pdf>`)
Remarks
-------
* Where second annotator labels exist, model performance and variability seems
on par with such annotations. One possible exception is for CHASE-DB1, where
models show consistently less variability than the second annotator.
Unfortunately, this is not conclusive.
* Training at half resolution for HRF shows a small loss in performance (10 to
15%) when the high-resolution version is used as the evaluation set.
.. include:: ../../links.rst
.. updated binary figures: doc/results/baselines/iostar-vessel.png and
   stare.png
.. _bob.ip.binseg.results.covd:
========================================
Combined Vessel Dataset (COVD) Results
========================================
F1 Scores (micro-level)
-----------------------
* Benchmark results for models: DRIU, HED, M2U-Net and U-Net.
* Models are trained on a COVD **excluding** the target dataset, and tested on
the target dataset (**numbers in bold** indicate approximate number of parameters per
model). Models are trained for a fixed number of 1000 epochs, with a
learning rate of 0.001 until epoch 900 and then 0.0001 until the end of the
training.
* Database and model resource configuration links (table top row and left
column) are linked to the originating configuration files used to obtain
these results.
* Check `our paper`_ for details on the calculation of the F1 Score and standard
deviations (in parentheses).
* Single performance numbers correspond to *a priori* performance indicators,
where the threshold is previously selected on the training set (COVD
excluding the target dataset).
* You can cross check the analysis numbers provided in this table by
downloading this software package, the raw data, and running ``bob binseg
analyze`` providing the model URL as ``--weight`` parameter.
* For comparison purposes, we provide "second-annotator" performances on the
same test set, where available.
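The *a priori* threshold selection described above can be sketched as follows
(synthetic stand-ins for the COVD training split and the target dataset; an
illustrative sketch, not this package's implementation):

```python
import numpy as np

def f1(pred, truth):
    # Pixel-level F1 over all images in a split.
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return 2 * tp / (2 * tp + fp + fn)

rng = np.random.default_rng(3)

def fake_split(n_images):
    # Synthetic (probabilities, ground-truth) pair standing in for a split.
    truth = rng.random((n_images, 16, 16)) < 0.2
    probs = np.clip(truth * 0.5 + rng.random((n_images, 16, 16)) * 0.5,
                    0.0, 1.0)
    return probs, truth

train_probs, train_truth = fake_split(4)  # stands in for the training set
test_probs, test_truth = fake_split(2)    # stands in for the target dataset

# A priori: select the threshold that maximizes F1 on the training split...
thresholds = np.linspace(0.1, 0.9, 17)
best_t = max(thresholds, key=lambda t: f1(train_probs >= t, train_truth))

# ...then report test performance at that frozen threshold.
test_f1 = f1(test_probs >= best_t, test_truth)
```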
.. list-table::
:header-rows: 2
* -
-
- :py:mod:`driu <bob.ip.binseg.configs.models.driu>`
- :py:mod:`hed <bob.ip.binseg.configs.models.hed>`
- :py:mod:`m2unet <bob.ip.binseg.configs.models.m2unet>`
- :py:mod:`unet <bob.ip.binseg.configs.models.unet>`
* - Dataset
- 2nd. Annot.
- 15M
- 14.7M
- 0.55M
- 25.8M
* - :py:mod:`drive <bob.ip.binseg.configs.datasets.drive.covd>`
- 0.788 (0.021)
- `0.768 (0.031) <covd_driu_drive_>`_
- `0.750 (0.036) <covd_hed_drive_>`_
- `0.771 (0.027) <covd_m2unet_drive_>`_
- `0.775 (0.029) <covd_unet_drive_>`_
* - :py:mod:`stare <bob.ip.binseg.configs.datasets.stare.covd>`
- 0.759 (0.028)
- `0.786 (0.100) <covd_driu_stare_>`_
- `0.738 (0.193) <covd_hed_stare_>`_
- `0.800 (0.080) <covd_m2unet_stare_>`_
- `0.806 (0.072) <covd_unet_stare_>`_
* - :py:mod:`chasedb1 <bob.ip.binseg.configs.datasets.chasedb1.covd>`
- 0.768 (0.023)
- `0.778 (0.031) <covd_driu_chase_>`_
- `0.777 (0.028) <covd_hed_chase_>`_
- `0.776 (0.031) <covd_m2unet_chase_>`_
- `0.779 (0.028) <covd_unet_chase_>`_
* - :py:mod:`hrf <bob.ip.binseg.configs.datasets.hrf.covd>`
-
- `0.742 (0.049) <covd_driu_hrf_>`_
- `0.719 (0.047) <covd_hed_hrf_>`_
- `0.735 (0.045) <covd_m2unet_hrf_>`_
- `0.746 (0.046) <covd_unet_hrf_>`_
* - :py:mod:`iostar-vessel <bob.ip.binseg.configs.datasets.iostar.covd>`
-
- `0.790 (0.023) <covd_driu_iostar_>`_
- `0.792 (0.020) <covd_hed_iostar_>`_
- `0.788 (0.021) <covd_m2unet_iostar_>`_
- `0.783 (0.019) <covd_unet_iostar_>`_
.. include:: ../../links.rst
.. toctree::
baselines/index
xtest/index
covd/index
old/index
.. include:: ../links.rst
.. -*- coding: utf-8 -*-
.. _bob.ip.binseg.results.old:
.. todo::
This section is outdated and needs re-factoring.
============================
COVD- and COVD-SSL Results
============================
In addition to the M2U-Net architecture, we also evaluated the larger DRIU
network and a variation of it that contains batch normalization (DRIU+BN) on
COVD- (Combined Vessel Dataset from all training data minus target test set)
and COVD-SSL (COVD- and Semi-Supervised Learning). Perhaps surprisingly, for
the majority of combinations, the performance of the DRIU variants is roughly
equal to or worse than that obtained with the much smaller M2U-Net. We
anticipate that one reason for this could be the overparameterization of large
VGG-16 models that are pretrained on ImageNet.
F1 Scores
---------
Comparison of F1 Scores (micro-level) of DRIU and M2U-Net on COVD- and
COVD-SSL. Standard deviation across test images is shown in parentheses.
.. list-table::
:header-rows: 1
* - F1 score
- :py:mod:`DRIU <bob.ip.binseg.configs.models.driu>`/:py:mod:`DRIU@SSL <bob.ip.binseg.configs.models.driu_ssl>`
- :py:mod:`DRIU+BN <bob.ip.binseg.configs.models.driu_bn>`/:py:mod:`DRIU+BN@SSL <bob.ip.binseg.configs.models.driu_bn_ssl>`
- :py:mod:`M2U-Net <bob.ip.binseg.configs.models.m2unet>`/:py:mod:`M2U-Net@SSL <bob.ip.binseg.configs.models.m2unet_ssl>`
* - :py:mod:`COVD-DRIVE <bob.ip.binseg.configs.datasets.drive.covd>`
- 0.788 (0.018)
- 0.797 (0.019)
- `0.789 (0.018) <m2unet_covd-drive.pth>`_
* - :py:mod:`COVD-DRIVE+SSL <bob.ip.binseg.configs.datasets.drive.ssl>`
- 0.785 (0.018)
- 0.783 (0.019)
- `0.791 (0.014) <m2unet_covd-drive_ssl.pth>`_
* - :py:mod:`COVD-STARE <bob.ip.binseg.configs.datasets.stare.covd>`
- 0.778 (0.117)
- 0.778 (0.122)
- `0.812 (0.046) <m2unet_covd-stare.pth>`_
* - :py:mod:`COVD-STARE+SSL <bob.ip.binseg.configs.datasets.stare.ssl>`
- 0.788 (0.102)
- 0.811 (0.074)
- `0.820 (0.044) <m2unet_covd-stare_ssl.pth>`_
* - :py:mod:`COVD-CHASEDB1 <bob.ip.binseg.configs.datasets.chasedb1.covd>`
- 0.796 (0.027)
- 0.791 (0.025)
- `0.788 (0.024) <m2unet_covd-chasedb1.pth>`_
* - :py:mod:`COVD-CHASEDB1+SSL <bob.ip.binseg.configs.datasets.chasedb1.ssl>`
- 0.796 (0.024)
- 0.798 (0.025)
- `0.799 (0.026) <m2unet_covd-chasedb1_ssl.pth>`_
* - :py:mod:`COVD-HRF <bob.ip.binseg.configs.datasets.hrf.covd>`
- 0.799 (0.044)
- 0.800 (0.045)
- `0.802 (0.045) <m2unet_covd-hrf.pth>`_
* - :py:mod:`COVD-HRF+SSL <bob.ip.binseg.configs.datasets.hrf.ssl>`
- 0.799 (0.044)
- 0.784 (0.048)
- `0.797 (0.044) <m2unet_covd-hrf_ssl.pth>`_
* - :py:mod:`COVD-IOSTAR-VESSEL <bob.ip.binseg.configs.datasets.iostar.covd>`
- 0.791 (0.021)
- 0.777 (0.032)
- `0.793 (0.015) <m2unet_covd-iostar.pth>`_
* - :py:mod:`COVD-IOSTAR-VESSEL+SSL <bob.ip.binseg.configs.datasets.iostar.ssl>`
- 0.797 (0.017)
- 0.811 (0.074)
- `0.785 (0.018) <m2unet_covd-iostar_ssl.pth>`_
M2U-Net Precision vs. Recall Curves
-----------------------------------
Precision vs. recall curves for each evaluated dataset. Note that here the
F1-score is calculated on a macro level (see paper for more details).
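For reference, one common reading of the macro vs. micro distinction, shown on
synthetic data (see the paper for the exact protocol used here; all names are
illustrative):

```python
import numpy as np

def counts(pred, truth):
    # True positives, false positives, false negatives for one image.
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp, fp, fn

rng = np.random.default_rng(4)
images = []
for _ in range(3):
    truth = rng.random((16, 16)) < 0.2
    pred = truth ^ (rng.random((16, 16)) < 0.1)
    images.append((pred, truth))

# Micro: pool pixel counts over all images, then compute one F1.
totals = np.zeros(3, dtype=int)
per_image = []
for pred, truth in images:
    tp, fp, fn = counts(pred, truth)
    totals += (tp, fp, fn)
    per_image.append(2 * tp / (2 * tp + fp + fn))

tp, fp, fn = totals
micro_f1 = 2 * tp / (2 * tp + fp + fn)

# Macro: compute F1 per image first, then average the scores.
macro_f1 = float(np.mean(per_image))
```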
.. figure:: pr_CHASEDB1.png
:scale: 50 %
:align: center
:alt: model comparisons
CHASE_DB1: Precision vs Recall curve and F1 scores
.. figure:: pr_DRIVE.png
:scale: 50 %
:align: center
:alt: model comparisons
DRIVE: Precision vs Recall curve and F1 scores
.. figure:: pr_HRF.png
:scale: 50 %
:align: center
:alt: model comparisons
HRF: Precision vs Recall curve and F1 scores
.. figure:: pr_IOSTARVESSEL.png
:scale: 50 %
:align: center
:alt: model comparisons
IOSTAR: Precision vs Recall curve and F1 scores
.. figure:: pr_STARE.png
:scale: 50 %
:align: center
:alt: model comparisons
STARE: Precision vs Recall curve and F1 scores
.. include:: ../../links.rst