Compare revisions

Daniel CARRON
--- a/.gitignore
+++ b/.gitignore
@@ -26,3 +26,4 @@ _work/
 .mypy_cache/
 .pytest_cache/
 results*/
+trainlog.pdf
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -6,6 +6,10 @@
 # See https://pre-commit.com for more information
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
+  - repo: https://github.com/numpy/numpydoc
+    rev: v1.6.0
+    hooks:
+      - id: numpydoc-validation
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
@@ -14,6 +18,9 @@ repos:
    rev: v1.7.5
    hooks:
      - id: docformatter
+        args: [
+          --wrap-summaries=0,
+      ]
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:

--- a/.reuse/dep5
+++ b/.reuse/dep5
@@ -11,6 +11,7 @@ Files:
 doc/extras.inv
 doc/extras.txt
 doc/catalog.json
+ doc/img/*.png
 doc/usage/img/*.png
 doc/results/img/*.jpg
 doc/results/img/*.png

--- a/doc/api.rst
+++ b/doc/api.rst
@@ -42,9 +42,12 @@ CNN and other models implemented.
   mednet.models.pasa
   mednet.models.alexnet
   mednet.models.densenet
-   mednet.models.normalizer
   mednet.models.logistic_regression
+   mednet.models.loss_weights
   mednet.models.mlp
+   mednet.models.normalizer
+   mednet.models.separate
+   mednet.models.transforms
   mednet.models.typing


@@ -58,11 +61,12 @@ Functions to actuate on the data.
 .. autosummary::
   :toctree: api/engine

-   mednet.engine.device
   mednet.engine.callbacks
-   mednet.engine.trainer
-   mednet.engine.predictor
+   mednet.engine.device
   mednet.engine.evaluator
+   mednet.engine.loggers
+   mednet.engine.predictor
+   mednet.engine.trainer


 .. _mednet.api.saliency:
@@ -75,9 +79,11 @@ Engines to generate and analyze saliency mapping techniques.
 .. autosummary::
   :toctree: api/saliency

-   mednet.engine.saliency.generator
   mednet.engine.saliency.completeness
+   mednet.engine.saliency.evaluator
+   mednet.engine.saliency.generator
   mednet.engine.saliency.interpretability
+   mednet.engine.saliency.viewer


 .. _mednet.api.utils:

--- a/doc/catalog.json
+++ b/doc/catalog.json
@@ -14,6 +14,15 @@
      "environment": "lightning"
    }
  },
+  "tensorboardx": {
+    "versions": {
+      "stable": "https://tensorboardx.readthedocs.io/en/stable/",
+      "latest": "https://tensorboardx.readthedocs.io/en/latest/"
+    },
+    "sources": {
+      "readthedocs": "tensorboardx"
+    }
+  },
  "tabulate": {
    "versions": {
      "latest": "https://tabulate.readthedocs.io/en/latest/",

--- a/doc/conf.py
+++ b/doc/conf.py
@@ -123,6 +123,7 @@ auto_intersphinx_packages = [
    "torch",
    "torchvision",
    "lightning",
+    "tensorboardx",
    ("clapper", "latest"),
    ("python", "3"),
 ]

--- a/doc/config.rst
+++ b/doc/config.rst
@@ -8,7 +8,7 @@ Preset Configurations
 ---------------------

 This module contains preset configurations for baseline CNN architectures and
-datamodules.
+DataModules.


 .. _mednet.config.models:
@@ -38,9 +38,9 @@ DataModule support
 ==================

 Base DataModules and raw data loaders for the various databases currently
-supported in this package, for your reference.  Each pre-configured data module
+supported in this package, for your reference.  Each pre-configured DataModule
 can receive the name of one or more splits as argument to build a fully
-functional data module that can be used in training, prediction or testing.
+functional DataModule that can be used in training, prediction or testing.

 .. autosummary::
   :toctree: api/config.datamodules
@@ -67,7 +67,7 @@ Pre-configured DataModules

 DataModules provide access to preset pytorch dataloaders for training,
 validating, testing and running prediction tasks.  Each of the pre-configured
-DataModule is based on one (or more) of the :ref:`supported base data modules
+DataModule is based on one (or more) of the :ref:`supported base DataModules
 <mednet.config.datamodules>`.

 .. autosummary::
@@ -97,8 +97,8 @@ Cross-validation DataModules

 We support cross-validation with precise preset folds.  In this section, you
 will find the configuration for the first fold (fold-0) for all supported
-datamodules.  Nine other folds are available for every configuration (from 1 to
-9), making up 10 folds per supported datamodule.
+DataModules.  Nine other folds are available for every configuration (from 1 to
+9), making up 10 folds per supported DataModule.


 .. autosummary::

--- a/doc/contribute.rst
+++ b/doc/contribute.rst
+.. Copyright © 2022 Idiap Research Institute <contact@idiap.ch>
+..
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _mednet.contribute:
+
+===================================
+ Getting Involved and Contributing
+===================================
+
+We will happily accept external contributions, but substantial contributions
+require a signed Contributor or `Copyright License Agreement <cla_>`_ (CLA).
+Our CLA, based on `Project Harmony`_, leaves copyright with you (the
+contributor), but allows us to relicense the code, with a restriction based on
+the license the contribution was made under.
+
+Contact our `Technology Transfer Officer <tto_>`_ to get a copy of the CLA_ for
+this project.  If you work for a company and your contributions are tied to
+your job, ensure you have the legal right to sign this CLA, or refer to the
+responsible person during your e-mail exchange with our TTO_.
+
+
+.. include:: links.rst
--- a/doc/data-model.rst
+++ b/doc/data-model.rst
+.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
+..
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _mednet.datamodel:
+
+============
+ Data model
+============
+
+The data model implemented in this package is summarized in the following
+figure:
+
+.. figure:: img/data-model.png
+
+
+Each of the elements is described next.
+
+
+Database
+--------
+
+Data that is downloaded from a data provider, and contains samples in their raw
+data format. The database may contain both data and metadata, and is supposed
+to exist on disk (or any other storage device) in an arbitrary location that is
+user-configurable, in the user environment. For example, databases 1 and 2 for
+user A may be under ``/home/user-a/databases/database-1`` and
+``/home/user-a/databases/database-2``, while for user B, they may sit in
+``/groups/medical-data/DatabaseOne`` and ``/groups/medical-data/DatabaseTwo``.
+
+
+Sample
+------
+
+The in-memory representation of the raw database samples. In this package, it
+is specified as a two-tuple with a tensor, and metadata (typically label, name,
+etc.).
+
+
+RawDataLoader
+-------------
+
+A concrete "functor" that allows one to load the raw data and associated
+metadata, to create an in-memory Sample representation. RawDataLoaders are
+typically Database-specific due to raw data and metadata encoding varying quite
+a lot on different databases. RawDataLoaders may also embed various
+pre-processing transformations to render data readily usable such as
+pre-cropping of black pixel areas, or 16-bit to 8-bit auto-level conversion.
+
+
+TransformSequence
+-----------------
+
+A sequence of callables that allows one to transform torch.Tensor objects into
+other torch.Tensor objects, typically to crop, resize, or convert the color
+space of raw data.
+
+
+DatabaseSplit
+-------------
+
+A dictionary that represents an organization of the available raw data in the
+database to perform an evaluation protocol (e.g. train, validation, test)
+through datasets (or subsets). It is represented as a dictionary mapping dataset
+names to lists of "raw-data" sample representations, which vary in format
+depending on Database metadata availability. RawDataLoaders receive these raw
+representations and can convert them to in-memory Samples.
+
+
+ConcatDatabaseSplit
+-------------------
+
+An extension of a DatabaseSplit, in which the split can be formed by
+cannibalising various other DatabaseSplits to construct a new evaluation
+protocol. Examples of this are cross-database tests, or the construction of
+multi-Database training and validation subsets.
+
+
+Dataset
+-------
+
+An iterable object over in-memory Samples, inherited from the pytorch Dataset
+definition. A dataset in our framework may be completely cached in memory or
+have in-memory representation of samples loaded on demand. After data loading,
+our datasets can optionally apply a TransformSequence, composed of
+pre-processing steps defined on a per-model level before optionally caching
+in-memory Sample representations. The "raw" representation of a dataset
+consists of the split dictionary values (i.e. not the keys).
+
+
+DataModule
+----------
+
+A DataModule aggregates Splits and RawDataLoaders to provide lightning with a
+known interface to the complete evaluation protocol (train, validation,
+prediction and testing) required for a full experiment to take place. It
+automates control over data loading parallelisation and caching inside our
+framework, providing final access to readily-usable pytorch DataLoaders.
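The relationships described in ``doc/data-model.rst`` can be pictured with a minimal, dependency-free Python sketch. Everything named here (``ToyRawDataLoader``, ``build_dataset``, the ``(filename, label)`` raw format) is hypothetical and chosen only for illustration; the package itself stores a ``torch.Tensor`` in each Sample and wires these pieces together through a lightning DataModule.

```python
from typing import Any, Callable

# Hypothetical stand-ins for the concepts in doc/data-model.rst.
# A plain list of floats replaces torch.Tensor so the sketch runs anywhere.
Sample = tuple[list[float], dict[str, Any]]
DatabaseSplit = dict[str, list[tuple[str, int]]]


class ToyRawDataLoader:
    """Turn a raw (filename, label) description into an in-memory Sample."""

    def __init__(self, datadir: str) -> None:
        self.datadir = datadir  # user-configurable database location

    def sample(self, description: tuple[str, int]) -> Sample:
        name, label = description
        # A real loader would read the file under self.datadir from storage;
        # here a constant "image" is fabricated instead.
        return [0.0, 0.5, 1.0], {"label": label, "name": name}

    def label(self, description: tuple[str, int]) -> int:
        return description[1]


def invert(tensor: list[float]) -> list[float]:
    """Example transform: invert intensities in the [0, 1] range."""
    return [1.0 - value for value in tensor]


def build_dataset(
    raw: list[tuple[str, int]],
    loader: ToyRawDataLoader,
    transforms: list[Callable[[list[float]], list[float]]],
) -> list[Sample]:
    """Load every raw entry, then apply the transform sequence in order."""
    dataset = []
    for description in raw:
        tensor, metadata = loader.sample(description)
        for transform in transforms:
            tensor = transform(tensor)
        dataset.append((tensor, metadata))
    return dataset


# A split maps dataset names to lists of raw sample descriptions.
split: DatabaseSplit = {
    "train": [("img-0.png", 0), ("img-1.png", 1)],
    "test": [("img-2.png", 1)],
}
loader = ToyRawDataLoader("/home/user-a/databases/database-1")
datasets = {
    name: build_dataset(raw, loader, [invert]) for name, raw in split.items()
}
```

The split keys name the datasets, while the split values are the "raw" representations that the loader converts into Samples, which mirrors the DatabaseSplit and Dataset definitions in the new page.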
--- a/doc/img/data-model.dot
+++ b/doc/img/data-model.dot
+# SPDX-FileCopyrightText: Copyright © 2024 Idiap Research Institute <contact@idiap.ch>
+#
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+digraph G {
+    rankdir = T;
+
+    fontname = "Helvetica"
+
+    node [
+        fontname = "Helvetica"
+        shape = "record"
+    ]
+
+    edge [
+        fontname = "Helvetica"
+    ]
+
+    Database [
+        label = "Database\l(on storage)"
+        shape = "cylinder"
+    ]
+
+    DatabaseSplit [
+        label = "{DatabaseSplit|+ __init__(description: JSON)\l+ splits() : dict[str, list]\l}"
+    ]
+
+    RawDataLoader [
+        label = "{RawDataLoader|+ datadir : path\l|+ sample(description : JSON) : Sample \l+ label(description : JSON) : int\l}"
+    ]
+
+    DataModule [
+        label = "{DataModule|- datasets : dict[str, torch.Dataset]\l+ model_transforms : TransformSequence\l|+ setup(stage: str)\l+ train_dataloader() : DataLoader\l+ val_dataloader() : dict[str, DataLoader]\l+ test_dataloader() : dict[str, DataLoader]\l+ predict_dataloader() : dict[str, DataLoader]\l}"
+    ]
+
+    CachingDataModule [
+        label = "{CachingDataModule (lightning.DataModule)}"
+        style = "dashed"
+    ]
+
+    Sample [
+        label = "{Sample (tuple)|+ tensor: torch.Tensor\l+ metadata: dict[str, Any]\l}"
+    ]
+
+    DataLoader [
+        label = "{DataLoader (torch.DataLoader)|+ __getitem__(key: int)\l+ __iter__()\l}"
+    ]
+
+    edge [
+        arrowhead = "empty"
+    ]
+
+    DataModule -> CachingDataModule
+
+    edge [
+        arrowhead = "diamond"
+        taillabel = "1..1"
+    ]
+
+    DatabaseSplit -> DataModule
+    RawDataLoader -> DataModule
+
+    edge [
+        arrowhead = "diamond"
+        taillabel = "1..*"
+    ]
+
+    Sample -> DataLoader
+
+    edge [
+        arrowhead = "none"
+        taillabel = ""
+        label = "generates"
+    ]
+
+    DataModule -> DataLoader
+
+    edge [
+        arrowhead = "none"
+        headlabel = "1..1"
+        label = "reads"
+    ]
+
+    RawDataLoader -> Database
+
+    { rank = same; Database; CachingDataModule; Sample; }
+    { rank = same; RawDataLoader; DatabaseSplit; DataLoader; }
+
+}
--- a/doc/img/data-model.png
+++ b/doc/img/data-model.png
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -52,10 +52,12 @@ User Guide
   install
   usage/index
   results/index
+   data-model
   references
   cli
   config
   api
+   contribute


 Indices and tables

--- a/doc/install.rst
+++ b/doc/install.rst
@@ -90,10 +90,10 @@ Here is an example configuration file that may be useful as a starting point:

   .. code:: sh

-      mednet dataset list
+      mednet database list


-   You must procure and download datasets by yourself.  The raw data is not
+   You must procure and download databases by yourself.  The raw data is not
   included in this package as we are not authorised to redistribute it.

   To check whether the downloaded version is consistent with the structure
@@ -101,7 +101,7 @@ Here is an example configuration file that may be useful as a starting point:

   .. code:: sh

-      mednet dataset check montgomery
+      mednet database check <database_name>


 .. _mednet.setup.databases:
@@ -109,8 +109,8 @@ Here is an example configuration file that may be useful as a starting point:
 Supported Databases
 ===================

-Here is a list of currently supported datasets in this package, alongside
-notable properties.  Each dataset name is linked to the location where
+Here is a list of currently supported databases in this package, alongside
+notable properties.  Each database name is linked to the location where
 raw data can be downloaded.  The list of images in each split is available
 in the source code.

@@ -120,13 +120,13 @@ in the source code.
 Tuberculosis databases
 ~~~~~~~~~~~~~~~~~~~~~~

-The following datasets contain only the tuberculosis final diagnosis (0 or 1).
+The following databases contain only the tuberculosis final diagnosis (0 or 1).
 In addition to the splits presented in the following table, 10 folds
-(for cross-validation) randomly generated are available for these datasets.
+(for cross-validation) randomly generated are available for these databases.

 .. list-table::

-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -156,20 +156,20 @@ In addition to the splits presented in the following table, 10 folds
     - 52


-.. _mednet.setup.datasets.tb+signs:
+.. _mednet.setup.databases.tb+signs:

 Tuberculosis multilabel databases
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The following dataset contains the labels healthy, sick & non-TB, active TB,
-and latent TB. The implemented tbx11k dataset in this package is based on
+The following databases contain the labels healthy, sick & non-TB, active TB,
+and latent TB. The implemented tbx11k database in this package is based on
 the simplified version, which is just a more compact version of the original.
 In addition to the splits presented in the following table, 10 folds
-(for cross-validation) randomly generated are available for these datasets.
+(for cross-validation) randomly generated are available for these databases.

 .. list-table::

-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -192,17 +192,17 @@ In addition to the splits presented in the following table, 10 folds
     - 2800


-.. _mednet.setup.datasets.tbmultilabel+signs:
+.. _mednet.setup.databases.tbmultilabel+signs:

-Tuberculosis + radiological findings dataset
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tuberculosis + radiological findings databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The following dataset contains both the tuberculosis final diagnosis (0 or 1)
+The following databases contain both the tuberculosis final diagnosis (0 or 1)
 and radiological findings.

 .. list-table::

-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -216,12 +216,12 @@ and radiological findings.
     - 0


-.. _mednet.setup.datasets.signs:
+.. _mednet.setup.databases.signs:

-Radiological findings datasets
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Radiological findings databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The following dataset contains only the radiological findings without any
+The following database contains only the radiological findings without any
 information about tuberculosis.

 .. note::
@@ -231,7 +231,7 @@ information about tuberculosis.

 .. list-table::

-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -247,20 +247,20 @@ information about tuberculosis.
     - 4'054


-.. _mednet.setup.datasets.hiv-tb:
+.. _mednet.setup.databases.hiv-tb:

-HIV-Tuberculosis datasets
-~~~~~~~~~~~~~~~~~~~~~~~~~
+HIV-Tuberculosis databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~

-The following datasets contain only the tuberculosis final diagnosis (0 or 1)
+The following databases contain only the tuberculosis final diagnosis (0 or 1)
 and come from HIV infected patients. 10 folds (for cross-validation) randomly
-generated are available for these datasets.
+generated are available for these databases.

-Please contact the authors of these datasets to have access to the data.
+Please contact the authors of these databases to have access to the data.

 .. list-table::

-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples

--- a/doc/links.rst
+++ b/doc/links.rst
@@ -5,8 +5,11 @@
 .. place re-used URLs here, then include this file
 .. on your other RST sources.

-.. _conda: https://conda.io
 .. _idiap: http://www.idiap.ch
+.. _cla: https://en.wikipedia.org/wiki/Contributor_License_Agreement
+.. _project harmony: http://www.harmonyagreements.org/
+.. _tto: mailto:tto@idiap.ch
+.. _conda: https://conda.io
 .. _python: http://www.python.org
 .. _pip: https://pip.pypa.io/en/stable/
 .. _mamba: https://mamba.readthedocs.io/en/latest/index.html

--- a/doc/usage/aggregpred.rst
+++ b/doc/usage/aggregpred.rst
-.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
-..
-.. SPDX-License-Identifier: GPL-3.0-or-later
-
-.. _mednet.usage.aggregpred:
-
-=======================================================
- Aggregate multiple prediction files into a single one
-=======================================================
-
-This guide explains how to aggregate multiple prediction files into a single
-one. It can be used when doing cross-validation to aggregate the predictions of
-k different models before evaluating the aggregated predictions. We input
-multiple prediction files (CSV files) and output a single one.
-
-Use the sub-command :ref:`aggregpred <mednet.cli>` aggregate your prediction
-files together:
-
-.. code:: sh
-
-   mednet aggregpred -vv path/to/fold0/predictions.csv path/to/fold1/predictions.csv --output-folder=aggregpred
-
-
-.. include:: ../links.rst
--- a/doc/usage/evaluation.rst
+++ b/doc/usage/evaluation.rst
@@ -8,7 +8,7 @@
 Inference and Evaluation
 ==========================

-This guides explains how to run inference or a complete evaluation using
+This guide explains how to run inference or a complete evaluation using
 command-line tools.  Inference produces probability of TB presence for input
 images, while evaluation will analyze such output against existing annotations
 and produce performance figures.
@@ -17,61 +17,42 @@ and produce performance figures.
 Inference
 ---------

-In inference (or prediction) mode, we input data, the trained model, and output
-a CSV file containing the prediction outputs for every input image.
+In inference (or prediction) mode, we input a model, a dataset, and a model checkpoint generated during training, and output
+a JSON file containing the prediction outputs for every input image.

-To run inference, use the sub-command :ref:`predict <mednet.cli>` to run
-prediction on an existing dataset:
-
-.. code:: sh
-
-   mednet predict -vv <model> -w <path/to/model.pth> <dataset>
+To run inference, use the sub-command :ref:`predict <mednet.cli>`.

+Examples
+========

-Replace ``<model>`` and ``<dataset>`` by the appropriate :ref:`configuration
-files <mednet.config>`.  Replace ``<path/to/model.pth>`` to a path leading to
-the pre-trained model.
+To run inference using a trained Pasa CNN on the Montgomery dataset:

-.. tip::
+.. code:: sh

-   An option to generate grad-CAMs is available for the :py:mod:`DensenetRS
-   <mednet.config.models.densenet_rs>` model. To activate it, use the
-   ``--grad-cams`` argument.
+   mednet predict -vv pasa montgomery --weight=<path/to/model.ckpt> --output=<results/folder/predictions.json>

-.. tip::

-   An option to generate a relevance analysis plot is available. To activate
-   it, use the ``--relevance-analysis`` argument.
+Replace ``<path/to/model.ckpt>`` with a path leading to the pre-trained model.


 Evaluation
 ----------

-In evaluation, we input a dataset and predictions to generate performance
-summaries that help analysis of a trained model.  Evaluation is done using the
-:ref:`evaluate command <mednet.cli>` followed by the model and the annotated
-dataset configuration, and the path to the pretrained weights via the
-``--weight`` argument.
+In evaluation, we input predictions to generate performance summaries that help analyze a trained model.
+The generated files are a PDF containing various plots and a table of metrics for each dataset split.
+Evaluation is done using the :ref:`evaluate command <mednet.cli>` followed by the JSON file generated during
+the inference step and a threshold.

 Use ``mednet evaluate --help`` for more information.

-E.g. run evaluation on predictions from the Montgomery set, do the following:
-
-.. code:: sh
-
-   mednet evaluate -vv montgomery -p /predictions/folder -o /eval/results/folder
-
-
-Comparing Systems
-----------------
+Examples
+========

-To compare multiple systems together and generate combined plots and tables,
-use the :ref:`compare command <mednet.cli>`.  Use ``--help`` for a quick
-guide.
+To run evaluation on predictions generated in the inference step, using an optimal threshold computed from the validation set, do the following:

 .. code:: sh

-   mednet compare -vv A A/metrics.csv B B/metrics.csv --output-figure=plot.pdf --output-table=table.txt --threshold=0.5
+   mednet evaluate -vv --predictions=<path/to/predictions.json> --output-folder=<results/folder> --threshold=validation


 .. include:: ../links.rst
--- a/doc/usage/experiment.rst
+++ b/doc/usage/experiment.rst
+.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
+..
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _mednet.experiment:
+
+==============================
+ Running complete experiments
+==============================
+
+We provide an :ref:`experiment command <mednet.cli>`
+that runs training, followed by prediction and evaluation.
+After running, you will be able to find results from model fitting,
+prediction and evaluation under a single output directory.
+
+For example, to train a pasa model on the montgomery database,
+evaluate its performance, and output predictions and performance curves,
+run the following:
+
+.. code-block:: sh
+
+   $ mednet experiment -vv pasa montgomery
+   # check results in the "results" folder
+
+You may run the system on a GPU by using the ``--device=cuda:0`` option.
+
+
+.. include:: ../links.rst
--- a/doc/usage/index.rst
+++ b/doc/usage/index.rst
@@ -16,6 +16,7 @@ tuberculosis detection with support for the following activities.

 .. _mednet.usage.direct-detection:

+
 Direct detection
 ----------------

@@ -24,33 +25,31 @@ Direct detection
  automatically, via error back propagation. The objective of this phase is to
  produce a CNN model.
 * Inference (prediction): The CNN is used to generate TB predictions.
-* Evaluation: Predications are used to evaluate CNN performance against
+* Evaluation: Predictions are used to evaluate CNN performance against
  provided annotations, and to generate measure files and score tables. Optimal
-  thresholds are also calculated.
-* Comparison: Use predictions results to compare performance of multiple
-  systems.
+  thresholds can also be calculated.


-.. _mednet.usage.indirect-detection:
+.. \_mednet.usage.indirect-detection:

-Indirect detection
------------------
+.. Indirect detection
+  ------------------

-* Training (step 1): Images are fed to a Convolutional Neural Network (CNN),
+.. * Training (step 1): Images are fed to a Convolutional Neural Network (CNN),
  that is trained to detect the presence of radiological signs
  automatically, via error back propagation. The objective of this phase is to
  produce a CNN model.
-* Inference (prediction): The CNN is used to generate radiological signs
-  predictions.
-* Conversion of the radiological signs predictions into a new dataset.
-* Training (step 2): Radiological signs are fed to a shallow network, that is
-  trained to detect the presence of tuberculosis automatically, via error back
-  propagation. The objective of this phase is to produce a shallow model.
-* Inference (prediction): The shallow model is used to generate TB predictions.
-* Evaluation: Predications are used to evaluate CNN performance against
-  provided annotations, and to generate measure files and score tables.
-* Comparison: Use predictions results to compare performance of multiple
-  systems.
+  * Inference (prediction): The CNN is used to generate radiological signs
+    predictions.
+  * Conversion of the radiological signs predictions into a new dataset.
+  * Training (step 2): Radiological signs are fed to a shallow network, that is
+    trained to detect the presence of tuberculosis automatically, via error back
+    propagation. The objective of this phase is to produce a shallow model.
+  * Inference (prediction): The shallow model is used to generate TB predictions.
+  * Evaluation: Predictions are used to evaluate CNN performance against
+    provided annotations, and to generate measure files and score tables.
+  * Comparison: Use predictions results to compare performance of multiple
+    systems.

 We provide :ref:`command-line interfaces (CLI) <mednet.cli>` that implement
 each of the phases above. This interface is configurable using :ref:`clapper's
@@ -63,7 +62,7 @@ to an application.

   For reproducibility, we recommend you stick to configuration files when
   parameterizing our CLI. Notice some of the options in the CLI interface
-   (e.g. ``--dataset``) cannot be passed via the actual command-line as it
+   (e.g. ``--datamodule``) cannot be passed via the actual command-line as it
   may require complex Python types that cannot be synthetized in a single
   input parameter.

@@ -80,12 +79,12 @@ Commands
 --------

 .. toctree::
-   :maxdepth: 2
+  :maxdepth: 2

-   training
-   evaluation
-   predtojson
-   aggregpred
+  experiment
+  training
+  evaluation
+  saliency


 .. include:: ../links.rst
--- a/doc/usage/predtojson.rst
+++ b/doc/usage/predtojson.rst
-.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
-..
-.. SPDX-License-Identifier: GPL-3.0-or-later
-
-.. _mednet.usage.predtojson:
-
-========================================
- Converting predictions to JSON dataset
-========================================
-
-This guide explains how to convert radiological signs predictions from a model
-into a JSON dataset. It can be used to create new versions of TB datasets with
-the predicted radiological signs to be able to use a shallow model. We input
-predictions (CSV files) and output a ``dataset.json`` file.
-
-Use the sub-command :ref:`predtojson <mednet.cli>` to create your JSON dataset
-file:
-
-.. code:: sh
-
-   mednet predtojson -vv train train/predictions.csv test test/predictions.csv --output-folder=pred_to_json
-
-
-.. include:: ../links.rst
--- a/doc/usage/saliency.rst
+++ b/doc/usage/saliency.rst
+.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
+..
+.. SPDX-License-Identifier: GPL-3.0-or-later
+
+.. _mednet.usage.saliency:
+
+==========
+ Saliency
+==========
+
+A saliency map highlights areas of interest within an image. In the context of TB detection, this would be the locations in a chest X-ray image where tuberculosis is present.
+
+This package provides scripts that can generate saliency maps and compute relevant metrics for interpretability purposes.
+
+Some of the scripts require the use of a database with human-annotated saliency information.
+
+Generation
+----------
+
+Saliency maps can be generated with the :ref:`saliency generate command <mednet.cli>`.
+They are represented as numpy arrays of the same size as the images, with values in the range [0, 1], and saved in .npy files.
+
+Several mapping algorithms are available to choose from, which can be specified with the -s option.
+
+Examples
+========
+
+Generates saliency maps for all prediction dataloaders on a DataModule,
+using a pre-trained pasa model, and saves them as numpy-pickled
+objects in the output directory:
+
+.. code:: sh
+
+   mednet saliency generate -vv pasa tbx11k-v1-healthy-vs-atb --weight=path/to/model-at-lowest-validation-loss.ckpt --output-folder=path/to/output
+
+Viewing
+-------
+
+To overlay saliency maps over the original images, use the :ref:`saliency view command <mednet.cli>`.
+Results are saved as PNG images in which brighter pixels correspond to areas with higher saliency.
+
+Examples
+========
+
+Generates visualizations in the form of heatmaps from existing saliency maps for a dataset configuration:
+
+.. code:: sh
+
+    # input-folder is the location of the saliency maps created with `mednet saliency generate`
+    mednet saliency view -vv pasa tbx11k-v1-healthy-vs-atb --input-folder=parent_folder/gradcam/ --output-folder=path/to/visualizations
+
+
+Interpretability
+----------------
+
+Given a target label, the interpretability step computes the proportional energy and average saliency focus in a DataModule.
+
+The proportional energy is defined as the quantity of activation that lies within the ground truth boxes compared to the total sum of the activations.
+The average saliency focus is the sum of the values of the saliency map over the ground-truth bounding boxes, normalized by the total area covered by all ground-truth bounding boxes.
+
+This requires a DataModule containing human-annotated bounding boxes.
+
+Examples
+========
+
+Evaluate the generated saliency maps for their localization performance:
+
+.. code:: sh
+
+    mednet saliency interpretability -vv tbx11k-v1-healthy-vs-atb --input-folder=parent-folder/saliencies/ --output-json=path/to/interpretability-scores.json
+
+
+Completeness
+------------
+The saliency completeness script computes ROAD scores of saliency maps and saves them in a .json file.
+
+The ROAD algorithm estimates the explainability (in the completeness sense) of saliency maps by substituting
+relevant pixels in the input image by a local average, re-running prediction on the altered image,
+and measuring changes in the output classification score when said perturbations are in place.
+By substituting most or least relevant pixels with surrounding averages, the ROAD algorithm estimates
+the importance of such elements in the produced saliency map.
+
+More information can be found in [ROAD-2022]_.
+
+This requires a DataModule containing human-annotated bounding boxes.
+
+Examples
+========
+
+Calculates the ROAD scores for an existing dataset configuration and stores them in .json files:
+
+.. code:: sh
+
+    mednet saliency completeness -vv pasa tbx11k-v1-healthy-vs-atb --device="cuda:0" --weight=path/to/model-at-lowest-validation-loss.ckpt --output-json=path/to/completeness-scores.json
+
+
+Evaluation
+----------
+The saliency evaluation step generates tables and plots from the results of the interpretability and completeness steps.
+
+Examples
+========
+
+Tabulates and generates plots for two saliency map algorithms:
+
+.. code:: sh
+
+    mednet saliency evaluate -vv -e gradcam path/to/gradcam-completeness.json path/to/gradcam-interpretability.json -e gradcam++ path/to/gradcam++-completeness.json path/to/gradcam++-interpretability.json
+
+.. include:: ../links.rst
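The two interpretability measures defined in ``doc/usage/saliency.rst`` (proportional energy and average saliency focus) can be made concrete with a small pure-Python sketch. It is not the package's implementation: the function names and the ``(x, y, width, height)`` box format are assumptions made for illustration, and the saliency map is a nested list rather than a numpy array.

```python
def box_mask(height, width, boxes):
    """Return a mask that is True inside any (x, y, w, h) ground-truth box.

    The (x, y, w, h) box format is an assumption of this sketch.
    Overlapping boxes are merged, so each pixel is counted only once.
    """
    mask = [[False] * width for _ in range(height)]
    for x, y, w, h in boxes:
        for row in range(y, min(y + h, height)):
            for col in range(x, min(x + w, width)):
                mask[row][col] = True
    return mask


def _energy_inside(saliency, mask):
    """Sum the saliency values that fall inside the masked area."""
    return sum(
        value
        for srow, mrow in zip(saliency, mask)
        for value, inside in zip(srow, mrow)
        if inside
    )


def proportional_energy(saliency, mask):
    """Activation inside the boxes divided by the total activation."""
    total = sum(sum(row) for row in saliency)
    return _energy_inside(saliency, mask) / total if total > 0 else 0.0


def average_saliency_focus(saliency, mask):
    """Activation inside the boxes divided by the area the boxes cover."""
    area = sum(inside for row in mask for inside in row)
    return _energy_inside(saliency, mask) / area if area > 0 else 0.0


# A 2x3 toy saliency map with values in [0, 1] and one 1x2 box on column 0.
saliency = [
    [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.0],
]
mask = box_mask(height=2, width=3, boxes=[(0, 0, 1, 2)])
energy = proportional_energy(saliency, mask)
focus = average_saliency_focus(saliency, mask)
```

In this toy case the box holds 1.5 of the 4.0 total activation over 2 pixels, so the two measures differ: the first is normalized by total activation, the second by ground-truth area.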