Commit 10917f33 authored by Daniel CARRON

[doc] Update installation and usage

Update the installation and usage documentation to reflect changes made
in recent merges. Instructions relating to removed functionality have
been removed. Instructions relating to RS datasets have been commented
out for now, as they need more work.
parent 6ce6f1a6
Merge request !15: Update documentation
@@ -90,10 +90,10 @@ Here is an example configuration file that may be useful as a starting point:

.. code:: sh

   mednet database list

You must procure and download databases by yourself. The raw data is not
included in this package as we are not authorised to redistribute it.

To check whether the downloaded version is consistent with the structure
@@ -101,7 +101,7 @@ Here is an example configuration file that may be useful as a starting point:

.. code:: sh

   mednet database check <database_name>

.. _mednet.setup.databases:
@@ -109,8 +109,8 @@ Here is an example configuration file that may be useful as a starting point:

Supported Databases
===================

Here is a list of currently supported databases in this package, alongside
notable properties. Each database name is linked to the location where
raw data can be downloaded. The list of images in each split is available
in the source code.
@@ -120,13 +120,13 @@ in the source code.

Tuberculosis databases
~~~~~~~~~~~~~~~~~~~~~~

The following databases contain only the tuberculosis final diagnosis (0 or 1).
In addition to the splits presented in the following table, 10 randomly
generated folds (for cross-validation) are available for these databases.

.. list-table::

   * - Database
     - Reference
     - H x W
     - Samples
@@ -156,20 +156,20 @@ In addition to the splits presented in the following table, 10 folds

     - 52

.. _mednet.setup.databases.tb+signs:
Tuberculosis multilabel databases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following databases contain the labels healthy, sick & non-TB, active TB,
and latent TB. The implemented tbx11k database in this package is based on
the simplified version, which is just a more compact version of the original.

In addition to the splits presented in the following table, 10 randomly
generated folds (for cross-validation) are available for these databases.

.. list-table::

   * - Database
     - Reference
     - H x W
     - Samples
@@ -192,17 +192,17 @@ In addition to the splits presented in the following table, 10 folds

     - 2800

.. _mednet.setup.databases.tbmultilabel+signs:
Tuberculosis + radiological findings databases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following databases contain both the tuberculosis final diagnosis (0 or 1)
and radiological findings.

.. list-table::

   * - Database
     - Reference
     - H x W
     - Samples
@@ -216,12 +216,12 @@ and radiological findings.

     - 0

.. _mednet.setup.databases.signs:
Radiological findings databases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following database contains only the radiological findings, without any
information about tuberculosis.

.. note::
@@ -231,7 +231,7 @@ information about tuberculosis.

.. list-table::

   * - Database
     - Reference
     - H x W
     - Samples
@@ -247,20 +247,20 @@ information about tuberculosis.

     - 4'054

.. _mednet.setup.databases.hiv-tb:
HIV-Tuberculosis databases
~~~~~~~~~~~~~~~~~~~~~~~~~~

The following databases contain only the tuberculosis final diagnosis (0 or 1)
and come from HIV-infected patients. 10 randomly generated folds (for
cross-validation) are available for these databases.

Please contact the authors of these databases to have access to the data.

.. list-table::

   * - Database
     - Reference
     - H x W
     - Samples
.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
..
.. SPDX-License-Identifier: GPL-3.0-or-later
.. _mednet.usage.aggregpred:
=======================================================
Aggregate multiple prediction files into a single one
=======================================================
This guide explains how to aggregate multiple prediction files into a single
one. It can be used when doing cross-validation to aggregate the predictions of
k different models before evaluating the aggregated predictions. We input
multiple prediction files (CSV files) and output a single one.
Use the sub-command :ref:`aggregpred <mednet.cli>` to aggregate your
prediction files together:
.. code:: sh

   mednet aggregpred -vv path/to/fold0/predictions.csv path/to/fold1/predictions.csv --output-folder=aggregpred
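When aggregating a full cross-validation run, a shell glob avoids listing each
fold by hand. This is a sketch under the assumption that fold outputs live in
sibling folders named ``fold0`` to ``fold9``, each holding a
``predictions.csv`` (adapt the pattern to your actual layout):

.. code:: sh

   # assumption: folders fold0/ .. fold9/ each contain a predictions.csv
   mednet aggregpred -vv fold*/predictions.csv --output-folder=aggregpred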
.. include:: ../links.rst
@@ -8,7 +8,7 @@

Inference and Evaluation
==========================

This guide explains how to run inference or a complete evaluation using
command-line tools. Inference produces the probability of TB presence for input
images, while evaluation will analyze such output against existing annotations
and produce performance figures.
@@ -17,61 +17,42 @@ and produce performance figures.
Inference
---------

In inference (or prediction) mode, we input a model, a dataset, and a model
checkpoint generated during training, and output a JSON file containing the
prediction outputs for every input image.

To run inference, use the sub-command :ref:`predict <mednet.cli>`.
Examples
========

To run inference using a trained Pasa CNN on the Montgomery dataset:

.. code:: sh

   mednet predict -vv pasa montgomery --weight=<path/to/model.ckpt> --output=<results/folder/predictions.json>

Replace ``<path/to/model.ckpt>`` with a path leading to the pre-trained model.
Evaluation
----------

In evaluation, we input predictions to generate performance summaries that
help with the analysis of a trained model. The generated files are a PDF
containing various plots, and a table of metrics for each dataset split.
Evaluation is done using the :ref:`evaluate command <mednet.cli>`, followed
by the JSON file generated during the inference step and a threshold.

Use ``mednet evaluate --help`` for more information.
Examples
========

To run evaluation on predictions generated in the inference step, using an
optimal threshold computed from the validation set, do the following:

.. code:: sh

   mednet evaluate -vv --predictions=<path/to/predictions.json> --output-folder=<results/folder> --threshold=validation
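If you already know the operating point you want, a fixed numeric threshold can
presumably be passed in place of ``validation``; this variant is an assumption
extrapolated from the option above, not a documented guarantee:

.. code:: sh

   # assumption: --threshold also accepts a fixed value such as 0.5
   mednet evaluate -vv --predictions=<path/to/predictions.json> --output-folder=<results/folder> --threshold=0.5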
.. include:: ../links.rst
@@ -16,6 +16,7 @@ tuberculosis detection with support for the following activities.

.. _mednet.usage.direct-detection:

Direct detection
----------------
@@ -24,33 +25,31 @@ Direct detection

automatically, via error back propagation. The objective of this phase is to
produce a CNN model.

* Inference (prediction): The CNN is used to generate TB predictions.
* Evaluation: Predictions are used to evaluate CNN performance against
  provided annotations, and to generate measure files and score tables. Optimal
  thresholds can also be calculated.
.. \_mednet.usage.indirect-detection:

.. Indirect detection
   ------------------

.. * Training (step 1): Images are fed to a Convolutional Neural Network (CNN),
     that is trained to detect the presence of radiological signs
     automatically, via error back propagation. The objective of this phase is to
     produce a CNN model.
   * Inference (prediction): The CNN is used to generate radiological signs
     predictions.
   * Conversion of the radiological signs predictions into a new dataset.
   * Training (step 2): Radiological signs are fed to a shallow network, that is
     trained to detect the presence of tuberculosis automatically, via error back
     propagation. The objective of this phase is to produce a shallow model.
   * Inference (prediction): The shallow model is used to generate TB predictions.
   * Evaluation: Predictions are used to evaluate CNN performance against
     provided annotations, and to generate measure files and score tables.
   * Comparison: Use prediction results to compare performance of multiple
     systems.
We provide :ref:`command-line interfaces (CLI) <mednet.cli>` that implement
each of the phases above. This interface is configurable using :ref:`clapper's
@@ -63,7 +62,7 @@ to an application.
For reproducibility, we recommend you stick to configuration files when
parameterizing our CLI. Notice some of the options in the CLI interface
(e.g. ``--datamodule``) cannot be passed via the actual command-line, as they
may require complex Python types that cannot be synthesized in a single
input parameter.
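As a hedged sketch of what this looks like in practice: clapper-style CLIs
generally accept a path to a local Python module wherever a named configuration
is expected, so a custom datamodule would be supplied as a file rather than a
flag (``my_datamodule.py`` and its contents are hypothetical here):

.. code:: sh

   # assumption: a local Python configuration file may replace a built-in name
   mednet train -vv pasa my_datamodule.py --batch-size=4 --epochs=150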
@@ -84,8 +83,6 @@ Commands

   training
   evaluation
.. include:: ../links.rst
.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
..
.. SPDX-License-Identifier: GPL-3.0-or-later
.. _mednet.usage.predtojson:
========================================
Converting predictions to JSON dataset
========================================
This guide explains how to convert radiological signs predictions from a model
into a JSON dataset. It can be used to create new versions of TB datasets with
the predicted radiological signs to be able to use a shallow model. We input
predictions (CSV files) and output a ``dataset.json`` file.
Use the sub-command :ref:`predtojson <mednet.cli>` to create your JSON dataset
file:
.. code:: sh

   mednet predtojson -vv train train/predictions.csv test test/predictions.csv --output-folder=pred_to_json
.. include:: ../links.rst
@@ -19,7 +19,7 @@ containing more detailed instructions.

.. tip::

   We strongly advise training with a GPU (using ``--device="cuda:0"``).
   Depending on the available GPU memory you might have to adjust your batch
   size (``--batch-size``).
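Putting the tip into practice, a training run pinned to the first CUDA device
with a reduced batch size might look as follows (the flag values here are
illustrative only):

.. code:: sh

   # illustrative values: smaller batch to fit limited GPU memory
   mednet train -vv pasa montgomery --batch-size=2 --epochs=150 --device="cuda:0"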
@@ -33,41 +33,35 @@ To train Pasa CNN on the Montgomery dataset:

   mednet train -vv pasa montgomery --batch-size=4 --epochs=150
.. Logistic regressor or shallow network
   -------------------------------------
To train a logistic regressor or a shallow network, use the command-line
interface (CLI) application ``mednet train``, available on your prompt. To use
this CLI, you must define the input dataset that will be used to train the
model, as well as the type of model that will be trained.

You may issue ``mednet train --help`` for a help message containing more
detailed instructions.
Examples
========

To train a logistic regressor using predictions from DensenetForRS on the
Montgomery dataset:

.. code:: sh

   mednet train -vv logistic_regression montgomery_rs --batch-size=4 --epochs=20
To train a multi-layer perceptron (MLP) using predictions from a densenet
pre-trained to detect radiological findings (using NIH CXR-14), on the Shenzhen
dataset:

.. code:: sh

   mednet train -vv mlp shenzhen_rs --batch-size=4 --epochs=20
.. include:: ../links.rst