diff --git a/doc/install.rst b/doc/install.rst
index 151bfef284149a33451542f16b5579f8bf487eb5..aaf5baaae32b3bdf6319edc2c31e393ef3e7c75e 100644
--- a/doc/install.rst
+++ b/doc/install.rst
@@ -90,10 +90,10 @@ Here is an example configuration file that may be useful as a starting point:
 
    .. code:: sh
 
-      mednet dataset list
+      mednet database list
 
-   You must procure and download datasets by yourself. The raw data is not
+   You must procure and download databases by yourself. The raw data is not
    included in this package as we are not authorised to redistribute it.
 
    To check whether the downloaded version is consistent with the structure
@@ -101,7 +101,7 @@ Here is an example configuration file that may be useful as a starting point:
 
    .. code:: sh
 
-      mednet dataset check montgomery
+      mednet database check <database_name>
 
 
 .. _mednet.setup.databases:
@@ -109,8 +109,8 @@ Here is an example configuration file that may be useful as a starting point:
 Supported Databases
 ===================
 
-Here is a list of currently supported datasets in this package, alongside
-notable properties. Each dataset name is linked to the location where
+Here is a list of currently supported databases in this package, alongside
+notable properties. Each database name is linked to the location where
 raw data can be downloaded. The list of images in each split is available
 in the source code.
 
@@ -120,13 +120,13 @@ in the source code.
 Tuberculosis databases
 ~~~~~~~~~~~~~~~~~~~~~~
 
-The following datasets contain only the tuberculosis final diagnosis (0 or 1).
+The following databases contain only the tuberculosis final diagnosis (0 or 1).
 In addition to the splits presented in the following table, 10 folds
-(for cross-validation) randomly generated are available for these datasets.
+(for cross-validation) randomly generated are available for these databases.
 
 .. list-table::
 
-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -156,20 +156,20 @@ In addition to the splits presented in the following table, 10 folds
     - 52
 
 
-.. _mednet.setup.datasets.tb+signs:
+.. _mednet.setup.databases.tb+signs:
 
 Tuberculosis multilabel databases
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The following dataset contains the labels healthy, sick & non-TB, active TB,
-and latent TB. The implemented tbx11k dataset in this package is based on
+The following databases contain the labels healthy, sick & non-TB, active TB,
+and latent TB. The implemented tbx11k database in this package is based on
 the simplified version, which is just a more compact version of the original.
 In addition to the splits presented in the following table, 10 folds
-(for cross-validation) randomly generated are available for these datasets.
+(for cross-validation) randomly generated are available for these databases.
 
 .. list-table::
 
-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -192,17 +192,17 @@ In addition to the splits presented in the following table, 10 folds
     - 2800
 
 
-.. _mednet.setup.datasets.tbmultilabel+signs:
+.. _mednet.setup.databases.tbmultilabel+signs:
 
-Tuberculosis + radiological findings dataset
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tuberculosis + radiological findings databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The following dataset contains both the tuberculosis final diagnosis (0 or 1)
+The following databases contain both the tuberculosis final diagnosis (0 or 1)
 and radiological findings.
 
 .. list-table::
 
-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -216,12 +216,12 @@ and radiological findings.
     - 0
 
 
-.. _mednet.setup.datasets.signs:
+.. _mednet.setup.databases.signs:
 
-Radiological findings datasets
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Radiological findings databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The following dataset contains only the radiological findings without any
+The following database contains only the radiological findings without any
 information about tuberculosis.
 
 .. note::
@@ -231,7 +231,7 @@ information about tuberculosis.
 
 .. list-table::
 
-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
@@ -247,20 +247,20 @@ information about tuberculosis.
     - 4'054
 
 
-.. _mednet.setup.datasets.hiv-tb:
+.. _mednet.setup.databases.hiv-tb:
 
-HIV-Tuberculosis datasets
-~~~~~~~~~~~~~~~~~~~~~~~~~
+HIV-Tuberculosis databases
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The following datasets contain only the tuberculosis final diagnosis (0 or 1)
+The following databases contain only the tuberculosis final diagnosis (0 or 1)
 and come from HIV infected patients. 10 folds (for cross-validation) randomly
-generated are available for these datasets.
+generated are available for these databases.
 
-Please contact the authors of these datasets to have access to the data.
+Please contact the authors of these databases to have access to the data.
 
 .. list-table::
 
-   * - Dataset
+   * - Database
     - Reference
     - H x W
     - Samples
diff --git a/doc/usage/aggregpred.rst b/doc/usage/aggregpred.rst
deleted file mode 100644
index 661750bdfcdd541231833745cf99888c167c5247..0000000000000000000000000000000000000000
--- a/doc/usage/aggregpred.rst
+++ /dev/null
@@ -1,24 +0,0 @@
-.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
-..
-.. SPDX-License-Identifier: GPL-3.0-or-later
-
-.. _mednet.usage.aggregpred:
-
-=======================================================
- Aggregate multiple prediction files into a single one
-=======================================================
-
-This guide explains how to aggregate multiple prediction files into a single
-one. It can be used when doing cross-validation to aggregate the predictions of
-k different models before evaluating the aggregated predictions. We input
-multiple prediction files (CSV files) and output a single one.
-
-Use the sub-command :ref:`aggregpred <mednet.cli>` aggregate your prediction
-files together:
-
-.. code:: sh
-
-   mednet aggregpred -vv path/to/fold0/predictions.csv path/to/fold1/predictions.csv --output-folder=aggregpred
-
-
-.. include:: ../links.rst
diff --git a/doc/usage/evaluation.rst b/doc/usage/evaluation.rst
index d490f991c105990d81f4d29237ac2fbc49cac941..cee0eb457cea665f31f5e9f1ff517482c982d376 100644
--- a/doc/usage/evaluation.rst
+++ b/doc/usage/evaluation.rst
@@ -8,7 +8,7 @@
  Inference and Evaluation
 ==========================
 
-This guides explains how to run inference or a complete evaluation using
+This guide explains how to run inference or a complete evaluation using
 command-line tools. Inference produces probability of TB presence for input
 images, while evaluation will analyze such output against existing annotations
 and produce performance figures.
@@ -17,61 +17,42 @@ and produce performance figures.
 Inference
 ---------
 
-In inference (or prediction) mode, we input data, the trained model, and output
-a CSV file containing the prediction outputs for every input image.
+In inference (or prediction) mode, we input a model, a dataset, and a model checkpoint generated during training, and output
+a JSON file containing the prediction outputs for every input image.
 
-To run inference, use the sub-command :ref:`predict <mednet.cli>` to run
-prediction on an existing dataset:
-
-.. code:: sh
-
-   mednet predict -vv <model> -w <path/to/model.pth> <dataset>
+To run inference, use the sub-command :ref:`predict <mednet.cli>`.
 
+Examples
+========
 
-Replace ``<model>`` and ``<dataset>`` by the appropriate :ref:`configuration
-files <mednet.config>`. Replace ``<path/to/model.pth>`` to a path leading to
-the pre-trained model.
+To run inference using a trained Pasa CNN on the Montgomery dataset:
 
-.. tip::
+.. code:: sh
 
-   An option to generate grad-CAMs is available for the :py:mod:`DensenetRS
-   <mednet.config.models.densenet_rs>` model. To activate it, use the
-   ``--grad-cams`` argument.
+   mednet predict -vv pasa montgomery --weight=<path/to/model.ckpt> --output=<results/folder/predictions.json>
 
-.. tip::
 
-   An option to generate a relevance analysis plot is available. To activate
-   it, use the ``--relevance-analysis`` argument.
+Replace ``<path/to/model.ckpt>`` with a path leading to the pre-trained model.
 
 
 Evaluation
 ----------
 
-In evaluation, we input a dataset and predictions to generate performance
-summaries that help analysis of a trained model. Evaluation is done using the
-:ref:`evaluate command <mednet.cli>` followed by the model and the annotated
-dataset configuration, and the path to the pretrained weights via the
-``--weight`` argument.
+In evaluation, we input predictions to generate performance summaries that help in the analysis of a trained model.
+The generated files are a PDF containing various plots and a table of metrics for each dataset split.
+Evaluation is done using the :ref:`evaluate command <mednet.cli>` followed by the JSON file generated during
+the inference step and a threshold. Use ``mednet evaluate --help`` for more information.
 
-E.g. run evaluation on predictions from the Montgomery set, do the following:
-
-.. code:: sh
-
-   mednet evaluate -vv montgomery -p /predictions/folder -o /eval/results/folder
-
-
-Comparing Systems
------------------
+Examples
+========
 
-To compare multiple systems together and generate combined plots and tables,
-use the :ref:`compare command <mednet.cli>`. Use ``--help`` for a quick
-guide.
+To run evaluation on predictions generated in the inference step, using an optimal threshold computed from the validation set, do the following:
 
 .. code:: sh
 
-   mednet compare -vv A A/metrics.csv B B/metrics.csv --output-figure=plot.pdf --output-table=table.txt --threshold=0.5
+   mednet evaluate -vv --predictions=<path/to/predictions.json> --output-folder=<results/folder> --threshold=validation
 
 
 .. include:: ../links.rst
diff --git a/doc/usage/index.rst b/doc/usage/index.rst
index 05dd69181eb1aca0dd8b538b844ea3efd9367870..df365264fe67993c78b484976f52d0e883740e6a 100644
--- a/doc/usage/index.rst
+++ b/doc/usage/index.rst
@@ -16,6 +16,7 @@ tuberculosis detection with support for the following activities.
 
 .. _mednet.usage.direct-detection:
 
+
 Direct detection
 ----------------
 
@@ -24,33 +25,31 @@ Direct detection
   automatically, via error back propagation. The objective of this phase is to
   produce a CNN model.
 * Inference (prediction): The CNN is used to generate TB predictions.
-* Evaluation: Predications are used to evaluate CNN performance against
+* Evaluation: Predictions are used to evaluate CNN performance against
   provided annotations, and to generate measure files and score tables. Optimal
-  thresholds are also calculated.
-* Comparison: Use predictions results to compare performance of multiple
-  systems.
+  thresholds can also be calculated.
 
-.. _mednet.usage.indirect-detection:
+.. \_mednet.usage.indirect-detection:
 
-Indirect detection
-------------------
+.. Indirect detection
+   ------------------
 
-* Training (step 1): Images are fed to a Convolutional Neural Network (CNN),
+.. * Training (step 1): Images are fed to a Convolutional Neural Network (CNN),
   that is trained to detect the presence of radiological signs automatically,
   via error back propagation. The objective of this phase is to produce a CNN
   model.
-* Inference (prediction): The CNN is used to generate radiological signs
-  predictions.
-* Conversion of the radiological signs predictions into a new dataset.
-* Training (step 2): Radiological signs are fed to a shallow network, that is
-  trained to detect the presence of tuberculosis automatically, via error back
-  propagation. The objective of this phase is to produce a shallow model.
-* Inference (prediction): The shallow model is used to generate TB predictions.
-* Evaluation: Predications are used to evaluate CNN performance against
-  provided annotations, and to generate measure files and score tables.
-* Comparison: Use predictions results to compare performance of multiple
-  systems.
+   * Inference (prediction): The CNN is used to generate radiological signs
+     predictions.
+   * Conversion of the radiological signs predictions into a new dataset.
+   * Training (step 2): Radiological signs are fed to a shallow network, that is
+     trained to detect the presence of tuberculosis automatically, via error back
+     propagation. The objective of this phase is to produce a shallow model.
+   * Inference (prediction): The shallow model is used to generate TB predictions.
+   * Evaluation: Predictions are used to evaluate CNN performance against
+     provided annotations, and to generate measure files and score tables.
+   * Comparison: Use prediction results to compare performance of multiple
+     systems.
 
 We provide :ref:`command-line interfaces (CLI) <mednet.cli>` that implement
 each of the phases above. This interface is configurable using :ref:`clapper's
@@ -63,7 +62,7 @@ to an application.
 
    For reproducibility, we recommend you stick to configuration files when
    parameterizing our CLI. Notice some of the options in the CLI interface
-   (e.g. ``--dataset``) cannot be passed via the actual command-line as it
+   (e.g. ``--datamodule``) cannot be passed via the actual command-line as it
    may require complex Python types that cannot be synthetized in a single
    input parameter.
 
@@ -84,8 +83,6 @@ Commands
 
    training
    evaluation
-   predtojson
-   aggregpred
 
 
 .. include:: ../links.rst
diff --git a/doc/usage/predtojson.rst b/doc/usage/predtojson.rst
deleted file mode 100644
index 30ff645a379f29b09d4898c12899c286929314fe..0000000000000000000000000000000000000000
--- a/doc/usage/predtojson.rst
+++ /dev/null
@@ -1,24 +0,0 @@
-.. Copyright © 2023 Idiap Research Institute <contact@idiap.ch>
-..
-.. SPDX-License-Identifier: GPL-3.0-or-later
-
-.. _mednet.usage.predtojson:
-
-========================================
- Converting predictions to JSON dataset
-========================================
-
-This guide explains how to convert radiological signs predictions from a model
-into a JSON dataset. It can be used to create new versions of TB datasets with
-the predicted radiological signs to be able to use a shallow model. We input
-predictions (CSV files) and output a ``dataset.json`` file.
-
-Use the sub-command :ref:`predtojson <mednet.cli>` to create your JSON dataset
-file:
-
-.. code:: sh
-
-   mednet predtojson -vv train train/predictions.csv test test/predictions.csv --output-folder=pred_to_json
-
-
-.. include:: ../links.rst
diff --git a/doc/usage/training.rst b/doc/usage/training.rst
index 4172732e030e6bdb5b7a9e345d2a5da75bf1022c..9ca5e01d7a9485dce41ac892a0de338ca6995cd5 100644
--- a/doc/usage/training.rst
+++ b/doc/usage/training.rst
@@ -19,7 +19,7 @@ containing more detailed instructions.
 
 .. tip::
 
-   We strongly advice training with a GPU (using ``--device="cuda:0"``).
+   We strongly advise training with a GPU (using ``--device="cuda:0"``).
    Depending on the available GPU memory you might have to adjust your batch
    size (``--batch``).
 
@@ -33,41 +33,35 @@ To train Pasa CNN on the Montgomery dataset:
 
    mednet train -vv pasa montgomery --batch-size=4 --epochs=150
 
-To train DensenetRS CNN on the NIH CXR14 dataset:
-
-.. code:: sh
-
-   mednet train -vv nih_cxr14 densenet_rs --batch-size=8 --epochs=10
-
-
-Logistic regressor or shallow network
--------------------------------------
+.. Logistic regressor or shallow network
+   -------------------------------------
 
-To train a logistic regressor or a shallow network, use the command-line
-interface (CLI) application ``mednet train``, available on your prompt. To use
-this CLI, you must define the input dataset that will be used to train the
-model, as well as the type of model that will be trained.
-You may issue ``mednet train --help`` for a help message containing more
-detailed instructions.
+   To train a logistic regressor or a shallow network, use the command-line
+   interface (CLI) application ``mednet train``, available on your prompt. To use
+   this CLI, you must define the input dataset that will be used to train the
+   model, as well as the type of model that will be trained.
+   You may issue ``mednet train --help`` for a help message containing more
+   detailed instructions.
 
-Examples
-========
+   Examples
+   ========
 
-To train a logistic regressor using predictions from DensenetForRS on the
-Montgomery dataset:
+   To train a logistic regressor using predictions from DensenetForRS on the
+   Montgomery dataset:
 
-.. code:: sh
+   .. code:: sh
 
-   mednet train -vv logistic_regression montgomery_rs --batch-size=4 --epochs=20
+      mednet train -vv logistic_regression montgomery_rs --batch-size=4 --epochs=20
 
-To train an multi-layer perceptron (MLP) using predictions from a densenet
-pre-trained to detect radiological findings (using NIH CXR-14), on the Shenzhen
-dataset:
+   To train a multi-layer perceptron (MLP) using predictions from a densenet
+   pre-trained to detect radiological findings (using NIH CXR-14), on the Shenzhen
+   dataset:
 
-.. code:: sh
+   .. code:: sh
 
-   mednet train -vv mlp shenzhen_rs --batch-size=4 --epochs=20
+      mednet train -vv mlp shenzhen_rs --batch-size=4 --epochs=20
 
 
 .. include:: ../links.rst