Updated documentation

Commit 25d57c52 authored 8 years ago by Manuel Günther
--- a/doc/filelist-guide.rst
+++ b/doc/filelist-guide.rst
@@ -15,20 +15,24 @@ All functions defined in that interface are properly instantiated, as soon as th
 Creating File Lists
 -------------------

-The initial step for using this package is to provide file lists specifying the ``'world'`` (training), ``'dev'`` (development) and ``'eval'`` (evaluation) set to be used by the biometric verification algorithm.
+The initial step for using this package is to provide file lists specifying the ``'world'`` (training; optional), ``'dev'`` (development; required) and ``'eval'`` (evaluation; optional) set to be used by the biometric verification algorithm.
 The summarized complete structure of the list base directory (here denoted as ``basedir``) containing all the files should be like this::

-  basedir -- norm -- train_world.lst
-         |       |-- train_optional_world_1.lst
-         |       |-- train_optional_world_2.lst
+    filelists_directory
+         |-- norm
+               |-- train_world.lst
+               |-- train_optional_world_1.lst
+               |-- train_optional_world_2.lst
         |
-         |-- dev -- for_models.lst
-         |      |-- for_probes.lst
-         |      |-- for_scores.lst
-         |      |-- for_tnorm.lst
-         |      |-- for_znorm.lst
+         |-- dev
+               |-- for_models.lst
+               |-- for_probes.lst
+               |-- for_scores.lst
+               |-- for_tnorm.lst
+               |-- for_znorm.lst
         |
-         |-- eval -- for_models.lst
+         |-- eval
+               |-- for_models.lst
               |-- for_probes.lst
               |-- for_scores.lst
               |-- for_tnorm.lst
@@ -53,7 +57,7 @@ A complete list of possible information is:

 The following list files need to be created:

- **For training**:
+- **For training** (optional):

  * *world file*, with default name ``train_world.lst``, in the default sub-directory ``norm``.
    It is a 2-column file with format:
@@ -62,14 +66,14 @@ The following list files need to be created:

      filename client_id

-  * two (optional) *world files*, with default names ``train_optional_world_1.lst`` and ``train_optional_world_2.lst``, in default sub-directory ``norm``.
+  * two *world files*, with default names ``train_optional_world_1.lst`` and ``train_optional_world_2.lst``, in default sub-directory ``norm``.
    The format is the same as for the world file.
    These files are not needed for most biometric recognition algorithms; they need to be specified only if the algorithm uses them.

 - **For enrollment**:

-  * two *model files* for the development and evaluation set, with default name ``for_models.lst`` in the default sub-directories ``dev`` and ``eval``, respectively.
-    They are 3-column files with format::
+  * one or two *model files* for the development (and evaluation) set, with default name ``for_models.lst`` in the default sub-directories ``dev`` (and ``eval``).
+    They are 3-column files with format:

    .. code-block:: text

@@ -80,45 +84,45 @@ The following list files need to be created:
  There exist two different ways to implement file lists used for scoring.

  * The first (and simpler) variant is to define a file list of probe files, where all probe files will be tested against all models.
-    Hence, you need to specify two *probe files* for the development and evaluation set, with default name ``for_probes.lst`` in the  default sub-directories ``dev`` and ``eval``, respectively.
+    Hence, you need to specify one or two *probe files* for the development (and evaluation) set, with default name ``for_probes.lst`` in the  default sub-directories ``dev`` (and ``eval``).
    They are 2-column files with format:

    .. code-block:: text

      filename client_id

-  * The other option is to specify a detailed list of which probe file should be compared with which client model, i.e., two *score files* for the development and evaluation set, with default name ``for_scores.lst`` in the sub-directories ``dev`` and ``eval``, respectively.
+  * The other option is to specify a detailed list of which probe file should be compared with which client model, i.e., one or two *score files* for the development (and evaluation) set, with default name ``for_scores.lst`` in the sub-directories ``dev`` (and ``eval``).
    These files need to be provided only if the scoring is to be done selectively, meaning by creating a sparse probe/model scoring matrix.
-    They are 4-column files with format::
+    They are 4-column files with format:

    .. code-block:: text

      filename model_id claimed_client_id client_id

- **For ZT score normalization**:
+  .. note:: The verification queries will use either only the probe or only the score files, so only one of them is mandatory.
+            If only one of the two files is available, the scoring technique will be automatically determined.
+            In case both probe and score files are provided, the user should set the parameter ``use_dense_probe_file_list``, which specifies the files to consider, when creating the object of the ``Database`` class.
+
+
+- **For ZT score normalization** (optional):

  Optionally, file lists for ZT score normalization can be added.
-  These are
+  These are:

-  * two *files for t-score normalization* for the development and evaluation set, with default name ``for_tnorm.lst`` in both sub-directories ``dev`` and ``eval``, respectively.
-    They are 3-column files with format::
+  * one or two *files for t-score normalization* for the development (and evaluation) set, with default name ``for_tnorm.lst`` in both sub-directories ``dev`` (and ``eval``).
+    They are 3-column files with format:

    .. code-block:: text

      filename model_id client_id

-  * two *files for z-score normalization* for the development and evaluation set, with default name ``for_znorm.lst`` in both sub-directories ``dev`` and ``eval``, respectively.
-    They are 2-column files with format::
+  * one or two *files for z-score normalization* for the development (and evaluation) set, with default name ``for_znorm.lst`` in both sub-directories ``dev`` (and ``eval``).
+    They are 2-column files with format:

    .. code-block:: text

      filename client_id

-.. note:: The verification queries will use either only the probe or only the score files, so only one of them is mandatory.
-          In case both probe and score files are provided, the user should set the parameter ``use_dense_probe_file_list``, which specifies the files to consider, when creating the object of the ``Database`` class.
-
-.. note:: If the database does not provide an evaluation set, the scoring files can be omitted.
-          Similarly, if the user only defines **for scoring** files and omits the remaining ones, the only valid queries will be scoring-related ones.



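The list formats described in this hunk can be illustrated with a minimal sketch that writes the smallest layout the guide calls required (a ``dev`` set with model and probe lists) plus the optional ``norm/train_world.lst``. The helper names ``write_list`` and ``create_minimal_filelists``, the sample file names, and the client ids are hypothetical and only serve the illustration; the sketch uses the Python standard library only and does not call the package itself.

```python
import tempfile
from pathlib import Path


def write_list(path, rows):
    """Write one space-separated row per line, creating parent directories."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        for row in rows:
            handle.write(" ".join(row) + "\n")
    return path


def create_minimal_filelists(basedir):
    """Create a dev set (required) and a world list (optional) under basedir."""
    basedir = Path(basedir)
    # Model file, 3 columns: filename model_id client_id
    write_list(
        basedir / "dev" / "for_models.lst",
        [("data/c1_s1", "c1", "c1"), ("data/c2_s1", "c2", "c2")],
    )
    # Probe file, 2 columns: filename client_id
    write_list(
        basedir / "dev" / "for_probes.lst",
        [("data/c1_s2", "c1"), ("data/c2_s2", "c2")],
    )
    # World (training) file, 2 columns: filename client_id
    write_list(
        basedir / "norm" / "train_world.lst",
        [("data/c3_s1", "c3"), ("data/c4_s1", "c4")],
    )
    return basedir


# Made-up sample data written to a temporary directory.
basedir = create_minimal_filelists(tempfile.mkdtemp())
```

The resulting directory could then be passed as the first argument to ``FileListBioDatabase``, as in the examples further down in this diff.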
@@ -145,17 +149,24 @@ Alternatively, if you have more protocols, you could do the following:
  >>> db = bob.bio.base.database.FileListBioDatabase('basedir', 'mydb', protocol='protocol')
  >>> db.objects()

+or specify the protocol while querying the database:
+
+.. code-block:: python
+
+  >>> db = bob.bio.base.database.FileListBioDatabase('basedir', 'mydb')
+  >>> db.objects(protocol='protocol')
+
 When a protocol is specified, it is appended to the base directory that contains the file lists.
-If you need to use another protocol, the best option is to create another instance.
-For instance, given two protocols 'P1' and 'P2' (with filelists contained in 'basedir/P1' and 'basedir/P2', respectively), the following would work:
+You can query the database with ``another`` protocol, simply as:

 .. code-block:: python

-  >>> db1 = bob.bio.base.database.FileListBioDatabase('basedir', 'mydb', protocol='P1')
-  >>> db2 = bob.bio.base.database.FileListBioDatabase('basedir', 'mydb', protocol='P2')
-  >>> db1.objects() # Get the objects for the protocol P1
-  >>> db2.objects() # Get the objects for the protocol P2
+  >>> db = bob.bio.base.database.FileListBioDatabase('basedir', 'mydb')
+  >>> db.objects(protocol='protocol')
+  >>> db.objects(protocol='another')
+
+and you retrieve the files stored in ``basedir/protocol`` and ``basedir/another``, respectively.

-Note that if you use several protocols as explained above, the scoring part should be defined in the same way for all the protocols, either by using ``for_probes.lst`` or ``for_scores.lst``.
-This means that at the time of the database instantiation, it will be determined (or specified using the ``use_dense_probe_file_list`` optional argument), whether the protocols should use the content of ``for_probes.lst`` or ``for_scores.lst``.
-In particular, it is not possible to use a mixture of those for different protocols, once the database object has been created.
+.. note::
+   If you use several protocols as explained above, the ``use_dense_probe_file_list`` parameter is global for all protocols.
+   In case you have ``for_scores.lst`` in one and ``for_probes.lst`` in another protocol, it will automatically switch between the scoring strategies -- as long as you leave ``use_dense_probe_file_list=None``.
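The per-protocol switching described in the note can be pictured with the following sketch. It is not the package's implementation: the function ``scoring_list`` is hypothetical, it assumes that ``use_dense_probe_file_list=True`` selects ``for_probes.lst`` and ``False`` selects ``for_scores.lst``, and raising an error when the choice is ambiguous is a choice made for this sketch only.

```python
import tempfile
from pathlib import Path


def scoring_list(basedir, protocol, group="dev", use_dense_probe_file_list=None):
    """Pick the scoring list for one protocol (illustrative, not the real code)."""
    directory = Path(basedir) / protocol / group
    probes = directory / "for_probes.lst"
    scores = directory / "for_scores.lst"
    # An explicit setting applies to every protocol alike (assumed mapping).
    if use_dense_probe_file_list is True:
        return probes
    if use_dense_probe_file_list is False:
        return scores
    # With None, decide per protocol from whichever file is present.
    existing = [path for path in (probes, scores) if path.exists()]
    if len(existing) != 1:
        raise ValueError(
            "cannot choose automatically in %s; set use_dense_probe_file_list"
            % directory
        )
    return existing[0]


# Two protocols, each providing a different kind of scoring list.
basedir = Path(tempfile.mkdtemp())
for protocol, name in (("protocol", "for_probes.lst"), ("another", "for_scores.lst")):
    group_dir = basedir / protocol / "dev"
    group_dir.mkdir(parents=True)
    (group_dir / name).touch()

chosen = {p: scoring_list(basedir, p).name for p in ("protocol", "another")}
```

Under these assumptions, leaving the parameter at ``None`` lets each protocol use whichever list it provides, while an explicit value forces the same list type for all protocols.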