
André Anjos · 376f4eee
--- a/Publishing-Reproducible-Papers-at-Idiap.md
+++ b/Publishing-Reproducible-Papers-at-Idiap.md
@@ -2,24 +2,46 @@ This guide is a step-by-step introduction and advice on how to publish your
 reproducible paper at Idiap.

 > This document is a **collaborative** effort between researchers at Idiap.
-> It is not meant as an exhaustive resource on software development or LaTeX writing.
-> It assumes you're proficient in those matters. If something differs between this guide and your experience, please modify it accordingly after discussing with your team mates.
+> It is not meant as an exhaustive resource on software development or LaTeX
+> writing. It assumes you're proficient in those matters. If something differs
+> between this guide and your experience, please modify it accordingly after
+> discussing with your team mates.


 ## Data

-If you have collected experimental data mentioned on your paper yourself, you need to publish your dataset separately, on Idiap’s data-distribution portal (DDP).
+If you collected the experimental data mentioned in your paper yourself, you
+need to publish your dataset separately, on Idiap’s data-distribution portal
+(DDP).

-To publish a new dataset on DDP, open a ticket on Idiap’s help-desk, and the gentle folk in the IT dept. will tell you exactly what to do. Just remember that the process usually takes at least a week, so try to plan things well in advance.
+To publish a new dataset on DDP, open a ticket on Idiap’s help-desk, and the
+gentle folk in the IT dept. will tell you exactly what to do. Just remember
+that the process usually takes at least a week, so try to plan things well in
+advance.

-Remember to include enough information on the DDP release so that your database *can* be re-used for someone even if they don't have access to your software package (e.g., somebody doing experiments using R, Julia or Matlab).
+Remember to include enough information on the DDP release so that your database
+*can* be re-used by someone even if they don't have access to your software
+package (e.g., somebody doing experiments using R, Julia or Matlab). This
+should include, for example, protocol descriptions and annotations if those are
+required to use your dataset.


 ## Software Package

-Software packages for a paper should live within the [Bob Gitlab Group](https://gitlab.idiap.ch/bob) and should be named `bob.paper.conferenceYEAR_subject` (e.g. `bob.paper.icb2018_veinrec`). This package should include the source code and instructions so *an independent researcher* can reproduce the specific results on your paper. A paper package **is** a valid Python package and, as such, it may be distributed on [PyPI](https://pypi.python.org).
-
-The code needs to be packaged in such a way that people downloading it can recreate the **exact software environment** in which you ran your experiments. If you do everything properly, someone else sitting in somewhere around the internet will be able to download your paper-package from https://pypi.python.org, unzip it and extract the files, and execute the following commands:
+Software packages for a paper should live within the [Bob Gitlab
+Group](https://gitlab.idiap.ch/bob) and should be named
+`bob.paper.conferenceYEAR_subject` (e.g. `bob.paper.icb2018_veinrec`). This
+package should include the source code and instructions so *an independent
+researcher* can reproduce the specific results on your paper. A paper package
+**is** a valid Python package and, as such, it may be distributed on
+[PyPI](https://pypi.python.org).
+
+The code needs to be packaged in such a way that people downloading it can
+recreate the **exact software environment** in which you ran your experiments.
+If you do everything properly, someone else, anywhere on the internet, will be
+able to download your paper-package from https://pypi.python.org, unpack it,
+and execute the following commands:

 ```sh
 $ wget https://pypi.python.org/bob.paper....
@@ -32,13 +54,20 @@ $ source activate bob.paper.icb2018_veinrec
 $ buildout
 ```

-After doing the buildout, the user should be able to re-run your experiments, provided he/she has access to the data.
+After doing the buildout, the user should be able to re-run your experiments,
+provided they have access to the data.
+

 ### Package Organization

-There are several ways of publishing your code. Some people have taken the time to define a specific structure for organizing the code that makes it easy to publish it. This structure has been used to publish a lot of code based on Bob.
+> Note: Explanations in this section assume you are already familiar with
+> Python packaging. Otherwise, please start from our [introductory guide](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html).

-This is the typical structure of a paper package:
+There are several ways of publishing your code. Some people have taken the time
+to define a specific structure for organizing the code that makes it easy to
+publish it. This structure has been used to publish a lot of code based on Bob.
+
+This is the *typical* structure of a paper package:

 ```text
 bob/
@@ -54,19 +83,42 @@ setup.py

 A quick overview of these files:

-* `bob/` is the directory in which you put your source code. This includes scripts and whatever support files that you may need to implement all resources (tables and figures) on your paper. The files should be organized in subdirectories [matching your package name](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html#anatomy-of-a-package).
-* `.gitlab-ci.yml` is the file that controls the CI instructions for testing your paper installation and whatever else you deem necessary
-* `COPYING` this is the license of your package. Normally, we set this to be GPLv3 as per Idiap advice. Just copy [this file](http://www.gnu.org/licenses/gpl.txt) and name it `COPYING` on the root of your package. 
-* `MANIFEST.in` should list files that you'd like to ship with your package
-* `README.rst` contains basic information about your paper including citations users may need to refer to in case they decide to use your publication on their own work. It should also include installation instructions for the package and, eventually, information on how to re-run your code **and** produce the results on your paper
-* `buildout.cfg` contains the basic recipe to create a working environment using your paper package
-* `environment.yml` contains the precise list of conda packages required to re-build, from scratch, the work environment in which you know the paper will successfuly run **and** produce the same results you published
-* `requirements.txt` contains the **direct** dependencies of your package (everything you `import` in **your** code). You don't need to include here *indirect* dependencies
-* `setup.py` corresponds to the Python packaging instructions. It reads `requirements.txt` and defines what this package name is and how to install it. Read more about it [here](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html#setting-up-your-package)
+* `bob/` is the directory in which you put your source code. This includes
+  scripts and whatever support files that you may need to implement all
+  resources (tables and figures) on your paper. The files should be organized
+  in subdirectories [matching your package
+  name](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html#anatomy-of-a-package).
+* `.gitlab-ci.yml` is the file that controls the CI instructions for testing
+  your paper installation and whatever else you deem necessary
+* `COPYING` is the license of your package. Normally, we set this to be
+  GPLv3 as per Idiap advice. Just copy [this
+  file](http://www.gnu.org/licenses/gpl.txt) and name it `COPYING` at the root
+  of your package.
+* `MANIFEST.in` should list non-Python files that you'd like to ship with
+  your package
+* `README.rst` contains basic information about your paper, including citations
+  users may need to refer to in case they decide to use your publication in
+  their own work. It should also include installation instructions for the
+  package and information on how to re-run your code **and** produce the
+  results in your paper
+* `buildout.cfg` contains the basic recipe to create a working environment
+  using your paper package
+* `environment.yml` contains the precise list of conda packages required to
+  re-build, from scratch, the work environment in which you know the paper will
+  successfully run **and** produce the same results you published
+* `requirements.txt` contains the **direct** dependencies of your package
+  (everything you `import` in **your** code). You don't need to include here
+  *indirect* dependencies
+* `setup.py` corresponds to the Python packaging instructions. It reads
+  `requirements.txt` and defines what this package name is and how to install
+  it. Read more about it [here](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html#setting-up-your-package)
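
As an illustration of how `setup.py` can read `requirements.txt`, here is a minimal sketch. The helper name `load_requirements` and its behaviour (skipping blank lines and comment lines) are assumptions made for this example; the templates in bob.admin may do this differently.

```python
from pathlib import Path


def load_requirements(path="requirements.txt"):
    """Return the direct dependencies listed in a requirements file.

    Hypothetical helper: blank lines and lines starting with ``#`` are
    skipped, everything else is returned as-is (stripped of whitespace).
    """
    requirements = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements


# Possible use inside setup.py (not executed here):
#   setup(name="bob.paper.icb2018_veinrec",
#         install_requires=load_requirements(), ...)
```

Keeping the dependency list in one file means `requirements.txt` remains the single place where direct dependencies are declared.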

 More complex packaging *may* be required in special cases. For those, please refer to our complete [Bob extension guide](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/index.html).

-Up-to-date templates for some of the above files may be found in [bob.admin](https://gitlab.idiap.ch/bob/bob.admin/tree/master/templates)
+Up-to-date templates for some of the above files may be found in
+[bob.admin](https://gitlab.idiap.ch/bob/bob.admin/tree/master/templates). Use
+those when in doubt.
+

 ### Checking the README file

@@ -76,12 +128,20 @@ You can check the `README.rst` file for warnings and errors like this:
 $ rst2html README.rst > /dev/null
 ```

-This should print eventual formatting errors you may have. You want to fix these **before** uploading your package to PyPI or the description there will be unformatted.
+This should print any formatting errors you may have. You want to fix
+these **before** uploading your package to PyPI or the description there will
+be unformatted.


 ### Continuous Integration

-Continuous integration is the ability to test your package every time you commit something to it (actually, when you push your changes back to gitlab). We advise you create a `.gitlab-ci.yml` file that reproduces your installation instructions and tries, at least, to check if the scripts can run. It does not have to be sophisticated. Something along the lines should do the trick:
+Continuous integration is the ability to test your package every time you
+commit something to it (actually, when you push your changes back to gitlab).
+We advise you to create a `.gitlab-ci.yml` file that reproduces your
+installation instructions and tries, at least, to check if the scripts can run.
+It does not have to be as sophisticated as the ones we have for most Bob
+packages, just functional enough to test the basics. Something along these
+lines should do the trick:

 ```yaml
 test:
@@ -107,34 +167,63 @@ test:
    - docker
 ```

-The `tags` section of this YAML file is important as it tells the Gitlab CI infrastructure where to run your tests. Make sure you go to the "Settings / CI/CD" of your software package in Gitlab and enable the corresponding runners.
+The `tags` section of this YAML file is important as it tells the Gitlab CI
+infrastructure where to run your tests. Make sure you go to the "Settings /
+CI/CD" of your software package in Gitlab and enable the corresponding runners.


 ### Creating the `environment.yml` file

-In order to ensure that the user of your source code can **exactly** reproduce your published experimental results, you want to ensure that they are working in the **same environment.**  This means that the user should be working with the same versions of all the Python/Bob packages and package dependencies that you used when running your experiments.  An easy way to achieve this is to **freeze** your working environment into an `environment.yml` file, from which the user can then re-create the same working environment.
-
-Before we look at how to freeze a working environment, let's first consider how we would initially create the environment in which we wish to work.  Environment creation is based on conda and can vary depending on which packages you need.  For example, the `bob.paper.isba2018_entropy.env` environment for the `bob.paper.isba2018_entropy` paper package was created by executing the following command in the terminal:
+In order to ensure that the user of your source code can **exactly** reproduce
+your published experimental results, you want to ensure that they are working
+in the **same environment.**  This means that the user should be working with
+the same versions of all the Python/Bob packages and package dependencies that
+you used when running your experiments.  An easy way to achieve this is to
+**freeze** your working environment into an `environment.yml` file, from which
+the user can then re-create the same working environment.
+
+Before we look at how to freeze a working environment, let's first consider how
+we would initially create the environment in which we wish to work.
+Environment creation is based on conda and can vary depending on which packages
+you need.  For example, the `bob.paper.isba2018_entropy.env` environment for
+the `bob.paper.isba2018_entropy` paper package was created by executing the
+following command in the terminal:

 ```sh
 $ conda create -n bob.paper.isba2018_entropy.env --override-channels -c https://www.idiap.ch/software/bob/conda -c defaults python=2.7 bob-extras=2017.10.22 zc.buildout sphinx coverage nose
 ```
-To work in this environment, you must then navigate to your working directory and activate the environment.  Using the `bob.paper.isba2018_entropy` paper package as an example once again, this would be done by executing the following commands in your terminal:
+
+To work in this environment, you must then navigate to your working directory
+and activate the environment.  Using the `bob.paper.isba2018_entropy` paper
+package as an example once again, this would be done by executing the following
+commands in your terminal:

 ```sh
 $ cd bob.paper.isba2018_entropy
 $ source activate bob.paper.isba2018_entropy.env
 ```

-At this point, you are ready to freeze your environment with the following command:
+At this point, you are ready to freeze your environment with the following
+command:

 ```sh
 $ conda env export > environment.yml
 ```

-Now, open at your `environment.yml` file.  If it contains `zc.buildout`, remove the corresponding version number so that, if the version is upgraded at a later point, the user can still do `buildout` in their re-created environment.  You can also feel free to remove any packages in `environment.yml` that you know **for sure** are not needed by your paper package (if you are not sure, it's best not to remove anything).  Finally, remove the "prefix" section of your `environment.yml` file, since the user of your package does not need to know the path to your working directory (anyway, their path will be different).
-
-To make sure your frozen environment works as expected, test it on a different computer as follows, replacing `bob.paper.isba2018_entropy` with your package name and `bob.paper.isba2018_entropy.env` with the name of your previously-created environment:
+Now, open your `environment.yml` file.  If it contains `zc.buildout` or
+`setuptools`, remove the corresponding version numbers so that, if those
+versions are upgraded at a later point, the user can still do `buildout` in
+their re-created environment. You can also feel free to remove any packages in
+`environment.yml` that you know **for sure** are **not** needed by your paper
+package (if you are not sure, it's best not to remove anything).  Finally,
+remove the "prefix" section of your `environment.yml` file, since the user of
+your package does not need to know the path to your working directory (anyway,
+their path will be different).
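
The manual clean-up described above could also be scripted. The following is only a sketch: the function name `clean_environment` is hypothetical, and it assumes the usual `conda env export` layout, in which each dependency appears as `- name=version=build` and the path as a top-level `prefix:` line. It works on plain text so that no YAML library is needed.

```python
import re

# Packages whose version pins should be dropped (taken from the text above).
UNPINNED = ("zc.buildout", "setuptools")


def clean_environment(text, unpin=UNPINNED):
    """Drop the ``prefix`` entry and unpin the selected packages."""
    cleaned = []
    for line in text.splitlines():
        if line.startswith("prefix:"):
            continue  # machine-specific path, of no use to other users
        match = re.match(r"^(\s*-\s*)([^=\s]+)=", line)
        if match and match.group(2) in unpin:
            line = match.group(1) + match.group(2)  # keep only the name
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


# Possible use (not executed here):
#   with open("environment.yml") as f:
#       text = clean_environment(f.read())
#   with open("environment.yml", "w") as f:
#       f.write(text)
```

Removing packages that are not needed by your paper package is left as a manual step, since only you can judge which ones are safe to drop.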
+
+To make sure your frozen environment works as expected, test it on a different
+computer as follows, replacing `bob.paper.isba2018_entropy` with your package
+name and `bob.paper.isba2018_entropy.env` with the name of your
+previously-created environment:

 ```sh
 $ git clone https://gitlab.idiap.ch/bob/bob.paper.isba2018_entropy  # download package from GitLab
@@ -145,22 +234,44 @@ $ buildout  # generate the scripts necessary to run your experiments
 $ ./bin/verify.py vera-wld  # run your experiments
 ```

-When you run your experiments in the created environment, your results should be the same as those you originally obtained.
+When you run your experiments in the created environment, your results should
+be the same as those you originally obtained.

-Alternatively, you could simply test that your environment has been correctly created by incorporating the creation commands into your `.gitlab-ci.yml` file (see the "Continuous Integration" section, above).  Once you have created and edited your `environment.yml` as explained, commit the changes to Git and push to your project repository on GitLab.  If the pipeline for this commit succeeds, then your environment creation works as expected.
+Alternatively, you could simply test that your environment has been correctly
+created by incorporating the creation commands into your `.gitlab-ci.yml` file
+(see the "Continuous Integration" section, above).  Once you have created and
+edited your `environment.yml` as explained, commit the changes to Git and push
+to your project repository on GitLab.  If the pipeline for this commit
+succeeds, then your environment creation works as expected.
+
+And that's it!  All you need to do now is to include `environment.yml` in your
+`MANIFEST.in` file to make sure that your environment file is packaged along
+with your source code when creating a PyPI package.  Note that it is also a
+good idea to ensure that your environment creation and experiments work as
+expected when downloading your paper package from PyPI as opposed to cloning it
+from GitLab.

-And that's it!  All you need to do now is to include `environment.yml` in your `MANIFEST.in` file to make sure that your environment file is packaged along with your source code when creating a PyPI package.  Note that it is also a good idea to ensure that your environment creation and experiments work as expected when downloading your paper package from PyPI as opposed to cloning it from GitLab.

 ### Software Disclosure Agreement

-You should make your software package public. This normally has to go through a Software Disclosure agreement between you and Idiap. In order to kick-start the process open a help-desk ticket and go on from there. Include your supervisor in CC on that ticket, alongside with all involved partners. This process **can take up to a couple of weeks** to go through, as it may involve a software review.
+You should make your software package public. This normally has to go through a
+Software Disclosure agreement between you and Idiap. To kick-start the
+process, open a help-desk ticket and go on from there. Include your supervisor
+in CC on that ticket, along with all involved partners. This process **can
+take up to a couple of weeks** to go through, as it may involve a software
+review.


 ### Publishing to PyPI

-After your software package is sedimented and tested to work, you can publish it to PyPI. Before doing so, make sure it is public and **read the section entitled "Software Disclosure Agreement"** above.
+After your software package is stable and tested to work, you can publish
+it to PyPI. Before doing so, make sure it is public and **read the section
+entitled "Software Disclosure Agreement"** above.

-We recommend you use [Twine](https://pypi.python.org/pypi/twine) to upload your software package to PyPI. You may pip-install it on your local conda-development environment to do so. Once the `twine` binary is in place, just execute the following commands:
+We recommend you use [Twine](https://pypi.python.org/pypi/twine) to upload your
+software package to PyPI. You may pip-install it on your local
+conda-development environment to do so. Once the `twine` binary is in place,
+just execute the following commands:

 ```sh
 #remember to use Python from your conda env
@@ -168,11 +279,16 @@ $ python setup.py sdist --formats zip
 $ twine upload dist/*.zip
 ```

-The `twine` command will require you enter a username and password for PyPI uploading. You *should* use our special account for this, so we keep track of all published packages. Ask people around for information.
+The `twine` command will ask you to enter a username and password for PyPI
+uploading. You *should* use our special account for this, so we keep track of
+all published packages. Ask people around for information.

-Once your package is uploaded to PyPI, you can paste the link from that server into your article. It should look like this: https://pypi.python.org/pypi/bob.paper.isba2018-entropy
+Once your package is uploaded to PyPI, you can paste the link from that server
+into your article. It should look like this:
+https://pypi.python.org/pypi/bob.paper.isba2018-entropy

-Do **not** include the version number on the link you paste on your article, or you won't be able to update the package in case of issues later on. 
+Do **not** include the version number in the link you paste into your article,
+or you won't be able to update the package in case of issues later on.


 ### Example software packages:
@@ -183,9 +299,16 @@ Do **not** include the version number on the link you paste on your article, or

 ## Paper (LaTeX) Source Code

-The source-code for your article should be in the [Biometrics Gitlab group](https://gitlab.idiap.ch/biometric/). If you don't have permissions to create a repository, ask for someone who does. The Gitlab project for a paper package should **not** be made public.
+The source-code for your article should be in the [Biometrics Gitlab
+group](https://gitlab.idiap.ch/biometric/). If you don't have permissions to
+create a repository, ask someone who does. The Gitlab project for a paper
+package should **not** be made public.
+
+LaTeX source code projects should be named `paper.conferenceYEAR.subject`. For
+example, for the software project mentioned above, name your LaTeX project
+`paper.icb2018.veinrec`. The contents of your package should be simple and
+include a `Makefile` to build the PDF of your paper from the sources.
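
To illustrate how the two naming conventions relate, here is a small sketch that derives the LaTeX project name from the software package name. The function name `latex_project_name` and the exact pattern (lowercase venue followed by a four-digit year, then an underscore and the subject) are assumptions based on the examples in this guide, not an enforced rule.

```python
import re

# Assumed shape of a software package name: bob.paper.<venue><YEAR>_<subject>
SOFTWARE_NAME = re.compile(
    r"^bob\.paper\.(?P<venue>[a-z]+\d{4})_(?P<subject>[a-z0-9_]+)$"
)


def latex_project_name(software_package):
    """Map ``bob.paper.conferenceYEAR_subject`` to ``paper.conferenceYEAR.subject``."""
    match = SOFTWARE_NAME.match(software_package)
    if match is None:
        raise ValueError(
            "unexpected software package name: %r" % software_package
        )
    return "paper.{venue}.{subject}".format(**match.groupdict())
```

A name that does not follow the assumed pattern raises `ValueError`, which can help catch a misnamed project early.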

-LaTeX source code projects should be named `paper.conferenceYEAR.subject`. For example, for the above software project name your LaTeX source code as `paper.icb2018.veinrec`. The contents of your package should be simple and include a `Makefile` to build the PDF of your paper from the sources.

 ### Examples

@@ -194,7 +317,8 @@ LaTeX source code projects should be named `paper.conferenceYEAR.subject`. For e

 ### Continuous Integration

-You can setup Gitlab CI to also test the build of your article at every push. Here is an example YAML file that does the trick:
+You can set up Gitlab CI to also test the build of your article at every push.
+Here is an example YAML file that does the trick:

 ```
 stages:
@@ -220,20 +344,38 @@ macosx:
  - beat-macosx
 ```

-These CI instructions will try to build your paper in both Linux and MacOSX-based installations. It will preserve the PDF as build artifact you can download and check. The PDF will be available for up to one week after the build ends.
+These CI instructions will try to build your paper in both Linux and
+MacOSX-based installations. They will preserve the PDF as a build artifact you
+can download and check. The PDF will be available for up to one week after the
+build ends, which is a nice plus for sharing.

-Remember to activate the respective runners corresponding to the `tags` above on your Gitlab project `Settings / CI/CD` page.
+Remember to activate the respective runners corresponding to the `tags` above
+on your Gitlab project `Settings / CI/CD` page.


 ### Upload to the Idiap publications website

-Your paper **must** be listed on the [Idiap publications portal](https://publications.idiap.ch). You want to do this so that you can list these contributions on your annual report later when you'll have to write it. There are two instances in which input to this website must occur:
-
-1. When you **submit** your paper to a conference or journal, you should create a **Idiap-Internal-RR** (Research Report) that will remain *private* while you wait for your paper acceptance answer.
-2. If and when your paper is *accepted*, then you must create **another** entry on that website that will become public. In this case, don't choose **Idiap-Internal-RR** anymore, but the appropriate entry that will make it **public**. Cross-reference the internal research report on that entry.
-
-You may add the URL to your paper software-package on PyPI as an entry "note" on the article you're creating on the Idiap website. Remember to add **all** projects from which your work has received grants from at the appropriate form entry.
-
-*Carefully* act on this website when uploading your contributions. Public entries cannot be easily undone as the system synchronizes automatically with other publication portals in Switzerland.
-
-Once you have a link on the Idiap publications website, share this link with your supervisor and other parties involved.
\ No newline at end of file
+Your paper **must** be listed on the [Idiap publications
+portal](https://publications.idiap.ch). You want to do this so that you can
+list these contributions on your annual report later when you'll have to write
+it. There are two instances in which input to this website must occur:
+
+1. When you **submit** your paper to a conference or journal, you should create
+   an **Idiap-Internal-RR** (Research Report) that will remain *private* while
+   you wait for your paper acceptance answer.
+2. If and when your paper is *accepted*, then you must create **another** entry
+   on that website that will become public. In this case, don't choose
+   **Idiap-Internal-RR** anymore, but the appropriate entry that will make it
+   **public**. Cross-reference the internal research report on that entry.
+
+You may add the URL to your paper software-package on PyPI as an entry "note"
+on the article you're creating on the Idiap website. Remember to add **all**
+projects from which your work has received grants at the appropriate form
+entry.
+
+Act *carefully* on this website when uploading your contributions. Public
+entries cannot be easily undone as the system synchronizes automatically with
+other publication portals in Switzerland.
+
+Once you have a link on the Idiap publications website, share this link with
+your supervisor and other parties involved.