This guide is a step-by-step introduction, with advice, on how to publish your reproducible paper at Idiap.

> This document is a **collaborative** effort between researchers at Idiap.
> It is not meant as an exhaustive resource on software development or LaTeX writing.
> It assumes you're proficient in those matters. If something differs between this guide and your experiment, please modify it accordingly after discussing with your teammates.

## Data

If you collected the experimental data mentioned in your paper yourself, you need to publish your dataset separately on Idiap’s data-distribution portal (DDP).

To publish a new dataset on DDP, open a ticket on Idiap’s help-desk, and the gentle folk in the IT department will tell you exactly what to do. The process usually takes at least a week, so plan well in advance.

Remember to include enough information in the DDP release so that your database *can* be re-used by someone who does not have access to your software package (e.g., somebody doing experiments using R, Julia or Matlab).

## Software Package

Software packages for a paper should live within the [Bob Gitlab Group](https://gitlab.idiap.ch/bob) and should be named `bob.paper.conferenceYEAR_subject` (e.g. `bob.paper.icb2018_veinrec`). The package should include the source code and instructions so that *an independent researcher* can reproduce the specific results in your paper. A paper package **is** a valid Python package and, as such, it may be distributed on [PyPI](https://pypi.python.org).

The code needs to be packaged in such a way that people downloading it can recreate the **exact software environment** in which you ran your experiments. If you do everything properly, someone else, somewhere on the internet, will be able to download your paper package from https://pypi.python.org, extract it, and set it up with the following commands:

```sh
$ wget https://pypi.python.org/bob.paper....
$ unzip bob.paper.icb2018_veinrec.zip
$ cd bob.paper.icb2018_veinrec
# install miniconda from https://conda.io/miniconda.html
# make the "conda" binary available on your PATH
$ conda env create -f environment.yml
$ source activate bob.paper.icb2018_veinrec
$ buildout
```

After running `buildout`, the user should be able to re-run your experiments, provided they have access to the data.

### Package Organization

There are several ways of publishing your code. The structure described here was designed to make code easy to publish, and it has been used for many packages based on Bob.

This is the typical structure of a paper package:

```text
bob/
.gitlab-ci.yml
COPYING
MANIFEST.in
README.rst
buildout.cfg
environment.yml
requirements.txt
setup.py
```

A quick overview of these files:

* `bob/` is the directory in which you put your source code. This includes scripts and whatever support files you may need to produce all resources (tables and figures) in your paper. The files should be organized in subdirectories [matching your package name](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/pure_python.html#anatomy-of-a-package).
* `.gitlab-ci.yml` controls the CI instructions for testing your paper installation and whatever else you deem necessary.
* `COPYING` is the license of your package. Normally, we set this to GPLv3 as per Idiap advice. Just copy [this file](http://www.gnu.org/licenses/gpl.txt) and name it `COPYING` at the root of your package.
* `MANIFEST.in` should list the files that you'd like to ship with your package.
* `README.rst` contains basic information about your paper, including citations users may need in case they decide to use your publication in their own work. It should also include installation instructions for the package and, eventually, information on how to re-run your code **and** produce the results in your paper.
* `buildout.cfg` contains the basic recipe to create a working environment using your paper package.
* `environment.yml` contains the precise list of conda packages required to re-build, from scratch, the work environment in which you know the paper will successfully run **and** produce the same results you published.
* `requirements.txt` contains the **direct** dependencies of your package (everything you `import` in **your** code). You don't need to include *indirect* dependencies here.
* `setup.py` contains the Python packaging instructions. It reads `requirements.txt` and defines the package name and how to install it.

More complex packaging *may* be required in special cases. For those, please refer to our complete [Bob extension guide](https://www.idiap.ch/software/bob/docs/bob/bob.extension/stable/index.html).

Up-to-date templates for some of the above files may be found in [bob.admin](https://gitlab.idiap.ch/bob/bob.admin/tree/master/templates).

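
As a rough illustration of how `setup.py` can read `requirements.txt`, here is a minimal sketch that uses only the standard library. The helper names `load_requirements` and `setup_keywords` and the version string are assumptions made for this example; the templates in bob.admin may handle this differently.

```python
from pathlib import Path


def load_requirements(path="requirements.txt"):
    """Return the requirement lines of *path*, skipping blanks and comments."""
    requirements = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements


def setup_keywords(requirements_file="requirements.txt"):
    """Collect the keyword arguments a setup.py would pass to setuptools."""
    return {
        "name": "bob.paper.icb2018_veinrec",  # example name used in this guide
        "version": "1.0.0",  # hypothetical version number
        "install_requires": load_requirements(requirements_file),
    }
```

In an actual `setup.py` you would finish with `from setuptools import setup` followed by `setup(**setup_keywords())`, plus whatever package discovery your layout needs.
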
### Checking the README file

You can check the `README.rst` file for warnings and errors like this:

```sh
$ rst2html README.rst > /dev/null
```

This prints any formatting errors the file may contain. Fix these **before** uploading your package to PyPI, or the description there will be unformatted.

### Continuous Integration

Continuous integration tests your package every time you push changes to Gitlab. We advise you to create a `.gitlab-ci.yml` file that reproduces your installation instructions and checks, at least, that the scripts can run. It does not have to be sophisticated. Something along these lines should do the trick:

```yaml
test:
  variables:
    CONDA_ENVS_PATH: "conda-env"
    CONDA_BLD_PATH: "conda-env"
  script:
    - hash -r
    - conda config --set always_yes yes --set changeps1 no
    - conda info -a
    - sed -i "s|https://www.idiap.ch|http://www.idiap.ch|g" environment.yml
    - conda env create -vvv --file environment.yml
    - source activate bob.paper.icb2018_veinrec
    - buildout
    # now test your scripts here by simply calling them, e.g.:
    # - ./bin/table1.py
  cache:
    key: "$CI_BUILD_NAME"
    paths:
      - conda-env/.pkgs/*.tar.bz2
      - conda-env/.pkgs/urls.txt
  image: continuumio/miniconda
  tags:
    - docker
```

The `tags` section of this YAML file is important, as it tells the Gitlab CI infrastructure where to run your tests. Make sure you go to "Settings / CI/CD" of your software package in Gitlab and enable the corresponding runners.

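
To check that the scripts can at least run, a small helper like the one below could be called from the CI job. This is only a sketch: the function name `failing_commands` is an assumption, and the commands you pass in (for instance `["./bin/table1.py", "--help"]`) depend on your own package.

```python
import subprocess


def failing_commands(commands):
    """Run each command and return those that fail to start or exit non-zero."""
    failures = []
    for command in commands:
        try:
            result = subprocess.run(
                command,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The executable is missing or cannot be started.
            failures.append(command)
            continue
        if result.returncode != 0:
            failures.append(command)
    return failures
```

A CI step could call `failing_commands` with the list of scripts for your tables and figures and exit with a non-zero status if the returned list is not empty.
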
### Organizing the contents of the `bob` directory

[tbd]

### Creating the `environment.yml` file

[tbd]

### Software Disclosure Agreement

You should make your software package public. This normally has to go through a Software Disclosure Agreement between you and Idiap. To kick-start the process, open a help-desk ticket and go on from there. Put your supervisor and all involved partners in CC on that ticket. This process **can take up to a couple of weeks**, as it may involve a software review.

### Publishing to PyPI

Once your software package has stabilized and been tested, you can publish it to PyPI. Before doing so, make sure it is public and **read the section entitled "Software Disclosure Agreement"** above.

We recommend you use [Twine](https://pypi.python.org/pypi/twine) to upload your software package to PyPI. You may pip-install it in your local conda development environment to do so. Once the `twine` binary is in place, just execute the following commands:

```sh
# remember to use Python from your conda env
$ python setup.py sdist --formats zip
$ twine upload dist/*.zip
```

The `twine` command will ask you for a PyPI username and password. You *should* use our special account for this, so we keep track of all published packages. Ask around for details.

Once your package is uploaded to PyPI, you can paste the link from that server into your article. It should look like this: https://pypi.python.org/pypi/bob.paper.isba2018-entropy

Do **not** include the version number in the link you paste into your article, or you won't be able to update the package in case of issues later on.

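
If you copy the link from a release page, it may carry a trailing version. The sketch below trims such a link back to the project page. It assumes the `/pypi/<name>[/<version>]` layout of the example link above, and the function name `unversioned_pypi_link` is made up for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def unversioned_pypi_link(url):
    """Reduce a PyPI link to <scheme>://<host>/pypi/<name>, dropping any version."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) < 2 or segments[0] != "pypi":
        raise ValueError(f"not a /pypi/<name> link: {url!r}")
    return urlunsplit((parts.scheme, parts.netloc, f"/pypi/{segments[1]}", "", ""))
```

For example, a link ending in `/pypi/bob.paper.isba2018-entropy/1.0.0` (a hypothetical version) is reduced to the version-free link shown above.
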
### Example software packages

* https://gitlab.idiap.ch/bob/bob.paper.icml2017
* https://gitlab.idiap.ch/bob/bob.paper.isba2018_entropy

## Paper (LaTeX) Source Code

The source code for your article should be in the [Biometrics Gitlab group](https://gitlab.idiap.ch/biometric/). If you don't have permission to create a repository, ask someone who does. The Gitlab project holding the LaTeX sources should **not** be made public.

LaTeX source code projects should be named `paper.conferenceYEAR.subject`. For example, for the software package `bob.paper.icb2018_veinrec` above, name your LaTeX project `paper.icb2018.veinrec`. The contents of the project should be simple and include a `Makefile` to build the PDF of your paper from the sources.

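
The mapping between the two naming schemes can be sketched as a small function. The name `latex_project_name` is an assumption for this example, and the sketch deliberately rejects package names that lack the `_subject` part.

```python
def latex_project_name(software_package):
    """Map 'bob.paper.<conferenceYEAR>_<subject>' to 'paper.<conferenceYEAR>.<subject>'."""
    prefix = "bob.paper."
    if not software_package.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}")
    # Split on the first underscore: conference and year come before it, subject after.
    conference_year, separator, subject = software_package[len(prefix):].partition("_")
    if not separator or not conference_year or not subject:
        raise ValueError("expected the form bob.paper.conferenceYEAR_subject")
    return f"paper.{conference_year}.{subject}"
```

With the example used in this guide, `latex_project_name("bob.paper.icb2018_veinrec")` returns `paper.icb2018.veinrec`.
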
### Examples

* https://gitlab.idiap.ch/biometric/paper.icml2017.bob
* https://gitlab.idiap.ch/biometric/paper.jmlr14.bob

### Continuous Integration

You can also set up Gitlab CI to test the build of your article at every push. Here is an example YAML file that does the trick:

```yaml
stages:
  - build

.build_template: &build_job
  stage: build
  script:
    - make
  artifacts:
    expire_in: 1 week
    paths:
      - main.pdf

linux:
  <<: *build_job
  tags:
    - docker-build

macosx:
  <<: *build_job
  tags:
    - beat-macosx
```

These CI instructions try to build your paper on both Linux and MacOSX-based installations. They preserve the PDF as a build artifact that you can download and check. The PDF remains available for up to one week after the build ends.

Remember to activate the runners corresponding to the `tags` above on the `Settings / CI/CD` page of your Gitlab project.