This guide offers a step-by-step introduction, along with advice, on how to
publish your reproducible paper at Idiap.
This document is a collaborative effort between researchers at Idiap.
It is not meant as an exhaustive resource on software development or LaTeX
writing, and assumes you are proficient in both. If something in this guide
differs from your experience, please modify it accordingly after discussing
with your teammates.
If you have yourself collected experimental data mentioned in your paper, you
need to publish your dataset separately, on Idiap’s data-distribution portal
(DDP).
To publish a new dataset on DDP, open a ticket on Idiap’s help-desk, and the
gentle folk in the IT dept. will tell you exactly what to do. Just remember
that the process usually takes at least a week, so try to plan things well in
advance.
Remember to include enough information in the DDP release so that your database
can be re-used by someone even if they don't have access to your software
package (e.g., somebody doing experiments using R, Julia or Matlab). This
should include, for example, protocol descriptions and annotations if they are
required to use your dataset.
Software packages for a paper should live within the Bob Gitlab
Group and should be named
bob.paper.conferenceYEAR_subject (e.g. bob.paper.icb2018_veinrec). This
package should include the source code and instructions so that an independent
researcher can reproduce the specific results in your paper. A paper package
is a valid Python package and, as such, it may be distributed on PyPI.
The code needs to be packaged in such a way that people downloading it can
recreate the exact software environment in which you ran your experiments.
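The naming convention described above can be checked mechanically. The snippet below is only a sketch: the regular expression is an assumption derived from the single example name given in this guide (lowercase venue, four-digit year, underscore, subject), and the function name is hypothetical.

```python
import re

# Assumed pattern, inferred from "bob.paper.icb2018_veinrec":
# "bob.paper." + lowercase venue + 4-digit year + "_" + subject.
PAPER_PACKAGE_RE = re.compile(r"^bob\.paper\.([a-z]+)(\d{4})_([a-z0-9_]+)$")


def parse_paper_package_name(name):
    """Return (conference, year, subject) or raise ValueError."""
    match = PAPER_PACKAGE_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid paper package name: {name!r}")
    conference, year, subject = match.groups()
    return conference, int(year), subject


print(parse_paper_package_name("bob.paper.icb2018_veinrec"))
```

A check like this could run in your CI job to catch a misnamed package early. Adjust the pattern if your venue name contains digits or uppercase letters.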
If you do everything properly, someone else sitting somewhere on the
internet will be able to download your paper package from
https://pypi.python.org, unzip it and extract the files, and execute the
following commands:
$ wget https://pypi.python.org/bob.paper....
$ unzip bob.paper.icb2018_veinrec.zip
$ cd bob.paper.icb2018_veinrec
# install miniconda from https://conda.io/miniconda.html
# make the "conda" binary available on your PATH
$ conda env create -f environment.yml
$ source activate bob.paper.icb2018_veinrec
$ buildout
After running buildout, the user should be able to re-run your experiments,
provided they have access to the data.
Update to existing packages
Usually, your experimental framework depends on several (bob) packages that you may want to modify to suit your current needs. As soon as you begin editing existing packages, it is good practice to discuss the changes you plan to make (typically in a merge request). This allows a smooth integration of your code and eases the release of the modified or enhanced package once you are done with it.
When modifying existing packages, you basically have three options to make sure that the functionality you implemented will be available:
Merge all the changes you made and release a new version of the package(s) - this is the preferred way and the one you should aim for.
Check out the specific branch / commit (in your buildout.cfg)
Put the code directly in your paper package.
Note: Explanations in this section assume you are already familiar
with Python packaging. Otherwise, please start from our introductory guide.
There are several ways of publishing your code. Some people have taken the time
to define a specific structure for organizing the code that makes it easy to
publish it. This structure has been used to publish a lot of code based on Bob.
bob/ is the directory in which you put your source code. This includes
scripts and any support files that you may need to generate all
resources (tables and figures) in your paper. The files should be organized
in subdirectories matching your package
.gitlab-ci.yml is the file that controls the CI instructions for testing
your paper installation and whatever else you deem necessary
COPYING is the license of your package. Normally, we set this to
GPLv3 as per Idiap advice. Just copy this
file and name it COPYING at the root
of your package.
MANIFEST.in should list the non-Python files that you'd like to ship with
your package
README.rst contains basic information about your paper, including citations
users may need to refer to in case they decide to use your publication in
their own work. It should also include installation instructions for the
package and, possibly, information on how to re-run your code and
produce the results in your paper
buildout.cfg contains the basic recipe to create a working environment
using your paper package
environment.yml contains the precise list of conda packages required to
re-build, from scratch, the work environment in which you know the paper will
successfully run and produce the same results you published
requirements.txt contains the direct dependencies of your package
(everything you import in your code). You don't need to include here
setup.py contains the Python packaging instructions. It reads
requirements.txt and defines the package name and how to install
it. Read more about it here
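To illustrate how a setup.py might read requirements.txt, here is a minimal sketch. It is not the template used by Bob packages: the helper names load_requirements and build_setup_kwargs and the version string are hypothetical, and only plain one-requirement-per-line files are handled (no inline comments or "-r" includes).

```python
from pathlib import Path


def load_requirements(path="requirements.txt"):
    """Return the requirement specifiers listed in ``path``.

    Blank lines and lines starting with ``#`` are skipped.
    """
    lines = Path(path).read_text().splitlines()
    return [
        line.strip()
        for line in lines
        if line.strip() and not line.strip().startswith("#")
    ]


def build_setup_kwargs(name, version, requirements_file="requirements.txt"):
    """Collect keyword arguments one could pass to setuptools.setup()."""
    return {
        "name": name,
        "version": version,
        "install_requires": load_requirements(requirements_file),
    }


# In an actual setup.py, one would finish with something like:
#   from setuptools import setup, find_packages
#   setup(packages=find_packages(),
#         **build_setup_kwargs("bob.paper.icb2018_veinrec", "1.0.0"))
```

Keeping the dependency list in requirements.txt means it is declared in a single place and setup.py never needs editing when a dependency changes.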
More complex packaging may be required in special cases. For those, please refer to our complete Bob extension guide.
Up-to-date templates for some of the above files may be found in
those when in doubt.
Checking the README file
You can check the README.rst file for warnings and errors like this:
$ rst2html README.rst > /dev/null
This should print any formatting errors you may have. You want to fix
these before uploading your package to PyPI or the description there will
Continuous integration is the ability to test your package every time you
commit something to it (actually, when you push your changes back to gitlab).
We advise you to create a .gitlab-ci.yml file that reproduces your installation
instructions and tries, at least, to check if the scripts can run. It does not
have to be sophisticated, like the ones we have for most Bob packages, just
functional enough to test the basics. Something along these lines should do the
trick:
test:
  variables:
    CONDA_ENVS_PATH: "conda-env"
    CONDA_BLD_PATH: "conda-env"
  script:
    - hash -r
    - conda config --set always_yes yes --set changeps1 no
    - conda info -a
    - sed -i "s|https://www.idiap.ch|http://www.idiap.ch|g" environment.yml
    - conda env create -vvv --file environment.yml
    - source activate bob.paper.icb2018_veinrec
    - buildout
    # now test your scripts here by simply calling them - e.g. ./bin/table1.py
  cache:
    key: "$CI_BUILD_NAME"
    paths:
      - conda-env/.pkgs/*.tar.bz2
      - conda-env/.pkgs/urls.txt
  image: continuumio/miniconda
  tags:
    - docker
The tags section of this YAML file is important as it tells the Gitlab CI
infrastructure where to run your tests. Make sure you go to the "Settings /
CI/CD" of your software package in Gitlab and enable the corresponding runners.
Creating the environment.yml file
In order to ensure that the user of your source code can exactly reproduce
your published experimental results, you want to ensure that they are working
in the same environment. This means that the user should be working with
the same versions of all the Python/Bob packages and package dependencies that
you used when running your experiments. An easy way to achieve this is to
freeze your working environment into an environment.yml file, from which
the user can then re-create the same working environment.
Before we look at how to freeze a working environment, let's first consider how
we would initially create the environment in which we wish to work.
Environment creation is based on conda and can vary depending on which packages
you need. For example, the bob.paper.isba2018_entropy.env environment for
the bob.paper.isba2018_entropy paper package was created by executing the
following command in the terminal:
To work in this environment, you must then navigate to your working directory
and activate the environment. Using the bob.paper.isba2018_entropy paper
package as an example once again, this would be done by executing the following
commands in your terminal:
$ cd bob.paper.isba2018_entropy
$ source activate bob.paper.isba2018_entropy.env
At this point, you are ready to freeze your environment with the following
command:
$ conda env export > environment.yml
Now, open your environment.yml file. If it contains zc.buildout and
setuptools, remove the corresponding version numbers so that, if the version
is upgraded at a later point, the user can still run buildout in their
re-created environment. You can also feel free to remove any packages in
environment.yml that you know for sure are not needed by your paper
package (if you are not sure, it's best not to remove anything). Finally,
remove the "prefix" section of your environment.yml file, since the user of
your package does not need to know the path to your working directory (their
path will be different anyway).
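The manual clean-up described above can also be scripted. The following is a sketch under assumptions: it treats environment.yml as plain text in the layout produced by "conda env export" (one "- name=version=build" entry per line and a top-level "prefix:" line), and the function name clean_environment is hypothetical. It does not remove unneeded packages, since that decision requires your judgement.

```python
import re

# Packages whose version pins should be dropped, as advised above.
UNPINNED = ("zc.buildout", "setuptools")


def clean_environment(text, unpinned=UNPINNED):
    """Drop the ``prefix:`` line and unpin the packages in ``unpinned``."""
    cleaned = []
    for line in text.splitlines():
        if line.startswith("prefix:"):
            continue  # the exporter's local path is useless to other users
        match = re.match(r"^(\s*-\s*)([A-Za-z0-9_.\-]+)=", line)
        if match and match.group(2) in unpinned:
            # keep only the list marker and the package name
            line = match.group(1) + match.group(2)
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


SAMPLE = """name: demo.env
dependencies:
  - numpy=1.14.0=py36_0
  - setuptools=38.5.1=py36_0
prefix: /home/user/conda/envs/demo.env
"""
print(clean_environment(SAMPLE))
```

Review the result by hand before committing it. A text-based approach is used here to avoid depending on a YAML library, at the cost of assuming the exported layout.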
To make sure your frozen environment works as expected, test it on a different
computer as follows, replacing bob.paper.isba2018_entropy with your package
name and bob.paper.isba2018_entropy.env with the name of your
environment:
$ git clone https://gitlab.idiap.ch/bob/bob.paper.isba2018_entropy  # download package from GitLab
$ cd bob.paper.isba2018_entropy  # navigate to your working directory
$ conda env create -f environment.yml  # create the working environment
$ source activate bob.paper.isba2018_entropy.env  # activate the created environment
$ buildout  # generate the scripts necessary to run your experiments
$ ./bin/verify.py vera-wld  # run your experiments
When you run your experiments in the created environment, your results should
be the same as those you originally obtained.
Alternatively, you could simply test that your environment has been correctly
created by incorporating the creation commands into your .gitlab-ci.yml file
(see the "Continuous Integration" section, above). Once you have created and
edited your environment.yml as explained, commit the changes to Git and push
to your project repository on GitLab. If the pipeline for this commit
succeeds, then your environment creation works as expected.
And that's it! All you need to do now is to include environment.yml in your
MANIFEST.in file to make sure that your environment file is packaged along
with your source code when creating a PyPI package. Note that it is also a
good idea to ensure that your environment creation and experiments work as
expected when downloading your paper package from PyPI as opposed to cloning it
from GitLab.
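A quick way to confirm that MANIFEST.in really lists environment.yml is sketched below. It is deliberately simplistic and rests on an assumption: only plain "include" lines are inspected, so other MANIFEST.in commands such as recursive-include or global-include are ignored. The function name manifest_includes is hypothetical.

```python
from fnmatch import fnmatch


def manifest_includes(manifest_text, filename):
    """Return True if an ``include`` line of MANIFEST.in matches ``filename``."""
    for line in manifest_text.splitlines():
        parts = line.split()
        # an include line looks like: include <pattern> [<pattern> ...]
        if len(parts) >= 2 and parts[0] == "include":
            if any(fnmatch(filename, pattern) for pattern in parts[1:]):
                return True
    return False


print(manifest_includes("include README.rst environment.yml\n", "environment.yml"))
```

In practice you would pass the contents of your own MANIFEST.in, for example read with open("MANIFEST.in").read().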
Software Disclosure Agreement
You should make your software package public. This normally has to go through a
Software Disclosure Agreement between you and Idiap. To kick-start the
process, open a help-desk ticket and go on from there. Include your supervisor
in CC on that ticket, along with all involved partners. This process can
take up to a couple of weeks to go through, as it may involve a software
pip-licenses or similar software
comes in handy when you are filling in the software disclosure form.
Publishing to PyPI
After your software package has stabilized and been tested to work, you can
publish it to PyPI. Before doing so, make sure it is public and read the section
entitled "Software Disclosure Agreement" above.
We recommend you use Twine to upload your
software package to PyPI. You may pip-install it in your local
conda development environment to do so. Once the twine binary is in place,
just execute the following commands:
# remember to use Python from your conda env
$ python setup.py sdist --formats zip
$ twine upload dist/*.zip
The twine command will require you to enter a username and password for PyPI
uploading. You should use our special account for this, so we keep track of
all published packages. Ask people around for information.
The source code for your article should be in the Biometrics Gitlab
group. If you don't have permissions to
create a repository, ask someone who does. The Gitlab project for a paper
package should not be made public.
LaTeX source code projects should be named paper.conferenceYEAR.subject. For
example, for the software project above, name your LaTeX source code project
paper.icb2018.veinrec. The contents of your package should be simple and
include a Makefile to build the PDF of your paper from the sources.
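The relation between the two project names can be expressed as a small helper. This sketch assumes the mapping is purely mechanical (drop the "bob." prefix and turn the first underscore into a dot), which is inferred from the single example in this guide. The function name latex_project_name is hypothetical.

```python
def latex_project_name(software_package):
    """Map e.g. 'bob.paper.icb2018_veinrec' to 'paper.icb2018.veinrec'."""
    prefix = "bob.paper."
    if not software_package.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}")
    # split "conferenceYEAR_subject" at the first underscore only
    venue_year, sep, subject = software_package[len(prefix):].partition("_")
    if not (venue_year and sep and subject):
        raise ValueError(f"cannot split {software_package!r} into venue and subject")
    return f"paper.{venue_year}.{subject}"


print(latex_project_name("bob.paper.icb2018_veinrec"))  # prints paper.icb2018.veinrec
```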
These CI instructions will try to build your paper on both Linux and
MacOSX-based installations. They will preserve the PDF as a build artifact you
can download and check. The PDF will be available for up to one week after the
build ends, which is a nice plus for sharing.
Remember to activate the respective runners corresponding to the tags above
on your Gitlab project Settings / CI/CD page.
Upload to the Idiap publications website
Your paper must be listed on the Idiap publications
portal. You want to do this so that you can
list these contributions in your annual report later, when you have to write
it. There are two occasions on which you must add an entry to this website:
When you submit your paper to a conference or journal, you should create
an Idiap-Internal-RR (Research Report) that will remain private while
you wait for the acceptance decision on your paper.
If and when your paper is accepted, then you must create another entry
on that website that will become public. In this case, don't choose
Idiap-Internal-RR anymore, but the appropriate entry that will make it
public. Cross-reference the internal research report on that entry.
You may add the URL to your paper software-package on PyPI as an entry "note"
on the article you're creating on the Idiap website. Remember to add all
projects from which your work has received grants at the appropriate form
Act carefully on this website when uploading your contributions. Public
entries cannot be easily undone as the system synchronizes automatically with
other publication portals in Switzerland.
Once you have a link on the Idiap publications website, share this link with
your supervisor and other parties involved.