Baseline-style python configuration

Created by: siebenkopf

Indeed, I was thinking about a way to have a single configuration file that contains all options. However, I have not implemented this so far, as it is already relatively easy to use a single configuration file.

What you can currently do is:

have one configuration file, say experiment.py (similar to what you have in 1.), where you specify the database, the preprocessor, the extractor, the algorithm and possibly the grid (using exactly these keywords). The protocol can be specified as part of the database, see: https://www.idiap.ch/software/bob/docs/latest/bioidiap/bob.bio.base/master/implemented.html#bob.bio.base.database.Database
call ./bin/verify.py --database experiment.py --preprocessor experiment.py --extractor experiment.py --algorithm experiment.py --grid experiment.py

What you currently cannot do:

have a single option for the configuration file (this might be easy to implement, e.g., using a --configuration-file flag, which can be overridden by any other option such as --algorithm)
select the groups in the configuration file (this is only possible on the command line, but it might be possible to add that to the database configuration)
assign steps to be skipped (only possible as a command line option; I don't think that this should be part of the configuration anyway)
have a complex configuration file registered as a resource (I don't know how difficult that would be to implement).
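
To illustrate the first point, a --configuration-file flag that any other option can override might be sketched as follows. This is not the bob.bio.base implementation, only a minimal sketch using the standard library; the flag name follows the suggestion above, and parse_options and KEYWORDS are hypothetical names chosen for this example.

```python
import argparse
import runpy

# The keywords named in this thread; any further options are left out here.
KEYWORDS = ("database", "preprocessor", "extractor", "algorithm", "grid")


def parse_options(argv):
    """Parse argv, filling unset options from an optional configuration file."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag, named after the suggestion in the list above.
    parser.add_argument("--configuration-file")
    for key in KEYWORDS:
        parser.add_argument("--" + key)
    args = parser.parse_args(argv)

    # The configuration file is plain Python; run_path executes it and
    # returns its top-level variables as a dictionary.
    config = {}
    if args.configuration_file:
        config = runpy.run_path(args.configuration_file)

    options = {}
    for key in KEYWORDS:
        cli_value = getattr(args, key)
        # A value given on the command line wins over the one in the file.
        options[key] = cli_value if cli_value is not None else config.get(key)
    return options
```

Called with ["--configuration-file", "experiment.py", "--algorithm", "..."], this would take the algorithm from the command line and everything else from experiment.py; keywords given in neither place come back as None.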

The second way you pointed out is also possible, i.e., create a command line that will be executed as an external process. This is in fact the way that I have designed experiments for a paper so far, e.g., as documented here http://pythonhosted.org/bob.chapter.FRICE/experiments.html#running-the-experiments (I don't have access to the corresponding GitLab project any more, so I cannot link you to the actual source code). An older version of the same approach can be found here: https://gitlab.idiap.ch/biometric/xfacereclib-paper-iet2014/blob/master/xfacereclib/paper/IET2014/execute.py#L84
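
A minimal sketch of this second approach, assuming one configuration file per experiment and the ./bin/verify.py call shown earlier. It is not taken from the linked execute.py; build_command, run_experiments and the results/ directory layout are hypothetical. With dry_run=True the commands are only assembled, not executed.

```python
import subprocess

# Options that all point to the same configuration file, as in the call above.
CONFIG_OPTIONS = ("--database", "--preprocessor", "--extractor", "--algorithm", "--grid")


def build_command(config_file, result_directory):
    """Return the argument list for one ./bin/verify.py run."""
    command = ["./bin/verify.py"]
    for option in CONFIG_OPTIONS:
        command += [option, config_file]
    command += ["--result-directory", result_directory]
    return command


def run_experiments(experiments, dry_run=True):
    """Build (and optionally run) one command per experiment.

    experiments maps an experiment name to its configuration file.
    """
    commands = []
    for name, config_file in experiments.items():
        # Hypothetical layout: one result directory per experiment name.
        command = build_command(config_file, "results/" + name)
        commands.append(command)
        if not dry_run:
            # Runs verify.py as an external process; raises if it fails.
            subprocess.check_call(command)
    return commands
```

Keeping the command as a list rather than a single string avoids shell quoting problems, and the returned list can be logged so that each result directory can be traced back to the exact call that produced it.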

I am not sure which of the two approaches works better. I am actually in favor of your first solution when there is only a single experiment to be run, while I prefer the second solution when there are several experiments in the paper. Also, you can create custom plots using the second solution.

Created by: siebenkopf

Actually, there is even a third solution (which is based on the second one), i.e., to use the ./bin/grid_search.py script to run a series of experiments. I have also used this in the book chapter, see: http://pythonhosted.org/bob.chapter.FRICE/experiments.html#running-the-experiments

Created by: siebenkopf

I have tried to implement your ideas (in the config_file branch), and it seems to be easily possible to have a single configuration file: https://github.com/bioidiap/bob.bio.base/blob/config_file/bob/bio/base/tools/command_line.py#L50 including all options that you would like to have:

database, preprocessor, extractor, algorithm
grid
protocol
groups

In principle, all command line options can be specified in the configuration file, including --result-directory and --skip.... They just need to be added to the list of options here: https://github.com/bioidiap/bob.bio.base/blob/config_file/bob/bio/base/tools/command_line.py#L223

My first test seems to work, but test cases need to be implemented, and more options should be included. Also, currently there is no way to check that everything inside your configuration file will actually be accepted as options -- unused options will currently simply be ignored.
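
One way such a check could look is sketched below. It is an assumption about a possible approach, not what the config_file branch does: unknown_options and KNOWN_OPTIONS are hypothetical names, and the set of known options is only illustrative.

```python
import runpy
import types

# Illustrative set only: the options named in this thread plus 'parallel'.
KNOWN_OPTIONS = {
    "database", "preprocessor", "extractor", "algorithm",
    "grid", "protocol", "groups", "parallel",
}


def unknown_options(config_file, known=KNOWN_OPTIONS):
    """Return the top-level names in config_file that are not known options."""
    namespace = runpy.run_path(config_file)
    unknown = []
    for name, value in namespace.items():
        # Skip Python internals such as __name__ and any imported modules.
        if name.startswith("_") or isinstance(value, types.ModuleType):
            continue
        if name not in known:
            unknown.append(name)
    return sorted(unknown)
```

A misspelled keyword such as extrator would then be reported instead of being silently ignored. Note that helper variables or functions defined in the configuration file would also be reported, so a warning is probably more appropriate than an error.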

Note that you can already use resources in the configuration file, e.g.:

import bob.bio.gmm
database = 'mobio-male'
extractor = 'dct-blocks'
algorithm = bob.bio.gmm.algorithm.GMM(...)
parallel = 4

Anything specified on the command line takes precedence over the configuration file. Each of database, preprocessor, extractor and algorithm has to be specified, either on the command line or in the configuration file. In the case above, e.g., the --preprocessor option needs to be specified on the command line as it is not given in the configuration file.
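
The two rules in the paragraph above (command line first, four mandatory keywords) can be written down as a short sketch. It works on plain dictionaries and is only an illustration of the rules, not the actual code in command_line.py; resolve and REQUIRED are hypothetical names.

```python
# The four keywords that must be given in one place or the other.
REQUIRED = ("database", "preprocessor", "extractor", "algorithm")


def resolve(command_line, config):
    """Merge both sources; command line values override the configuration file."""
    merged = dict(config)
    # Options that were not given on the command line are None and are skipped.
    merged.update({key: value for key, value in command_line.items() if value is not None})

    missing = [key for key in REQUIRED if merged.get(key) is None]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))
    return merged
```

With the configuration shown above and no --preprocessor on the command line, resolve would raise a ValueError naming preprocessor; once a preprocessor is passed on the command line, the merged options are returned.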

Created by: anjos

This is very fast & good work - thanks a lot.

I can look into completing the options so as to have them all.

I have now made the other options work too and implemented some tests. It seems to work from a testing perspective, though more tests are usually better.

@mguenther: Could you please take a look and merge this if it looks OK? We're eager to start using this new feature.

mentioned in merge request !35 (merged)

I think this looks good, thanks for implementing the tests. I have opened merge request !35 (merged). Unfortunately -- due to current issues with conda -- the compilation does not work on all systems. This also means that no test cases are run.

I have restarted the builds. Hopefully they pass now. I have checked out and tested them locally (with Python 2.7), and everything works fine. However, if you trust our code to work in other environments (or if you have tested it with Python 3), you are welcome to accept the merge request.

Oh, I see the error message in the failed builds: OSError: [Errno 28] No space left on device. You should check your machines...

Status changed to closed by commit 4f3c96b9

mentioned in commit 4f3c96b9
