- !200 Database interface: Implemented a simple filelist database interface for the VanillaBiometrics based on CSV files. The `CSVDatasetDevEval` needs to have the following format:

      my_dataset/
      my_dataset/my_protocol/
      my_dataset/my_protocol/train.csv
      my_dataset/my_protocol/train.csv/dev_enroll.csv
      my_dataset/my_protocol/train.csv/dev_probe.csv
      my_dataset/my_protocol/train.csv/eval_enroll.csv
      my_dataset/my_protocol/train.csv/eval_probe.csv
      ...

  where each CSV file needs to have the following format:

      PATH,SUBJECT
      path_1,subject_1
      path_2,subject_2
      path_i,subject_j

  This format allows the usage of metadata by following the pattern below:

      PATH,SUBJECT,METADATA_1,METADATA_2,METADATA_k
      path_1,subject_1,A,B,C
      path_2,subject_2,A,B,1
      path_i,subject_j,2,3,4

  We can imagine other implementations of this. For instance, `CSVDatasetCrossValidation`, which, given a CSV file, splits the data "on-the-fly" into enrollment, probing and training sets. Or `CSVDatasetWithEyesAnnotation`, which handles annotations for Face Rec pipelines. I still need to implement a mechanism that takes zip files as input to `CSVDatasetDevEval`. That way we can ship databases as simple zip files. ping @ydayer @amohammadi I'll merge this tomorrow. I need this to support the efforts on
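As a rough illustration of the CSV layout described in !200, the sketch below parses one such file with Python's standard `csv` module. The helper name `read_csv_samples` and the returned dictionary layout are hypothetical and are not part of bob.bio.base; only the `PATH` and `SUBJECT` header names come from the description above.

```python
import csv
import io


def read_csv_samples(handle):
    """Parse a PATH,SUBJECT[,METADATA...] file into a list of dicts.

    Hypothetical helper for illustration only; not the bob.bio.base API.
    """
    reader = csv.DictReader(handle)
    samples = []
    for row in reader:
        # PATH and SUBJECT are mandatory; every other column is metadata.
        metadata = {k: v for k, v in row.items() if k not in ("PATH", "SUBJECT")}
        samples.append(
            {"path": row["PATH"], "subject": row["SUBJECT"], "metadata": metadata}
        )
    return samples


# An in-memory stand-in for a file such as dev_enroll.csv.
example = io.StringIO(
    "PATH,SUBJECT,METADATA_1,METADATA_2\n"
    "path_1,subject_1,A,B\n"
    "path_2,subject_2,A,1\n"
)
samples = read_csv_samples(example)
print(samples[1]["metadata"])
```

Treating every column other than `PATH` and `SUBJECT` as metadata lets the same reader handle both the two-column and the extended layout.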
- !201 Correct the chain loading and click implementation: The vanilla biometrics script was not using our click API correctly; this fixes that. Please refer to bob.extension's docs for more info.
- !194 New documentation bob.bio.base: Redoing bob.bio.base documentation. We might rebrand this package to
- !203 Fix Dask shutting down automatically when calling from a script: Defers the shutdown task to the user by putting the execution of Vanilla-Biometrics in its own function. The `bob bio pipelines vanilla-biometrics` command calls this function.
- !180 [dask] Preparing bob.bio.base for dask pipelines: THIS IS A WIP. Working out how `bob.bio.base` should be with dask pipelines. Some things to be done in the MR are:
  - [x] Remove all traces of `verify.py`. Pipelines need to be defined with dask. This includes the removal of: `./baselines`
  - [x] Move the biometrics pipeline from `bob.pipelines` to here
  - [ ] Document the package on: 1-) How to use the resources available (databases, algorithms, preprocessors, etc.), 2-) How to create new databases, 3-) How to create new pipelines (this should be in bob.pipelines)
  - [x] Rethink the file list interface
  - [ ] Review which dependencies to keep. For instance, `bob.math` (consider using scikit-learn).
  - [x] Implement missing pipelines (for instance, score normalization/calibration pipeline)
  - [x] Database paths patched in `~/.bobrc`
  - [ ] Redesign the resource.py as CLI commands
- !204 Remove stackable preprocessors and extractors: Now that we have standardized on scikit-learn transformers, scikit-learn alternatives such as FunctionTransformer, Pipeline, and FeatureUnion should be used
- !202 Dask annotators
- !205 Fix annotators kwargs: The kwargs were not passed correctly in the FailSafe annotator. Removed the shutdown of dask clients in the
- !207 Fixed [typo]
- !206 Created decorators: Added two decorators that check
- !208 Fixing sphinx warnings: It's a WIP. I have 34 warnings to be fixed so far. ping @ydayer @amohammadi
- !209 Putting annotator back to the documentation
- !210 [annotators][FailSafe] improvements
- !212 Improvements on CheckpointWrapper: Related to bob.pipelines!48
- !214 Optimize scoring if `allow_scoring_with_all_biometric_references == False`: ...by caching biometric references. This speeds things up by a lot.
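The idea behind !214, loading each biometric reference once and reusing it for every probe paired with it, can be sketched as follows. This is a minimal illustration under assumptions: `load_reference`, `score`, and the toy distance are hypothetical names and logic, not the actual bob.bio.base implementation.

```python
from functools import lru_cache

load_calls = 0  # counts how often a reference is actually loaded


@lru_cache(maxsize=None)
def load_reference(reference_id):
    """Pretend to load an enrolled model from disk (hypothetical)."""
    global load_calls
    load_calls += 1
    return float(len(reference_id))  # stand-in for a real model


def score(probe, reference_ids):
    """Score one probe against only the references it is paired with."""
    return {rid: -abs(probe - load_reference(rid)) for rid in reference_ids}


# Two probes share the reference "alice": it is loaded only once.
scores_1 = score(4.0, ("alice", "bob"))
scores_2 = score(6.0, ("alice",))
print(load_calls)
```

An unbounded cache is the simplest choice for a sketch; a real pipeline would probably bound it to keep memory usage under control.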
- !211 [legacy databases] delay loading of annotations: Depends on bob.pipelines!47
- !213 [VanillaBiometrics] Make some dask arguments visible in the CLI commands: Sometimes our heuristics that define `n_workers` are not enough to cover the whole range of problems. In this MR I'm adding options to the `vanilla-biometrics` CLI command, so that users can explicitly set the dask bag partition size and the number of workers to start an experiment without having to play around with the API.
- !216 [click][VanillaBiometrics] Fixed click options: Closes #148
- !218 [test_utils] Do not re-download ATNT database: if the data is already inside the low-level db package. Fixes #127
- !219 Remove fiddling with bob.pipelines.Samples's internals
- !217 Add a method to retrieve all the samples of a dataset: Added a functionality to get all the samples at once. For legacy databases, it uses the all_files method. Closes #146
- !220 [scripts][pipelines] allow dask-clients as strings: Some strings like single-threaded, processes, ... are now allowed as options for the dask-client click option. Depends on bob.extension!122
- !221 Added the memory_demanding attribute as part of the legacy API
- !222 Follow-up to "Add a method to retrieve all the samples of a dataset": Closes #149. The `all_samples` methods now correctly handle the 'train' group, returning only the samples of a group if that group is given as parameter. The contents of the 'groups' parameter are now verified with `bob.db.base.check_parameters_for_validity`. Groups of 'legacy database interfaces' different from 'train', 'dev', or 'eval' are now handled correctly.
- !223 Used the DelayedSampleSetCached in the BioAlgorithmCheckpointWrapper: Depends on bob.pipelines!53
- !225 [test.utils] Handle ModuleNotFoundError when bob.db.atnt is not available
- !226 [legacy] Updated subject to reference_id in the legacy DB interface
- !227 Make compare samples work with DelayedSample instead of Sample: this allows us to use
- !230 [dask] Moved dask_get_partition_size from bob.bio.base to bob.pipelines
- !229 Check stateless: Small change where background model samples are not loaded if the main pipeline is stateless. The world set typically is much larger than the dev and eval set, so it is a bit of a waste to load. NB. The make_pipeline utility from scikit-learn generates pipelines that are never stateless, even if all transformers in the sequence are stateless, which makes this change useful.
- !228 moved VALID_DASK_CLIENT_STRINGS to bob.pipelines
- !224 Adapting CSVDevEval to work with our current FileList Structure: This is a WIP, which means Work in Progress. In this MR I structured the files from the proposed CSV interface to be the same as in the current FileList we have. Furthermore, the current `.lst` files can be used in the `CSVDatasetDevEval` with some of the bells and whistles the current LST files have (e.g. usage of external annotations). The goals of the MR are to: 1. Be able to use the current file lists we have 2. Allow the extra functionalities we currently have with the CSVs.
- !231 Renamed CSVDatasetDevEval to CSVDataset: .... it's easier
- !232 Move code
- !233 CSVSampleLoaders as transformers: Depends on bob.pipelines!60 ping @amohammadi @ydayer
- !234 Fixed issues in the ScoreWriter
- !235 [ztnorm][scorewriter] Updates:
  - Used DelayedSampleSetCached in the ZTNormCheckpointWrapper
  - Cleaning up the CSVScoreWriter
- !236 [Fix] Correction of scores path: This fixes an issue introduced in this commit where the path to the output scores file was wrong for non-`.csv` score file formats. ping @tiago.pereira
- !238 Created the function `get_temp_directory`: ... this is useful for legacy algorithms. Related to bob.bio.face#39
- !239 Raised a warning when data is not available to process vanilla-biometrics
- !240 Doc cleanup: Did a pass in Grammarly and corrected the errors.
- !241 Legacy database wrapper was not properly supporting the use case where `model_id<>client_id`: Making the bob.bio.base database interface support samplesets where the reference_id is different from the subject_id. Some database protocols need such support. Furthermore, the legacy database interface was not supporting this properly. ping @mguenther
- !242 [bob.pipelines] sample loaders have moved in bob.pipelines: Depends on bob.pipelines!62
- !243 Add the vulnerability analysis commands: Moved from bob.pad.base
- !245 Score checkpoints are more robust: In this MR I'm changing the way we checkpoint scores. I moved from joblib pickling to pickling with pickle and compressing with gzip myself. This seems more robust with our file system.
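The checkpointing approach mentioned in !245 (plain `pickle` combined with `gzip`) could look roughly like the sketch below. The function names, the file name, and the score tuples are illustrative assumptions, not the code that was merged.

```python
import gzip
import os
import pickle
import tempfile


def save_scores(scores, path):
    """Pickle `scores` and write the bytes through a gzip stream."""
    with gzip.open(path, "wb") as f:
        pickle.dump(scores, f)


def load_scores(path):
    """Inverse of `save_scores`: decompress and unpickle."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical score records: (reference id, probe id, score).
scores = [("ref_1", "probe_1", 0.87), ("ref_1", "probe_2", -0.12)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.pickle.gz")
    save_scores(scores, path)
    restored = load_scores(path)
```

Both functions rely only on the standard library, so a checkpoint written this way can be read back without joblib installed.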
- !183 Remove usage of numpy.testing.decorators: Fixes #132
- !170 Explain how new databases are configured: using bob's global configuration system
- !191 Resolve "`check_existence` flag incorrectly handled in filelistdatabase query": Closes #134
- !198 Fixing and adding features to the scores generation script: I was using the score 'gen' script, but needed some features.
  - Added:
    - A way of specifying how the number of scores is defined (number of subjects and probes, or manually specified);
    - A way to generate different scores for dev and eval;
    - Tests for the gen script.
  - Changed:
    - The way the scores are generated (each probe against each reference model, instead of randomly).
  - Fixed:
    - Duplicate click options (-p);
    - 'positive scores' were generated with a 'negative scores' variable.
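To make the "each probe against each reference model" scheme from !198 concrete, here is a small sketch that generates synthetic scores exhaustively. Everything in it is an assumption for illustration: the function name, the Gaussian means and spread, and the four-field row (claimed identity, real identity, probe label, score) are not taken from the actual `gen` script.

```python
import random


def generate_scores(n_subjects, n_probes_per_subject, seed=0):
    """Yield (claimed_id, real_id, probe_label, score) tuples.

    Every probe is compared against every reference model, so the number
    of scores is n_subjects * n_subjects * n_probes_per_subject.
    """
    rng = random.Random(seed)
    subjects = [f"s{i}" for i in range(n_subjects)]
    for reference in subjects:
        for real in subjects:
            for p in range(n_probes_per_subject):
                genuine = reference == real
                # Genuine comparisons are drawn around +1, impostors around -1.
                score = rng.gauss(1.0 if genuine else -1.0, 0.5)
                yield reference, real, f"{real}_probe{p}", score


rows = list(generate_scores(n_subjects=3, n_probes_per_subject=2))
for claimed, real, label, score in rows[:2]:
    print(f"{claimed} {real} {label} {score:.4f}")
```

Because the comparison grid is exhaustive, the number of genuine and impostor scores follows directly from the number of subjects and probes, which makes the generated files easy to check in tests.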
- !178 Fixed issue with numpy.ndarray.resize py37
- !179 Separate semilogx and TPR options in ROC plots: Depends on bob.measure!98
- !176 Add bob bio annotate-samples command: This command works very similarly to bob bio annotate, except that it works without a db interface. Instead, it requires a list of samples and two functions to do the job.
- Python 3.7 support
- Breaking and significant changes
- Removed the old
- Functionality to load biometric scores are now in
- Added new scripts for plotting and evaluations. Refer to docs.
- Added a new baselines concept. Refer to docs.
- Detailed changes
- !147 Update installation instructions since conda's usage has changed.
- !148 Archive CSU: closes #109
- !146 Add 4-5-col files related functionalities and add `click` commands: In this merge:
  * Add loading functionalities from `bob.measure`
  * Add the following click commands (as substitutes for the old script evaluate.py) using 4- or 5-column score input files:
    * `bob bio metrics`
    * `bob bio roc`
    * `bob bio det`
    * `bob bio epc`
    * `bob bio hist`
    * `bob bio evaluate`: calls all the above commands at once
    * `bob bio cmc`
    * `bob bio dic`
    * `bob bio gen`

  Plots follow ISO standards. The underlying implementation of the mentioned commands uses `bob.measure` base classes. Fixes #108
- !149 Set io-big flag for the demanding grid config: Closes bob.bio.base#110 Anyone cares to review this one? It's harmless.
- !143 Set of click commands for bio base: From #65. Provide commands in bio base:
  - bob bio metrics
  - bob bio roc
  - bob bio evaluate (very similar to evaluate.py)
- !152 Removed unused import imp and solving #83: Closes #83
- !153 Added the protocol argument issue #111: Closes #111
- !154 Fixes in ROC and DET labels
- !157 Fixed bob bio dir x_labels and y_labels: The labels of the DIR plot were incorrect.
- !155 Write parameters in a temporary config file to enable chain loading: Fixes #116
- !150 Exposing the method groups in our FileDatabase API
- !158 Add prefix aliasing for Click commands
- !160 Titles: Allows a list of titles. Fixes #121. Requires bob.measure!67
- !159 Resolve "Documentation does not include a link to the recordings of the IJCB tutorial": Closes #122
- !161 Change --eval option default and Various fixes: fixes #112. Add and clean histo options. See bob.measure!67#note_30951 Requires bob.measure!68
- !163 Reduce repetition between commands: Depends on bob.measure!70
- !162 Removed traces of evaluate.py in the documentation
- !164 Fix test according to changes in nbins option
- !165 Set names for different bio metrics: Bio specific names for metrics when using bob.measure Metrics
- !166 Add a command for multi protocol (N-fold cross validation) analysis: Similar to bob.measure!79
- !167 Various fixes: Requires bob.measure!82 Similar to bob.measure!82 for bio commands
- !168 Documentation changes in bob bio annotate: Depends on bob.extension!86
- !156 Using the proper verify script depending on system: Closes #119
- !151 Created the Baselines Concept
- !169 Change assert to assert_click_runner_result
- Migrate to conda based CI
- Updated docs and tests
- Added allow_missing_files option and added tests
- Removed write_commands function in grid_search (closes #71)
- Added annotator, updated documentation accordingly
- Allow for comment lines in file-lists
- Improves verbosity for preprocessing, extraction and enrollment
- bob.db.base.Database is deprecated.
- Using config file loading mechanism from bob.extension
- Fix debug message (closes #103)
- Fixed the exception that is raised when score file is not found
- Mentions bob.bio.vein on bob.bio.base docs (closes #104)
- Added metadata in preprocessing, feature extraction, and algorithm
- Improve replace directories
- Changed backend in
- Changed backend in
- !89 [ci] Fixed issue with matplotlib
- !95 Removed class client from FileListDatabase
- !94 Implemented the plots for the Detection Identification Rate (open-set identification)
- !93 Corrections in the documentation
- !92 Documented the configuration file input for the
- !90 Implemented a way to use, at the same time, 4 and 5 columns score file in the
- !88 Improved ROC and CMC plots
- !87 Fixed * ZT files are processed even when no ZT processing is wanted
- !81 Dropped dependency on LaTeX
- !86 Implemented five-column score file -* only during concatenation
- !83 Improve suggestion on variable naming convention to match python's standard
`Algorithm.read_probe` method, since this is already solved via
- !82 Add a function to read features with generators
- use super() in class constructors
- re-arranged command line options and more reusable code of command line
- a slightly updated documentation
- Updated code to pass original_directory and original_extension to `bob.db.base.Database`, as now required
- Updated the mailing list and issues links
- Used SVG build badges
- Fixed PLDA test files (standardization of the SVD signal in bob.math#10) (issue #61)
- Moved the code to do fileList datasets to
- Introduced a minimum file size for every element in the toolchain (merge #58)
- Fixed issue with the default resources (issue #53)
- Moved some functionality of
- The implementation of the `annotations` method is now mandatory in
- Fixed issue in the PLDA when it's a linear projection before the training is not set (issue #56)
- Fixed issue with the `verify.py` script (issue #51)
- Fixed issue with the grid options offered by the `verify.py` script (issue #46)
Outcome of the refactoring 2016 (new database interface).
103 commits in total
3eff0cef Increased latest version to 3.0.1b0 [skip ci]
43626a2c Increased stable version to 3.0.0
df87afe2 Merge branch 'patch-doc' into 'master'
4d17eb1a Introduce our mailing list for reporting issues
5d114973 Point to gitlab pages
aa5aec27 Merge branch 'patch-1' into 'master'
0c2eeb95 Make sure default value is not a pointer
a77c013b Made the default for --result-directory and --temp-directory more obvious that it is not the real path, see /scratch/mgunther/test_checkouts/bob.bio.caffe/caffe/bin/caffe 'train' --solver=solver.prototxt -log_dir '.' -snapshot 1e-4/snapshots_iter_50000.solverstate , fixes #43
a4e3bd85 Revert "Use the 'default' keyword on the argparse option"
7c8fc29b Merge branch 'revert-18-master' into 'master'
f4c7d6ca Merge branch 'pre-release' into 'master'
104407b3 Removed the superflouos original_data_list_files from the FileSelector and updated preprocessor accordingly
b88c4e60 Fixing sphinx warnings
5021ffe1 [ci] Update with new ci
7a7b01d3 Merge branch 'groups_vs_group' into 'master'
6f0753ff Merge branch '26-confusing-defaults-documentation-on-bob-bio-base-scripts' into 'master'
927f30ef Use the 'default' keyword on the argparse option
6bd84ca7 Updated documentation of --temp-directory and --result-directory to be less confusing
e8315bee Merge branch 'issue-8-remove-database-configuration' into 'master'
8190e864 Merge branch 'issue-8-remove-database-configuration' into 'master'
e7722c0c Merge branch 'issue-8-remove-database-configuration' of gitlab.idiap.ch:bob/bob.bio.base into issue-8-remove-database-configuration
0df3423e [refactoring2016] Moved from bob.bio.db to bob.bio.base the scripts that test the hight level implementations
4a1731f6 Turned lambda function into named function
554419d6 Merge branch 'issue-8-remove-database-configuration' into 'master'
fafbda49 Merge branch 'issue-8-remove-database-configuration' of gitlab.idiap.ch:bob/bob.bio.base into issue-8-remove-database-configuration
0ef1c256 [refactoring2016] Moved bob.bio.db.BioDatabase to bob.bio.base
ed0617d8 [refactoring2016] Merged 29-preprocessor-does-not-use-the-load-method-of-the-biofile-class with issue-8-remove-database-configuration
f8a50678 [refactoring2016] Updated the annotations function
3030866d [refactoring2016] Updated documentation and buildout
67916c31 [refactoring2016] Moved bob.bio.db.BioDatabase to bob.bio.base
3d6d2148 Using derived class BioFile.load functions as default for read_original_data
fdd1a334 Updated name of load_function to be read_original_data (to be consistent with the name of the function); used BioFile.load instead of bob.db.base.File.load as default.
04ff36c2 Updated dummy databases to work with bob.db.atnt directly and return BioFile or BioFileSet objects
1d88685f First version of proposed handling of original data
10f2493c Revert "use the File.load method if possible."
753d3f93 Revert "Fix for bob.bio.video"
28a640f6 Merge branch 'config_file' into 'master'
453519ea Turned recursion over config's into an iteration
630867cc [resources] Avoid exec() as it has bugs on Python 2.7
9eb1e90c Fix use of read_config_file() at other points
74e20601 [develop.cfg] Add bob.io.image for dev
a0022914 [config_file] Implement multi-config readout
451ca636 Merge branch 'issue-8-remove-database-configuration' of gitlab.idiap.ch:bob/bob.bio.base into issue-8-remove-database-configuration
432b2302 [refactoring2016] Updated the annotations function
ce5b6498 [refactoring2016] Updated documentation and buildout
104010ae [refactoring2016] Moved bob.bio.db.BioDatabase to bob.bio.base
b26bf656 [refactoring2016] Merged 29-preprocessor-does-not-use-the-load-method-of-the-biofile-class with issue-8-remove-database-configuration
9bc7a152 Use string mode to open tempfile for py3 compat
9f5f7e6d Better fix w/o leaving something untested
25b5d50f Fix py3 compatibility issue
8c3b1190 Fixes and tests for the config-file feature
988a7ca3 Make import relative
de870e6c Fix argument parsing destination
ffcb9385 Add more cmdline options
43a4efcb Introduced the --configuration-file command line option
94864906 [refactoring2016] Updated the annotations function
6b55fda6 [refactoring2016] Updated documentation and buildout
7d29b773 [refactoring2016] Moved bob.bio.db.BioDatabase to bob.bio.base
457b89f6 [config_file] Make it an optional argument rather than an option
4f3c96b9 Merge branch 'config_file' into 'master'
bd1a8af8 Use string mode to open tempfile for py3 compat
bfd4ae32 Better fix w/o leaving something untested
3ed8d9bb Fix py3 compatibility issue
cc3e85e6 Using derived class BioFile.load functions as default for read_original_data
14e685d2 Fixes and tests for the config-file feature
16fc352c Make import relative
63d3b73e Updated name of load_function to be read_original_data (to be consistent with the name of the function); used BioFile.load instead of bob.db.base.File.load as default.
4373e7e0 Updated dummy databases to work with bob.db.atnt directly and return BioFile or BioFileSet objects
a7a9a0d3 First version of proposed handling of original data
d69bcf71 Revert "use the File.load method if possible."
ed59a2f8 Revert "Fix for bob.bio.video"
35b38820 Updated test for new score.py script
eee28aa7 Fixed score.py script to use projector in the right way
347bf423 Merge branch '10-create-a-test-to-prevent-some-errors-for-scoring-tools' into 'master'
9351e02a Fix develop.cfg
567d530b Add a test when the algorithm does not do projection
17c7af71 Fix argument parsing destination
bb814339 Add more cmdline options
7c3e4856 Merge branch 'master' into config_file
24bda38f update filesets dummy database
521a44a7 Fix the dummy database
098de3fe Merge branch 'master' into config_file
a84d12d9 Merge branch 'fixed_tests' into 'master'
cf12e92a fix failing tests
b55ecdb9 fix documentation
36b4b243 Fix for bob.bio.video
8203fdf9 [manifest] Ship test requirements with package
bf64ff8f [test] Add requirement to gridtk for testing
6c833675 [many] Remove license headers
175cf005 [preproc] Change dep bob.db.verification.utils -> bob.db.base
108c4f38 refactoring_2016 is now merged.
5aa6664e drop DatabaseBob
11a7e229 use the File.load method if possible.
7c95bb72 Replaced eval() with more appropriate setattr() in resources
96a16667 Introduced the --configuration-file command line option
b9c8beb4 Increased latest version to 2.0.11b0 [skip ci]
e990d550 Revert "Fixed a bug when there is no argument called env"