Wishes for the next major release (1.2.0)
Created by: laurentes
I've decided to create a thread to help us to converge towards a better Bob for the next major release. I doubt we will have the time to deal with everything shortly, but there will be at least a trace of it. Feel free to update this thread with your thoughts/wishes.
To consolidate what is already there
The user guide is already fairly good. I would not say the same about the documentation of the Python bindings that shows up when using Python's help() function. For many functions/classes, this documentation is very limited and unhelpful.
The library is becoming larger. We should have stricter rules when defining class methods that refer to the same concept. For instance, very often, we have a variable related to the input/feature dimensionality. Depending on the class, it might be obtained using 'getDimD()', 'inputDims()', or something else again. I think this is very annoying from a user perspective. The same applies to the API of the trainers.
For many functions, we have two different kinds of Python bindings: one kind which follows the C++ API (e.g. void f(const BA& input, BA& output)), and one which is more 'pythonic' (e.g. BA f(const BA& input)). These two bindings often share the same Python function name. I don't think this is a good strategy, as it sometimes leads to impossible cases when the C++ API has many overloaded functions. I think the function name should reflect this difference. OpenCV's strategy was roughly to use two different namespaces (cv for the C-like functions, and cv2 for the pythonic ones). We could do something slightly different, such as appending something like '_c' or '_cc' to the function name to clearly highlight this fact.
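The suffix idea could look roughly like the sketch below. The names `scale` and `scale_c` are hypothetical, and plain Python lists stand in for Bob arrays; none of this is existing Bob API.

```python
def scale_c(input, output, factor=2.0):
    """C++-style binding (hypothetical): the caller preallocates `output`."""
    if len(input) != len(output):
        raise ValueError("input and output must have the same length")
    for i, value in enumerate(input):
        output[i] = factor * value  # write in place, return nothing


def scale(input, factor=2.0):
    """Pythonic binding (hypothetical): allocates and returns the result."""
    output = [0.0] * len(input)
    scale_c(input, output, factor)
    return output


data = [1.0, 2.0, 3.0]
buffer = [0.0] * len(data)
scale_c(data, buffer)  # fills `buffer` in place
result = scale(data)   # returns a new list
```

With distinct names, the number and meaning of the arguments no longer has to be guessed from the call, which is where the overloaded C++ signatures cause trouble.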
As previously discussed, the LBP code could be refactored into a single, more generic but parametrized class. (Already done by Manuel)
Learning from our previous experiences, I would be in favour of making a version number mandatory for any class that provides a 'save_to_hdf5' method.
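A minimal sketch of what a mandatory version number buys us is given below. The class `LinearMachine`, the method `load_from_hdf5`, the stored keys and the two layouts are all hypothetical, and a plain dict stands in for an open HDF5 file so that the example has no dependencies.

```python
class LinearMachine:
    """Hypothetical class; a dict stands in for an open HDF5 file."""

    FORMAT_VERSION = 2  # bump whenever the saved layout changes

    def __init__(self, weights, bias=0.0):
        self.weights = list(weights)
        self.bias = bias

    def save_to_hdf5(self, f):
        # The version is always written first, next to the data.
        f["version"] = self.FORMAT_VERSION
        f["weights"] = list(self.weights)
        f["bias"] = self.bias

    @classmethod
    def load_from_hdf5(cls, f):
        # Files written before versioning existed are treated as version 1.
        version = f.get("version", 1)
        if version == 1:
            return cls(f["weights"])  # assumed legacy layout: no bias stored
        if version == 2:
            return cls(f["weights"], f["bias"])
        raise ValueError("unsupported file version: %r" % (version,))


storage = {}
LinearMachine([1.0, 2.0], bias=0.5).save_to_hdf5(storage)
restored = LinearMachine.load_from_hdf5(storage)
legacy = LinearMachine.load_from_hdf5({"weights": [3.0]})
```

The loader can then keep reading old files after the layout changes, and it fails loudly on files written by a newer release.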
Add a tutorial for the Audio Processing module. (Already done by Elie)
Provide is_similar_to(const Object& b, const double epsilon=1e-8) functions for all C++ classes that have double members. The default "==" and "!=" operators are largely useless when we want to provide code that runs on several platforms (e.g., 32-bit and 64-bit machines), where floating-point results can differ slightly.
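To illustrate the difference between exact and tolerant comparison, here is a small Python sketch. The class `Gaussian` is hypothetical, and applying `epsilon` as both a relative and an absolute tolerance is an assumption; the proposal above does not say which one is meant.

```python
import math


class Gaussian:
    """Hypothetical class with floating-point members."""

    def __init__(self, mean, variance):
        self.mean = mean
        self.variance = variance

    def __eq__(self, other):
        # Exact comparison: sensitive to the last bit of every member.
        return self.mean == other.mean and self.variance == other.variance

    def is_similar_to(self, other, epsilon=1e-8):
        # Tolerant comparison; epsilon is used as relative and absolute
        # tolerance (an assumption made for this sketch).
        return (
            math.isclose(self.mean, other.mean, rel_tol=epsilon, abs_tol=epsilon)
            and math.isclose(self.variance, other.variance, rel_tol=epsilon, abs_tol=epsilon)
        )


a = Gaussian(0.1 + 0.2, 1.0)  # 0.1 + 0.2 is not exactly 0.3 in binary floating point
b = Gaussian(0.3, 1.0)
exactly_equal = (a == b)
similar = a.is_similar_to(b)
```

Here `exactly_equal` is False while `similar` is True, which is the behaviour we want in tests that must pass on more than one platform.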
Whenever a C++ class uses some random initialization, make sure that you can seed this randomness. (cf. issue #121 (closed))
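The seeding requirement can be sketched in Python as follows. The class `MeansInitializer` and its method are hypothetical; the point is only that the object owns its generator and that the seed is a constructor argument.

```python
import random


class MeansInitializer:
    """Hypothetical helper that draws initial means at random."""

    def __init__(self, seed=None):
        # A private generator keeps the class independent of the global
        # random state; seed=None falls back to a non-deterministic seed.
        self.rng = random.Random(seed)

    def initial_means(self, n_means, dim):
        return [
            [self.rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(n_means)
        ]


first = MeansInitializer(seed=42).initial_means(2, 3)
second = MeansInitializer(seed=42).initial_means(2, 3)
```

Two objects built with the same seed produce the same initialization, so a training run can be reproduced exactly.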
To integrate new features
A. It would be nice if the combination of NumPy/SciPy/Bob provided roughly the same functionality as the Matlab built-in functions. It would also be good to use similar function names, so that it is easy for a user to move across these platforms. A (too exhaustive) list can be found here: http://www.mathworks.ch/ch/help/matlab/functionlist.html
B. More Machine Learning algorithms:
B.1 In particular, to add a Hidden Markov Model implementation, as there was one in the late Torch 3.
B.2 To add a deformable, parts-based object recognition system (Felzenszwalb-like)
B.3 Integration of the i-vector framework (Already done by Laurent)
C. More Audio Processing features:
C.1 Possibility to compute and plot spectrograms (Already done)
C.2 Audio codec to deal with wav and sphere files
C.3 Provide a bridge to HTK
C.4 Boosted Binary Features (cf. Anindya's thesis)
D. Image Processing Tools
D.1 GLCM features (Grey-Level Co-occurrence Matrix) (Already done by Ivana)
E. Make compiling C++ bindings in satellite packages easier - we can move most of the functionality currently implemented in those packages, like https://github.com/bioidiap/xbob.optflow.liu, to the core of Bob. (done by André)
F. New metrics: F1-score, precision and recall; cost versus training set size - This should be simple and an excellent coding exercise. If anyone wants to give it a go, please let me know. More info: http://en.wikipedia.org/wiki/F1_score
G. Better use of an optimization library (L-BFGS-B) for the NNet backprop implementation - this is a somewhat larger piece of work that involves revisiting the NNet implementation to separate the optimizer from the trainer.
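For the metrics in item F, a minimal pure-Python sketch could look like this. The function name `precision_recall_f1` and its signature are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example, not an established Bob behaviour.

```python
def precision_recall_f1(true_labels, predicted_labels, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    tp = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1  # true positive
        elif guess == positive:
            fp += 1  # false positive
        elif truth == positive:
            fn += 1  # false negative
    # Undefined ratios are reported as 0.0 (a convention for this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


truth = [1, 1, 1, 0, 0, 1]
guess = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(truth, guess)
```

In this example there are 3 true positives, 1 false positive and 1 false negative, so precision, recall and F1 all come out at 0.75.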
To be more widely supported
- To make the Windows/Cygwin port functional (cf. issue #82 (closed))
- To generate RPM-like packages
- To generate a package for the TinyCore distribution: http://distro.ibiblio.org/tinycorelinux/downloads.html In particular, it would be nice to automate the process of generating a VirtualBox VDI with Bob installed, potentially based on the tiny Linux distribution TinyCore.