bob issues
https://gitlab.idiap.ch/bob/bob/-/issues

Issue #235: bob verification databases do not use the `original_directory` and `original_extension` parameters
https://gitlab.idiap.ch/bob/bob/-/issues/235 (Manuel Günther <siebenkopf@googlemail.com>, 2017-08-07)
Milestone: May 2017 Hackathon

Sorry that I saw this so late, after the new database packages have been published already.
I think that during the reimplementation of the databases, something got lost. In the old `bob.db.verification.database.Database` interface, at least two parameters were accepted: `original_directory` and `original_extension`, and there was a method called `original_file_names`, which used these parameters.
Now, this functionality seems to be completely lost. For example, `bob.db.mobio` has no way of getting the original file names, i.e., the `original_directory` and `original_extension` are not stored in the database anymore. On the other hand, you can still specify these parameters in the constructor:
https://gitlab.idiap.ch/bob/bob.db.mobio/blob/master/bob/db/mobio/query.py#L40
but they are not used anywhere in the code.
I know that most of this functionality was moved to `bob.bio.base.database.BioDatabase`. Hence, I see two different ways of handling this:
> 1. Leave the implementation in `bob.bio.base` and remove the unused keywords in the `bob.db` Database constructors. In this way, the `bob.db` databases do not have the capability to query their original data files.
> 2. Move the functionality of the old `bob.db.verification.utils.Database` into `bob.db.base` (and remove it from `bob.bio.base`). In this way, the databases themselves know their original data.
In a similar manner, the `annotations` functions inside the databases are handled inconsistently. When annotations are read from file (for example in `bob.db.mobio`), an implementation is provided in `bob.bio.base.database.BioDatabase`: https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L265, as well as in `bob.db.mobio`: https://gitlab.idiap.ch/bob/bob.db.mobio/blob/master/bob/db/mobio/query.py#L602, both of which use the same basic functionality: https://gitlab.idiap.ch/bob/bob.db.base/blob/master/bob/db/base/annotations.py#L35
Hence, to be consistent with option 1. above, we would probably want to *remove* this functionality from `bob.db.mobio`. In fact, in `bob.bio.face`, the `annotations` functionality inside `bob.db.mobio` is not used at all.
On the other hand, there are databases that store the annotations internally, such as `bob.db.gbu`: https://gitlab.idiap.ch/bob/bob.db.gbu/blob/master/bob/db/gbu/models.py#L51. For these databases, the `bob.bio.base.database.BioDatabase.annotations` function (https://gitlab.idiap.ch/bob/bob.bio.base/blob/master/bob/bio/base/database/database.py#L265) needs to be overridden in order to use the annotations from those databases. However, I cannot see this happening, e.g., in `bob.bio.face.database.GBUBioDatabase`: https://gitlab.idiap.ch/bob/bob.bio.face/blob/master/bob/bio/face/database/gbu.py#L16
Hence, for these databases there is currently **no way** to obtain the annotations from the original `bob.db` databases. Again, there are two solutions:
> A. Provide a default implementation for these cases in `bob.bio.base.database.BioDatabase.annotations`, e.g., by checking if the low-level database has an `annotations` function (see the sketch after this list).
> B. Provide these implementations in all derived classes from `BioDatabase`, where the low-level database has annotations stored internally.
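For illustration, a minimal sketch of option `A.` in Python, assuming a hypothetical `_database` member holding the low-level `bob.db` database and a hypothetical `_read_annotation_file` helper (the actual `BioDatabase` internals may differ):

```python
class BioDatabase:
    """Sketch of option A only; the real class lives in bob.bio.base."""

    def __init__(self, database):
        self._database = database  # hypothetical: the low-level bob.db database

    def annotations(self, file):
        if hasattr(self._database, 'annotations'):
            # the low-level database (e.g., bob.db.gbu) stores annotations
            # internally, so simply forward the call
            return self._database.annotations(file)
        # otherwise, fall back to reading annotation files from disk
        return self._read_annotation_file(file)

    def _read_annotation_file(self, file):
        # hypothetical helper standing in for the file-based implementation
        raise NotImplementedError
```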
I can check which of the `bob.db` databases are affected and open corresponding issues there. But first, we have to decide which way to go. I personally would vote for options `1.` and `A.`, as they would require the least modifications. But I can also see the benefits of options `2.` and `B.`, which require more work: `2.` would add more information to the low-level `bob.db` databases, and `B.` would be cleaner.
@amohammadi @andre.anjos @tiago.pereira @sebastien.marcel What is your opinion? Did I miss something here? Is `bob.db.gbu` (and others) really currently not working?

Issue #184: bob.sp.Quantization has weird border handling
https://gitlab.idiap.ch/bob/bob/-/issues/184 (André Anjos, 2015-08-18)

*Created by: siebenkopf*
By chance, I had a look at the ``bob.sp.Quantization`` class. It seems that this class has several issues, especially in border cases:
1. the ``__call__`` function returns 0 in two cases: when the element is in the first range, **or** when the element is below the lowest threshold
2. the ``__call__`` function returns the highest index in two cases: when the element is in the last range, **or** when the element is above the highest threshold
In fact, the two cases in point (2) cannot even be distinguished in the C++ implementation of the function, since the highest threshold is not even stored in the range of thresholds: 4 ranges require 5 thresholds, but this class holds only 4.
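To illustrate the bookkeeping, a small NumPy sketch (not the Bob implementation): with 4 ranges delimited by 5 thresholds, out-of-range elements can be told apart from elements in the first/last range:

```python
import numpy as np

# 4 quantization ranges require 5 thresholds; range i is [t[i], t[i+1])
thresholds = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

x = np.array([-0.1, 0.1, 0.6, 0.99, 1.5])
indices = np.digitize(x, thresholds) - 1
# -1 (below the lowest threshold) and 4 (above the highest threshold) are
# distinguishable from the valid range indices 0..3
print(indices)  # [-1  0  2  3  4]
```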
Issue #183: -DWITH_PERFTOOLS option does not work
https://gitlab.idiap.ch/bob/bob/-/issues/183 (André Anjos, 2019-04-19)
Milestone: v2.0

*Created by: laurentes*
It seems that this option does not work anymore on the master branch.
I don't know yet if this also affects the 1.2 branch.
The problem seems to be caused by the use of WITH_PERFTOOLS as a C-like defined variable, whereas it is initially a cmake variable.
The easiest solution is to perform the inclusion check at the cmake level rather than by the C preprocessor. A good example is what was done for libsvm.
Issue #179: No support of log determinant
https://gitlab.idiap.ch/bob/bob/-/issues/179 (André Anjos, 2014-01-04)
Milestone: v2.0

*Created by: laurentes*
As said [here](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.det.html), determinant computation is subject to underflow/overflow.
When computing the log determinant, this might be avoided by directly working in the log domain. In the Python universe, we may rely on numpy.linalg.slogdet(). We should consider providing such a function in the C++ universe.
Currently, the PLDAMachine class may be subject to underflow/overflow.
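For reference, the NumPy behaviour mentioned above:

```python
import numpy as np

A = np.eye(100) * 1e-160           # det(A) = 1e-16000 underflows a double
print(np.linalg.det(A))            # 0.0, so log(det(A)) would be -inf
sign, logdet = np.linalg.slogdet(A)
print(sign, logdet)                # 1.0 -36841.36..., computed in the log domain
```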
Issue #178: ROC and DET plots have wrong axis
https://gitlab.idiap.ch/bob/bob/-/issues/178 (André Anjos, 2016-08-04)

*Created by: siebenkopf*
I always found the way ROC and DET plots are drawn using ``bob.measure.plot.roc`` and ``bob.measure.plot.det`` wrong. For some reason, someone decided to plot FRR on the abscissa and FAR on the ordinate. I have never seen plots like this before; the "Handbook of Biometrics" shows ROC and DET plots with FAR on the abscissa (page 9), and Wikipedia pages also show DET and ROC plots this way.
Normally, I would not consider the way the ROC is plotted a bug. But in fact, I was using the function ``bob.measure.det`` and blindly expected to get FAR and FRR in this order. Unfortunately, this was not the case, which led to my plots being swapped (FAR was FRR and vice versa).
To conform with the gods of Biometrics, we should change both:
- the axis of the ROC / DET plots
- the order in which ``bob.measure.roc`` and ``bob.measure.det`` return the results.
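A sketch of the target convention with plain matplotlib and dummy FAR/FRR values (not the bob.measure API):

```python
import matplotlib.pyplot as plt
import numpy as np

# dummy trade-off curve: FRR decreases as FAR increases
far = np.linspace(1e-4, 1.0, 100)
frr = 1.0 - far ** 0.3

plt.plot(far, frr)                      # FAR on the abscissa,
plt.xlabel('False Acceptance Rate')     # FRR on the ordinate,
plt.ylabel('False Rejection Rate')      # as in the Handbook of Biometrics
plt.show()
```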
Issue #176: Shift to a non-GPL library for FFT/DCT computation
https://gitlab.idiap.ch/bob/bob/-/issues/176 (André Anjos, 2016-08-04)
Milestone: v2.0

*Created by: laurentes*
Bob (1.2.x) currently relies on FFTW for FFT/DCT computation. FFTW has a GPL license. We are now considering turning the license of Bob from GPL to BSD. This would imply that we should no longer link against GPL libraries. FFTW is the only GPL dependency that we have.
Furthermore, we are looking for alternatives to FFTW. There are already naive implementations of DFT/DCT in Bob, which are used for testing purposes, but they are really slow for large arrays. We are hence looking for more optimized source code. I have performed a few tests with two different BSD-like FFT libraries:
1. Kiss FFT (C and C++ implementations): I was not able to make the C implementation work with 'double' instead of the default 'float'; it just produces wrong outputs, and the documentation is quite poor. The C++ implementation works with 'double', but it only supports 1D FFT (no nD FFT or DCT computation). In addition, I still had to tweak/fix the code to make it compatible with all the platforms we support.
2. NumPy's FFT implementation (C code based on the former FFTPACK Fortran implementation; note that SciPy provides a different FFT implementation based on the original FFTPACK Fortran code): NumPy's implementation only provides FFT (not DCT). The code is much larger than that of kiss FFT (2k lines vs. 200 lines), but is probably more reliable, since it has been used for several years by this widely deployed library.
For both solutions, we won't add a new dependency; instead, we would just cannibalize the FFT source code into Bob's central repository (there are no Ubuntu/OS X packages for kiss FFT anyway). I have pushed FFT/DCT implementations into the master branch that rely on both libraries (separately), to keep track of all the tests I did. Solution 2. is my favourite so far. If we go for it, I will just remove the FFTW and kiss FFT-based implementations, and rename the FFT1DNumpy (2D/DCT, etc.) classes to FFT1D. The documentation should then be carefully updated. As the underlying implementations will be different, this may slightly affect the outputs/features/results generated with FFTW.
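The consistency check implied above, sketched in Python (NumPy stands in for the C++ code here; any replacement FFT should match the naive O(N²) DFT kept for testing):

```python
import numpy as np

def naive_dft(x):
    """Textbook O(N^2) DFT, analogous to the naive implementations
    kept in Bob for testing purposes."""
    n = np.arange(len(x))
    w = np.exp(-2j * np.pi * np.outer(n, n) / len(x))
    return w @ x

x = np.random.randn(64)
assert np.allclose(naive_dft(x), np.fft.fft(x))
```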
Issue #174: bob.ip.draw_... methods take arguments in wrong order
https://gitlab.idiap.ch/bob/bob/-/issues/174 (André Anjos, 2013-11-15)
Milestone: v2.0

*Created by: siebenkopf*
Usually, points in Bob are given in (y,x) order, and functions always take (y,x) as arguments (in this order). However, looking at the documentation of the bob.ip.draw_point (and similar) functions, one finds that they take arguments in (x,y) order.
A fix for this would be nice, to keep a consistent order of arguments in Bob.
Issue #169: bob.core.random has many classes for different data types
https://gitlab.idiap.ch/bob/bob/-/issues/169 (André Anjos, 2016-08-04)
Milestone: v2.0

*Created by: siebenkopf*
Having a look at the bob.core.random module, I can find several bindings of classes for different data types. Instead of having all these classes, I would suggest two solutions:
1. We have one class for each distribution type and a dtype-like parameter for the constructor.
2. We have only one class *overall*, having the dtype and the distribution type as parameters.
Either of these solutions will break the API, but I think we should avoid these data-type-specific classes and functions. In C++, these classes are templated anyway...
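A Python sketch of what solution `1.` could look like (hypothetical API, illustrative names only):

```python
import numpy as np

class uniform:
    """One class per distribution; the data type is a constructor
    parameter instead of being baked into the class name."""
    def __init__(self, dtype, low=0, high=1):
        self.dtype = np.dtype(dtype)
        self.low, self.high = low, high

    def __call__(self, rng):
        # dispatch on the dtype instead of exposing uniform_float64,
        # uniform_int32, ... as separate classes
        if self.dtype.kind == 'f':
            return self.dtype.type(rng.uniform(self.low, self.high))
        return self.dtype.type(rng.integers(self.low, self.high))

rng = np.random.default_rng(27)
u = uniform('float64', 0.0, 1.0)
print(u(rng))
```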
Issue #154: bob.k and others are fun but unexpected
https://gitlab.idiap.ch/bob/bob/-/issues/154 (André Anjos, 2013-07-23)

*Created by: khoury*
bob.k, bob.core.k, bob.core.random.k, bob.io.k, bob.ip.k, bob.sp.k, bob.measure.k and others are caused by the lines:

```python
__all__ = [k for k in dir() if not k.startswith('_')]
```

This should likely be replaced by:

```python
__all__ = dir()
```
Issue #151: Making python bindings more consistent when using blitz arrays and std::vector of blitz arrays
https://gitlab.idiap.ch/bob/bob/-/issues/151 (André Anjos, 2016-08-04)

*Created by: laurentes*
Our current Python bindings that rely on C++ methods/functions taking blitz arrays as arguments are quite heterogeneous. Ideally, we should proceed as follows:
Given a class:
```c++
class Myclass {
public:
void setW(const blitz::Array<double,1>& w) { m_w = w; }
const blitz::Array<double,1>& getW() { return m_w; }
private:
blitz::Array<double,1> m_w;
};
```
The python binding could be done as follows:
```c++
static void py_setW(bob::Myclass& m, bob::python::const_ndarray w) {
  m.setW(w.bz<double,1>());
}
class_<bob::Myclass, boost::shared_ptr<bob::Myclass> >("Myclass",
"This class implements ...", init<>())
.add_property("w", make_function(&bob::Myclass::getW, return_value_policy<copy_const_reference>()), &py_setW, "Parameters for ...")
```
For the getter, this will make a copy of the array and cast it into a NumPy array.
For the setter, this allows various types (NumPy arrays, Python lists) to be supported, and exceptions are managed by the bz<>() method.
For std::vector of blitz::Array's, the following could be done:
Given a class:
```c++
class Myclass {
public:
void setW(const std::vector<blitz::Array<double,1> >& w) { m_w = ... }
const std::vector<blitz::Array<double,1> >& getW() { return m_w; }
private:
std::vector<blitz::Array<double,1> > m_w;
};
```
The python binding could be done as follows:
```c++
static void py_setW(bob::Myclass& m, object w) {
  stl_input_iterator<bob::python::const_ndarray> dbegin(w), dend;
  std::vector<bob::python::const_ndarray> wdata(dbegin, dend);
  std::vector<blitz::Array<double,1> > wb;
  for(size_t i=0; i<wdata.size(); ++i)
    wb.push_back(wdata[i].bz<double,1>());
  m.setW(wb);
}
static object py_getW(bob::Myclass& m) {
  boost::python::list l;
  const std::vector<blitz::Array<double,1> >& w = m.getW();
  for(size_t i=0; i<w.size(); ++i)
    l.append(w[i]);
  return boost::python::tuple(l);
}
class_<bob::Myclass, boost::shared_ptr<bob::Myclass> >("Myclass",
"This class implements ...", init<>())
.add_property("w", &py_getW, &py_setW, "Parameters for ...")
```
This way, the setter allows heterogeneous Python types (NumPy array, Python list) and the getter relies on the copy automagically done from blitz to NumPy.
Issue #150: Turn within-class and between-class scatter matrices computation into a 'public' feature
https://gitlab.idiap.ch/bob/bob/-/issues/150 (André Anjos, 2016-08-04)
Milestone: v1.2

*Created by: laurentes*
The computation of the within-class and between-class scatter matrices is currently done in two different classes: FisherLDATrainer and WCCNTrainer. To avoid this duplication of code, we should move this feature into the math module of Bob.
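For reference, the shared computation in question, written out with NumPy (standard textbook definitions, not Bob's C++ code):

```python
import numpy as np

def scatter_matrices(data, labels):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    mu = data.mean(axis=0)
    d = data.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(labels):
        xc = data[labels == c]
        mu_c = xc.mean(axis=0)
        Sw += (xc - mu_c).T @ (xc - mu_c)                # within-class spread
        Sb += len(xc) * np.outer(mu_c - mu, mu_c - mu)   # between-class spread
    return Sw, Sb
```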
Issue #148: Binding for color conversions in bob.ip
https://gitlab.idiap.ch/bob/bob/-/issues/148 (André Anjos, 2013-07-01)
Milestone: v1.2

*Created by: siebenkopf*
Currently, we have color conversion functions bound to Python that depend on the data type. E.g., we have four different functions to convert RGB to gray:
- rgb_to_gray(image, image)
- rgb_to_gray_f(float, float, float)
- rgb_to_gray_u8(int, int, int)
- rgb_to_gray_u16(int, int, int)
At least for the latter three functions I would rather consider a more pythonic way, like
- rgb_to_gray(float, float, float, dtype)
so that this function can be called with any data type. During binding, one could select the appropriate C++ implementation...
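A sketch of the proposed pythonic entry point (illustrative only; the luma weights below are the common ITU-R 601 ones and may differ from Bob's C++ implementation):

```python
import numpy as np

def rgb_to_gray(r, g, b, dtype=np.float64):
    """Single entry point selecting the behaviour from the requested
    dtype, instead of separate _f/_u8/_u16 functions."""
    dtype = np.dtype(dtype)
    y = 0.299 * np.float64(r) + 0.587 * np.float64(g) + 0.114 * np.float64(b)
    if dtype.kind == 'u':
        y = np.round(y)  # integer outputs are rounded before casting
    return dtype.type(y)

print(rgb_to_gray(255, 128, 0, dtype=np.uint8))  # 151
```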
Issue #143: Provide is_similar_to() for all Machines in Bob
https://gitlab.idiap.ch/bob/bob/-/issues/143 (André Anjos, 2019-10-17)

*Created by: anjos*
> This issue was migrated from bug #104
Provide `is_similar_to(const Object& b, const double epsilon=1e-8)` functions for all C++ classes that have double members. The default `==` and `!=` operators are largely useless when we want to provide code that runs on several platforms (i.e., 32 and 64 bit machines).
> Note (AA): Many classes already implement that. As soon as the remaining classes
> get fixed, this bug can be closed.
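The pattern in question, sketched in Python (the C++ classes would compare each double member with the same kind of epsilon test):

```python
import numpy as np

class Machine:
    def __init__(self, weights, bias):
        self.weights = np.asarray(weights, dtype=np.float64)
        self.bias = float(bias)

    def is_similar_to(self, other, epsilon=1e-8):
        """Epsilon-based comparison: robust to the tiny numerical
        differences between 32- and 64-bit platforms, unlike ==."""
        return (np.allclose(self.weights, other.weights, atol=epsilon)
                and abs(self.bias - other.bias) <= epsilon)
```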
Issue #142: HDF5 files for Machines and version numbering
https://gitlab.idiap.ch/bob/bob/-/issues/142 (André Anjos, 2016-08-04)
Milestone: v2.0

*Created by: anjos*
> This issue was migrated from bug #104
Learning from our previous experiences, I would be in favour of making a version number mandatory for any class that provides a 'save_to_hdf5' method.
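What a mandatory version number could look like, sketched with h5py (the attribute name and layout are assumptions, not an existing Bob convention):

```python
import h5py
import numpy as np

FORMAT_VERSION = 1  # bumped whenever the on-disk layout changes

def save_to_hdf5(weights, path):
    with h5py.File(path, 'w') as f:
        f.attrs['version'] = FORMAT_VERSION  # assumed attribute name
        f['weights'] = weights

def load_from_hdf5(path):
    with h5py.File(path, 'r') as f:
        if f.attrs['version'] != FORMAT_VERSION:
            raise RuntimeError('unsupported machine file version')
        return np.array(f['weights'])

save_to_hdf5(np.ones(5), 'machine.hdf5')
print(load_from_hdf5('machine.hdf5'))
```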
Issue #134: FisherLDATrainer returns more than C-1 dimensions
https://gitlab.idiap.ch/bob/bob/-/issues/134 (André Anjos, 2016-08-04)

*Created by: siebenkopf*
From a theoretical point of view, the LDA projection matrix is limited to C-1 dimensions, where C is the number of classes in your problem. Nevertheless, the FisherLDATrainer returns N-1 dimensions, where N is the dimension of the feature vectors.
Having a look at a toy example, we found that only C-1 eigenvectors of LDA have eigenvalues higher than zero, while the rest are very close to zero, as expected from the theoretical point of view. Hence, it would make sense to limit the number of dimensions to C-1, since the remaining eigenvectors are subject to precision errors.
Anyhow, I found that these zero-eigenvalue eigenvectors are valuable. Hence, it would be nice to have an option to still retain them, while by default they should be removed.
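The toy observation can be reproduced with NumPy (generic LDA eigenproblem, using the scatter matrix definitions sketched under issue #150 above; not the FisherLDATrainer code):

```python
import numpy as np

rng = np.random.default_rng(0)
C, N = 3, 10                              # 3 classes, 10-dimensional features
data = rng.normal(size=(60, N))
labels = np.repeat(np.arange(C), 20)

mu = data.mean(axis=0)
Sw, Sb = np.zeros((N, N)), np.zeros((N, N))
for c in range(C):
    xc = data[labels == c]
    mu_c = xc.mean(axis=0)
    Sw += (xc - mu_c).T @ (xc - mu_c)
    Sb += len(xc) * np.outer(mu_c - mu, mu_c - mu)

# rank(Sb) <= C-1, so only C-1 = 2 eigenvalues are clearly above zero;
# the remaining ones are numerical noise
evals = np.linalg.eigvals(np.linalg.solve(Sw, Sb))
print(np.sort(np.abs(evals))[::-1])
```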
Issue #124: LBP implementation is overcomplicated
https://gitlab.idiap.ch/bob/bob/-/issues/124 (André Anjos, 2016-08-04)
Milestone: v1.2

*Created by: laurentes*
The current implementation of the LBP is made complicated by the use of the LBP abstract class. This class should likely be refactored and made parametrizable, to avoid the definition of the additional LBP4, LBP8 and LBP16 classes that bring an extra layer of complexity and make the code much more difficult to maintain.
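What a parametrizable LBP could look like, sketched in Python (circular neighbourhood with nearest-neighbour sampling for brevity; illustrative only, not the proposed C++ refactoring):

```python
import math

class LBP:
    """One class parametrized by the number of neighbours, replacing
    the separate LBP4/LBP8/LBP16 subclasses."""
    def __init__(self, n_points=8, radius=1.0):
        self.offsets = [(radius * math.sin(2 * math.pi * p / n_points),
                         radius * math.cos(2 * math.pi * p / n_points))
                        for p in range(n_points)]

    def __call__(self, image, y, x):
        center = image[y][x]
        code = 0
        for dy, dx in self.offsets:
            code = (code << 1) | (image[y + round(dy)][x + round(dx)] >= center)
        return code

lbp8 = LBP(n_points=8)    # replaces LBP8
lbp16 = LBP(n_points=16)  # replaces LBP16
```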
Issue #121: Random initialization of arrays is inconsistent
https://gitlab.idiap.ch/bob/bob/-/issues/121 (André Anjos, 2016-08-04)
Milestone: v1.2

*Created by: laurentes*
At the C++ level, there are several options to generate random numbers. Across the library, this is currently not consistent: We sometimes rely on blitz++ ranlib, sometimes on boost.
In addition, a few classes allow the user to set a boost random number generator, whereas others only allow setting a seed.
We have decided to follow this approach:
- Always use boost at the C++ level
- Classes that make use of random numbers should provide a way to set the boost random number generator.
We still have to discuss whether it is better to handle the boost random number generator through a reference or a boost::shared_ptr.
The goal is to converge to this design. This will involve:
- ~~Make the JFATrainer use boost rather than ranlib~~
- Remove the seed attribute from:
* ~~KMeansTrainer~~
* ~~PLDABaseTrainer~~ (done by @laurentes)
- Check if we keep using a reference (or a boost::shared_ptr) in the following classes
* MLP
* DataShuffler
Please be aware that this will slightly affect the results afterwards, as the initial random matrices will be different.
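The design question, illustrated in Python with numpy's Generator standing in for the boost RNG (one shared generator object instead of per-trainer seeds; illustrative names):

```python
import numpy as np

class KMeansTrainer:
    """Sketch: the trainer receives a shared RNG instead of owning a seed."""
    def __init__(self, rng):
        self.rng = rng

    def initialize(self, n_means, dim):
        return self.rng.normal(size=(n_means, dim))

rng = np.random.default_rng(5489)  # the user seeds a single generator...
t1 = KMeansTrainer(rng)
t2 = KMeansTrainer(rng)            # ...and all trainers draw from its stream
```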
Issue #119: Proper definition and usage of the abstract Trainer template class
https://gitlab.idiap.ch/bob/bob/-/issues/119 (André Anjos, 2016-08-04)
Milestone: v1.2

*Created by: laurentes*
For the major release 1.2.0, I would be in favour of consolidating the abstract Trainer class. There are trainer classes that do not inherit from it. In this case, inheritance might help us to uniformise the API.
Issue #118: Proper definition of the abstract Machine template class
https://gitlab.idiap.ch/bob/bob/-/issues/118 (André Anjos, 2016-08-04)
Milestone: v1.2

*Created by: laurentes*
For the major release 1.2.0, I would be in favour of clearly:
1. Defining what a machine is: To my mind, it is something that can be trained (like the term 'machine' in 'machine learning'), and that outputs something given some input.
2. Updating the current abstract class API. To my mind, a machine should have:
- forward methods
- load/save methods
- copy constructor, assignment operator, and comparison operators (==, != and is_similar_to)
Once done, we should update the 'machine' module accordingly. In addition, for each template, the generic Machine can be bound into Python, which will help us to have a more consistent API.
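The proposed API, sketched as a Python abstract base class (the real class would be a C++ template; the method names follow the list above):

```python
from abc import ABC, abstractmethod

class Machine(ABC):
    """Abstract Machine: trainable, and produces an output given an input."""

    @abstractmethod
    def forward(self, input):
        """Compute the machine's output for the given input."""

    @abstractmethod
    def load(self, hdf5_file):
        """Load the machine's parameters from an HDF5 file."""

    @abstractmethod
    def save(self, hdf5_file):
        """Save the machine's parameters to an HDF5 file."""

    def is_similar_to(self, other, epsilon=1e-8):
        """Epsilon-based comparison, complementing == and !=."""
        raise NotImplementedError
```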
Issue #116: I-vector extractor and ISV/JFA consolidation
https://gitlab.idiap.ch/bob/bob/-/issues/116 (André Anjos, 2016-08-04)

*Created by: laurentes*
As discussed in ticket #104 and internally at Idiap, I've finally started to work on an i-vector extractor. As there are strong links with Joint Factor Analysis, I will have to factorize some pieces of code (ISV/JFA machine and trainer). This will affect the API of the master branch.
After some preliminary work, this will involve:
- To define separate classes for ISV and JFA (for both machine and trainer)
- To rely on the newly defined functions from bob::core for the operator==(), operator!=() and is_similar_to() methods, which will make the code smaller and easier to maintain
- To define a helper method to be able to load ISV/JFA models previously saved
- To add a seed parameter (with getter/setter) to the trainers
- Wrt. API standardization, a machine must have a forward() method, as well as the possibility to be saved/loaded to/from an HDF5 file. I will hence update the bob::machine::Machine class accordingly. This way, we could set the API once (for the forward()/load()/save() methods and eventually others) for all the classes (of a specific template specialization). This would at the end lead to much less work to make sure that the API remains consistent across machines.
- Another point discussed was whether a trainer might change the hyperparameters of a machine. I'm against this option, and the optimal strategy is probably as follows: a machine has hyperparameters (features/subspace dimensionality, etc.). When a machine is passed to a trainer, these hyperparameters must have been set previously. This way, we clearly identify the duties of a trainer, and we don't have parameters that belong to both a machine and a trainer (see the sketch after this list).
- To add a tutorial on this part
- To update the docstring of the python bindings with references (likely to Crim's work and our work)
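The machine/trainer split referenced above, sketched in Python (illustrative names only):

```python
import numpy as np

class ISVMachine:
    def __init__(self, subspace_dim):
        # hyperparameters belong to the machine and are set up front
        self.subspace_dim = subspace_dim
        self.U = None  # learned parameters, filled in by the trainer

class ISVTrainer:
    """The trainer estimates parameters but never touches the
    machine's hyperparameters."""
    def train(self, machine, data):
        machine.U = np.zeros((data.shape[1], machine.subspace_dim))

machine = ISVMachine(subspace_dim=10)  # dimensionality fixed before training
ISVTrainer().train(machine, np.random.randn(100, 60))
```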