# bob issues
https://gitlab.idiap.ch/bob/bob/-/issues · André Anjos

## Issue #150: Turn within-class and between-class scatter matrices computation into a 'public' feature
https://gitlab.idiap.ch/bob/bob/-/issues/150 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
The computation of the within-class and between-class scatter matrices is currently done in two different classes: FisherLDATrainer and WCCNTrainer. To avoid this code duplication, we should move this feature into the math module of bob.

## Issue #148: Binding for color conversions in bob.ip
https://gitlab.idiap.ch/bob/bob/-/issues/148 · updated 2013-07-01 · milestone v1.2
*Created by: siebenkopf*
Currently, we have color conversion functions bound to python, which are dependent on the data type. E.g. we have four different functions to convert RGB to Gray:
- rgb_to_gray(image, image)
- rgb_to_gray_f(float, float, float)
- rgb_to_gray_u8(int, int, int)
- rgb_to_gray_u16(int, int, int)
At least for the latter three functions I would rather consider a more pythonic way, like
- rgb_to_gray(float, float, float, dtype)
so that this function can be called with any data type. During binding, one could select the appropriate C++ implementation...
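A sketch of what such a dtype-aware entry point could look like from the Python side (the body is illustrative: it stands in for dispatching to the type-specific C++ implementations, and it assumes the standard BT.601 luma weights, which may differ from what bob actually uses):

```python
import numpy

def rgb_to_gray(r, g, b, dtype=numpy.float64):
    """Single entry point standing in for the rgb_to_gray_f/_u8/_u16 variants."""
    dtype = numpy.dtype(dtype)
    gray = 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma weights (assumed here)
    if dtype.kind == 'f':
        return dtype.type(gray)          # would dispatch to the float C++ overload
    if dtype.kind == 'u':
        return dtype.type(round(gray))   # would dispatch to the uint8/uint16 overload
    raise TypeError("unsupported dtype: %s" % dtype)
```

Calling `rgb_to_gray(255, 255, 255, numpy.uint8)` would then select the `u8` implementation, keeping a single name in the public API.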
## Issue #135: Sphinx Autosummary and Bob's documentation
https://gitlab.idiap.ch/bob/bob/-/issues/135 · updated 2016-08-04 · milestone v1.2
*Created by: anjos*
I'd propose we stop having our manuals with every submodule including all methods and classes in that submodule, and went more like the manuals for NumPy and SciPy (http://docs.scipy.org/doc/numpy/reference/routines.math.html) in which every method or class gets its own dedicated page, with an upfront summary. This can be done with the sphinx autosummary extension (http://sphinx-doc.org/latest/ext/autosummary.html).
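Enabling this in our manuals would be a small Sphinx configuration change; a `conf.py` sketch (assuming the docs already use autodoc):

```python
# conf.py -- enable autosummary so every method/class gets its own stub page
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
]
autosummary_generate = True  # generate the per-object pages at build time
```

Each module page then lists its members through an `.. autosummary::` directive with the `:toctree:` option, and Sphinx writes the dedicated pages automatically.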
That should make it easier to browse and reference our documentation.

## Issue #131: Bob misses a naive Fisher LDA implementation
https://gitlab.idiap.ch/bob/bob/-/issues/131 · updated 2016-08-04 · milestone v1.2
*Created by: anjos*
The current implementation of FisherLDA in Bob uses Lapack's `dsygv`, which is supposed to be more numerically stable than `dsyevd` since it does not require inverting Sw. It can still fail under certain conditions. An alternative implementation based on `dsyevd` could use the pseudo-inverse of Sw instead of its inverse; this could be more robust, but slower, in certain cases.
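A numpy sketch of the pseudo-inverse variant (function name hypothetical; numpy's `pinv` is itself SVD-based):

```python
import numpy as np

def lda_directions_pinv(Sw, Sb, n_components):
    """LDA projections from eig(pinv(Sw) @ Sb): usable even when Sw is singular."""
    M = np.linalg.pinv(Sw) @ Sb           # pseudo-inverse instead of inv(Sw)
    eigvals, eigvecs = np.linalg.eig(M)   # M is not symmetric in general
    order = np.argsort(eigvals.real)[::-1]
    return (eigvals.real[order][:n_components],
            eigvecs[:, order][:, :n_components].real)
```

When Sw is well conditioned this agrees with the `dsygv`-style solution up to scaling of the eigenvectors.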
Lapack does not provide a pseudo-inverse function, but that should be easily implementable using QR factorization or SVD:
http://icl.cs.utk.edu/lapack-forum/archives/lapack/msg01395.html

## Issue #130: Bob misses a Covariance-based PCA trainer
https://gitlab.idiap.ch/bob/bob/-/issues/130 · updated 2019-07-16 · milestone v1.2
*Created by: anjos*
This should be relatively easy to implement and, as long as the number of training examples is greater than the number of features in each sample, it should produce faster results than the SVDPCATrainer. Memory-wise, it should be less efficient, though.

## Issue #128: MLP machine revamping for introducing new features
https://gitlab.idiap.ch/bob/bob/-/issues/128 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
We would like to revamp the MLP machine to have the following features:
- ~~Possibility to get the outputs of each layer (Both before and after applying the activation function)~~
- ~~Possibility to set a different activation function for the last/output layer (useful when using MLP for regression)~~
- ~~Possibility to do backward propagation directly using the machine (To avoid code duplication within the trainers)~~
~~In particular, this should simplify the definition of new trainers for this MLP machine, but will (slightly) increase the cost when processing data (saving intermediate outputs from each layer).~~ (fixed by adding a base MLP trainer class)

## Issue #124: LBP implementation is overcomplicated
https://gitlab.idiap.ch/bob/bob/-/issues/124 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
The current implementation of the LBP is made complicated by the use of the LBP abstract class. This class should likely be refactored and made parametrizable, to avoid the definition of the additional LBP4, LBP8 and LBP16 classes that bring an extra layer of complexity and make the code much more difficult to maintain.

## Issue #121: Random initialization of arrays is inconsistent
https://gitlab.idiap.ch/bob/bob/-/issues/121 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
At the C++ level, there are several options to generate random numbers. Across the library, this is currently not consistent: We sometimes rely on blitz++ ranlib, sometimes on boost.
In addition, a few classes allow the user to set a boost random number generator, whereas others only allow setting a seed.
We have decided to follow this approach:
- Always use boost at the C++ level
- Classes that make use of random numbers should provide a way to set the boost random number generator.
We still have to discuss whether it is better to handle the boost random number generator through a reference or a boost::shared_ptr.
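A minimal sketch of the `shared_ptr` option (the class name is hypothetical, and `std::mt19937`/`std::shared_ptr` stand in here for the boost equivalents, which have the same shape):

```cpp
#include <memory>
#include <random>

// Hypothetical class showing the shared_ptr option: the object shares one RNG
// with its caller instead of owning a private seed.
class DataShufflerLike {
public:
  explicit DataShufflerLike(std::shared_ptr<std::mt19937> rng)
    : m_rng(std::move(rng)) {}

  // Setting the generator (rather than a seed) lets several objects draw from
  // the same underlying random stream.
  void setRng(std::shared_ptr<std::mt19937> rng) { m_rng = std::move(rng); }

  double draw() {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    return u(*m_rng);
  }

private:
  std::shared_ptr<std::mt19937> m_rng;
};
```

With a plain reference, lifetime management falls entirely on the caller; the `shared_ptr` makes ownership explicit at a small runtime cost.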
The goal is to converge to this design. This will involve:
- ~~Make the JFATrainer use boost rather than ranlib~~
- Remove the seed attribute from:
* ~~KMeansTrainer~~
* ~~PLDABaseTrainer~~ (done by @laurentes)
- Check if we keep using a reference (or a boost::shared_ptr) in the following classes
* MLP
* DataShuffler
Please be aware that this will slightly affect the results afterwards, as the initial random matrices will be different.

## Issue #120: 2D PCA implementation is incomplete (and untested)
https://gitlab.idiap.ch/bob/bob/-/issues/120 · updated 2013-05-02 · milestone v1.2
*Created by: laurentes*
Ages ago, I've started to implement the 2D PCA algorithm described in the following paper:
[Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition, Yang et. al, TPAMI 2004](http://repository.lib.polyu.edu.hk/jspui/bitstream/10397/190/1/137.pdf).
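For reference, the core training step from that paper is short; a numpy sketch (function name hypothetical):

```python
import numpy as np

def train_2dpca(images, n_components):
    """2DPCA (Yang et al., 2004): eigenvectors of the image covariance matrix.

    images: array of shape (n_samples, height, width).
    Returns the projection matrix of shape (width, n_components).
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # Image covariance: average of A^T A over the centered samples (width x width).
    G = np.einsum('nij,nik->jk', centered, centered) / len(images)
    eigvals, eigvecs = np.linalg.eigh(G)        # ascending eigenvalue order
    return eigvecs[:, ::-1][:, :n_components]   # keep the top eigenvectors
```

Each image A is then projected as `A @ P`, keeping one feature vector per row of the image.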
This should be finalized or removed before the next major release.
More recent work on the Generalized PCA (e.g. Vidal and co-workers, http://cis.jhu.edu/~rvidal/) might be considered.

## Issue #119: Proper definition and usage of the abstract Trainer template class
https://gitlab.idiap.ch/bob/bob/-/issues/119 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
For the major release 1.2.0, I would be in favour of consolidating the abstract Trainer class. There are trainer classes that do not inherit from it; in this case, inheritance might help us to uniformise the API.

## Issue #118: Proper definition of the abstract Machine template class
https://gitlab.idiap.ch/bob/bob/-/issues/118 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
For the major release 1.2.0, I would be in favour of clearly:
1. Defining what a machine is: To my mind, it is something that can be trained (like the term 'machine' in 'machine learning'), and that outputs something given some input.
2. Updating the current abstract class API. To my mind, a machine should have:
- forward methods
- load/save methods
- copy constructor, assignment operator, and comparison operators (==, != and is_similar_to)
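The API above could be sketched as a template along these lines (a sketch, not the actual bob class):

```cpp
#include <string>

// Sketch of the abstract Machine interface described above (names illustrative).
// Copy construction, assignment, ==, != and is_similar_to are best declared on
// each concrete machine, since they compare parameters the base template does
// not know about.
template <typename TInput, typename TOutput>
class Machine {
public:
  virtual ~Machine() {}

  // forward: produce an output given some input
  virtual void forward(const TInput& input, TOutput& output) const = 0;

  // load/save the machine parameters (bob would pass an HDF5 file here)
  virtual void load(const std::string& path) = 0;
  virtual void save(const std::string& path) const = 0;
};
```

Binding this template once per (TInput, TOutput) pair is what would give the consistent python-side API mentioned below.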
Once done, we should update the 'machine' module accordingly. In addition, for each template, the generic Machine can be bound into python, which will help us to have a more consistent API.

## Issue #110: IP bug for rgb_to_hsl: returns NaNs
https://gitlab.idiap.ch/bob/bob/-/issues/110 · updated 2016-08-04 · milestone v1.2
*Created by: csmccool*
The floating point implementations of "rgb_to_hsl" return NaNs if you pass an RGB array of ones (1., 1., 1.); however, the integer-based methods don't seem to have this issue. Below are some examples of the problem using the python interface:
```python
import bob
import scipy
bob.ip.rgb_to_hsl(scipy.array([[[1.]], [[1.]], [[1.]]]))  # using a scipy or numpy array
```
RETURNS
```python
array([[[ nan]],
[[ nan]],
[[ 1.]]])
```
While
```python
bob.ip.rgb_to_hsl_f(1.,1.,1.)
```
RETURNS
```python
(nan, nan, 1.0)
```
The integer based methods seem to be ok as can be seen below:
```python
bob.ip.rgb_to_hsl_u8(255, 255, 255)
# returns (0, 0, 255)
bob.ip.rgb_to_hsl_u16(65535, 65535, 65535)
# returns (0, 0, 65535)
```
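For reference, the NaNs come from dividing by `max - min`, which is zero for achromatic pixels; a guarded pure-Python version of the float conversion (a sketch of the expected behaviour, not the bob implementation):

```python
def rgb_to_hsl_guarded(r, g, b):
    """RGB -> HSL for floats in [0, 1], returning h = s = 0 for achromatic input."""
    mx, mn = max(r, g, b), min(r, g, b)
    l = (mx + mn) / 2.0
    if mx == mn:                 # achromatic: hue/saturation are undefined -> 0
        return (0.0, 0.0, l)
    d = mx - mn
    s = d / (2.0 - mx - mn) if l > 0.5 else d / (mx + mn)
    if mx == r:
        h = ((g - b) / d) % 6.0
    elif mx == g:
        h = (b - r) / d + 2.0
    else:
        h = (r - g) / d + 4.0
    return (h / 6.0, s, l)
```

With the `mx == mn` guard, an all-ones input yields `(0.0, 0.0, 1.0)` instead of NaNs, matching what the integer-based methods already do.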
Cheers,
Chris.

## Issue #105: Distance in KMeansMachine is square Euclidean
https://gitlab.idiap.ch/bob/bob/-/issues/105 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
Preparing the Bob tutorial, I've just noticed that the distance returned by the getDistanceFromMean() method of the KMeansMachine class is not the Euclidean distance, but the squared Euclidean distance. This is not a real problem because of the bijection between the two, but to my mind, this needs to be updated.
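The intended behaviour is just a square root over the current return value; a numpy sketch (function name hypothetical):

```python
import numpy as np

def distance_from_mean(x, mean):
    """Euclidean distance: the sqrt of what getDistanceFromMean() currently returns."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))
```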
For this purpose, we need to update/add tests, and make sure that it does not affect our baseline results.

## Issue #103: Cepstral features finalization
https://gitlab.idiap.ch/bob/bob/-/issues/103 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
In order to finalize the Cepstral features extraction, there is a need to:
- Add default values in constructor (+python bindings) (done by @laurentes but please check default values)
- Make parameters names comparable to the ones from existing software such as HTK
- Document methods in header and add inline comments directly in the code when required
- Add more tests with various parameters
- Add copy constructor, assignment operator, comparison operators

## Issue #94: Some mismatch in EER between bob and Bosaris
https://gitlab.idiap.ch/bob/bob/-/issues/94 · updated 2016-08-04 · milestone v1.2
*Created by: khoury*
The EER computed by bob_compute_perf.py is a little bit higher than the one obtained with the Bosaris Toolkit.
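Part of any such mismatch is conventional: the EER depends on how the operating point is picked where FAR and FRR cross, and different toolkits interpolate differently there. A minimal illustration of one common convention (scan candidate thresholds, take the one minimizing |FAR - FRR|, and average the two rates; this is an illustration, not the bob_compute_perf.py code):

```python
import numpy as np

def eer(negatives, positives):
    """EER by scanning candidate thresholds (O(n^2), written for clarity)."""
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    best = None
    for t in np.concatenate([negatives, positives]):
        far = np.mean(negatives >= t)   # impostors accepted at threshold t
        frr = np.mean(positives < t)    # genuine samples rejected at threshold t
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]
```

With finite score sets the curves cross between discrete points, so a tool that interpolates (as Bosaris does on its ROCCH) can legitimately report a slightly different value than one that picks a measured threshold.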
Here is the link for Bosaris Toolkit:
https://sites.google.com/site/bosaristoolkit/

## Issue #86: Arrayset new features and improvements
https://gitlab.idiap.ch/bob/bob/-/issues/86 · updated 2016-08-04 · milestone v1.2
*Created by: laurentes*
There are two new features that would be nice for the bob.io.Arrayset class.
1. There is currently no way to clear the content of an arrayset. The two options are a. to iterate over and delete the elements one by one or b. to delete the python object. It would be nice to introduce a clear() method.
2. There are only two possible states for an Arrayset:
* ``external`` where the content is in a file
* ``inlined`` where the content is stored in a set of bob.io.Arrays (as many as the number of samples in the Arrayset).
The ``inlined`` version might introduce a significant overhead when the Arrayset consists of many small Arrays, as in this case there are as many Array headers as samples.
One solution would be to have a third state which stores the samples in a single Array, and where individual samples are obtained by slicing this Array over the third dimension.
Below is an example to highlight the problem:
Firstly, we allocate 50 samples of dimensions 1024*1024 double's, that is 400MBytes overall.
```python
import numpy, bob
A=bob.io.Arrayset(numpy.random.rand(50,1024*1024))
```
The memory usage reported (using command line tool "top") is roughly 400MB.
Secondly, we allocate 1024*1024 samples of dimensions 50 double's, that is 400MBytes overall as well.
```python
import numpy, bob
A=bob.io.Arrayset(numpy.random.rand(1024*1024,50))
```
The memory usage reported is about 800MB, which means an overhead of 100%. This happens in particular in common UBM/GMM experiments, so it would be nice to find a fix.
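The proposed third state amounts to keeping one contiguous array and handing out views; in numpy terms (a sketch of the idea, not an Arrayset API, with dimensions scaled down from the example above):

```python
import numpy as np

# One contiguous block holding all samples: a single array header for the whole set.
data = np.random.rand(1000, 50)

# Slicing out sample i returns a view: no copy and no per-sample header to keep,
# unlike the current `inlined` state which holds one bob.io.Array per sample.
sample = data[123]           # shape (50,), shares memory with `data`
assert sample.base is data   # it is a view, not a copy
```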
Any suggestion is welcome.

## Issue #27: Usage of abbreviations in namespaces
https://gitlab.idiap.ch/bob/bob/-/issues/27 · updated 2016-08-04 · milestone v1.2
*Created by: siebenkopf*
When I try to read some of Bob's C++ code, I often stumble over an abbreviated namespace (like tp for bob::python). Also in the python code, packages are sometimes abbreviated or, even worse, "from xxx import yyy as zzz" is used. This makes the code really hard to read, since I always have to scroll up to the definition of the abbreviation to understand what is actually being called.
I would definitely go for removing all these abbreviations and using the fully qualified names. I know that writing code this way may take a little longer, but I think it is worth the time. And most editors provide automatic code completion, which speeds up typing.
What do you think about this?