beat.backend.python issues
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues
2019-04-11T08:58:17Z
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/21
Add BEAT classifier to setup.py
2019-04-11T08:58:17Z
Samuel GAIST
The classifier has been added to PyPI, so it can now be used.
See #19
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/22
Update schema tests to conform to beat.core!65
2019-04-25T07:08:43Z
Jaden DIEFENBAUGH
The algorithm/plotter schemas have been refined in beat.core!65 to more closely reflect the actual restrictions of these object types. This means that some previously valid metadata tests are now invalid and need to be updated.
This is blocking beat.core!65 (see the [note about the failing test](https://gitlab.idiap.ch/beat/beat.core/merge_requests/65#note_41115) for info on how this was found)
Jaden DIEFENBAUGH
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/23
Improve storage classes
2019-05-08T08:53:02Z
Samuel GAIST
The Storage and CodeStorage classes serve parallel goals and part of their code is exactly the same, yet they are two distinct classes.
Fix this by creating a common base class so that they share the configuration member variable and their checks.
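The proposed base class could be sketched like this (class and attribute names here are illustrative assumptions, not the actual beat.backend.python API):

```python
import os


class BaseStorage:
    """Hypothetical common base holding the shared configuration and checks."""

    def __init__(self, path):
        self.path = path
        self.config = {}  # the shared configuration member variable

    def exists(self):
        # shared validity check, instead of duplicating it in each subclass
        return os.path.exists(self.path)


class Storage(BaseStorage):
    """Data storage: only what differs from the base remains here."""


class CodeStorage(BaseStorage):
    """Code storage: adds code-specific behaviour on top of the base."""

    def __init__(self, path, language="python"):
        super().__init__(path)
        self.language = language
```

With this layout, the duplicated configuration and check code lives in one place and each subclass only keeps its specific behaviour.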
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/24
Assert usage cleanup
2019-05-10T08:16:51Z
Samuel GAIST
Following a bandit warning about the usage of assert in code, this issue tracks the cleanup of those statements found in beat.backend.python.
Relates to beat/beat.core#72
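For illustration, the typical rewrite for bandit's B101 warning replaces `assert` with an explicit exception, since asserts are stripped when Python runs with `-O` (the function name here is a made-up example):

```python
def write_data(data):
    # before: assert data is not None, "no data to write"
    # after: an explicit check that survives `python -O`
    if data is None:
        raise RuntimeError("no data to write")
    return len(data)
```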
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/25
Handle duplicate key in json data
2019-06-17T05:48:34Z
Samuel GAIST
Currently, loading a JSON file that contains the same key multiple times will result in the last entry being used.
To avoid getting strange results, implement a hook that will raise an error and stop there.
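The standard library already supports such a hook via the `object_pairs_hook` parameter of `json.load`/`json.loads`; a minimal sketch:

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently keeping the last value."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError("duplicate key in JSON data: {}".format(key))
        result[key] = value
    return result


# valid input loads normally; a duplicated key raises ValueError
data = json.loads('{"a": 1, "b": 2}', object_pairs_hook=reject_duplicates)
```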
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/26
Validation fails with utf-8 error
2019-09-25T09:25:30Z
Samuel GAIST
Under some circumstances, the values sent for validation may trigger a "utf-8" decoding error.
This comes from the fact that all zmq packets received are decoded before being passed to the callbacks corresponding to the received command.
The solution is to pass the received data as-is to the callbacks, which are then responsible for decoding it if needed.
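The idea can be illustrated without zmq (the command names and dispatcher below are made-up examples): raw bytes are handed to each callback, and only callbacks that know their payload is text decode it.

```python
def on_text_command(payload: bytes):
    # this callback knows its payload is text, so it decodes it itself
    return payload.decode("utf-8")


def on_binary_command(payload: bytes):
    # this callback keeps raw bytes; decoding e.g. b"\xff" would raise
    return payload


CALLBACKS = {b"text": on_text_command, b"data": on_binary_command}


def dispatch(command: bytes, payload: bytes):
    # pass the received frames through untouched, no blanket decode here
    return CALLBACKS[command](payload)
```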
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/27
Improve loop evaluator synchronized write
2019-10-01T10:08:15Z
Samuel GAIST
The loop evaluator write method is called once per output write. However, a processor block may have several outputs, which means that the evaluator write method will be called as many times as an output is written. So in the case of a sequential processor algorithm with 2 outputs and 3 input data, the evaluator write method will be called 6 times.
After talking with @andre.anjos, the currently proposed solution is to add the name of the output written to as a parameter of the evaluator write method, so that the developer can choose which output to "synchronize" the evaluator on.
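A sketch of the proposed change (class, method, and parameter names are assumptions for illustration, not the real API): the evaluator receives the name of the output that was written and only reacts to the one it synchronizes on.

```python
class LoopEvaluator:
    """Hypothetical evaluator that synchronizes on a single chosen output."""

    def __init__(self, sync_output):
        self.sync_output = sync_output
        self.calls = 0

    def write(self, data, output_name):
        # called once per output write; only the chosen output triggers work
        if output_name != self.sync_output:
            return
        self.calls += 1


evaluator = LoopEvaluator(sync_output="out1")
for _ in range(3):                   # 3 input data units
    for name in ("out1", "out2"):    # 2 outputs per unit -> 6 write calls
        evaluator.write({"value": 0}, output_name=name)
# evaluator only reacted 3 times, once per data unit
```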
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/28
License file location is wrong in conda recipe
2019-11-14T15:58:26Z
André Anjos
It should read `LICENSE` instead of `../LICENSE`.
The same should be propagated to all relevant packages.
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/30
Error when waiting on a loop that wasn't started
2020-02-06T17:36:09Z
Samuel GAIST
If, for some reason, the loop (or database) `process` method was not called before `wait`, a runtime error will occur because the message handler was not started as the current code expects.
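A defensive sketch of the fix (class and attribute names are illustrative): `wait` checks whether the message handler was actually started before joining it.

```python
class Loop:
    """Minimal sketch of the start/wait life cycle."""

    def __init__(self):
        self.message_handler = None

    def process(self):
        # normally this starts the message handler
        self.message_handler = object()  # stand-in for a started handler

    def wait(self):
        if self.message_handler is None:
            # nothing was started; return gracefully instead of crashing
            return False
        # ... join the handler here ...
        return True
```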
Soft loops
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/31
Support bytes python type
2020-05-27T16:23:17Z
Samuel GAIST
Currently there's no way to send bytes directly from one algorithm to another.
The current "workaround" is to build the bytes output, base64-encode it, use a string-based type to write the data to the output, and on the other end load it back from the string.
Example of working code:
Output:
```python
import base64
import pickle

obj = dict(field1=data.value * 2, field2=data.value)
dumped = pickle.dumps(obj)          # binary
encoded = base64.b64encode(dumped)  # base64 bytes
string = encoded.decode("ascii")    # "stringified"
outputs["out_data"].write({"value": string})
```
**WARNING** Do **not** call `str(encoded)`; it won't be a "real string":
```python
example = b"whatever"
example_str = str(example)
# example_str is "b'whatever'", not "whatever"
```
Input:
```python
import base64
import pickle

in_data = inputs["in_data"].data           # string data
decoded = base64.b64decode(in_data.value)  # decode to bytes
obj = pickle.loads(decoded)                # obj is a dict and can be read
```
This issue tracks the implementation of the support for the python bytes type.
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/32
Multiprocessing support for data sources
2020-06-08T14:00:57Z
André Anjos
As discussed in today's debugging session with @samuel.gaist and @amohammadi, using a `DataLoader` object in a multiprocessing context is hard:
1. Typically, the underlying `DataSource`'s `fileobj`s are already open by the time the process is forked
2. Deep copying the object (which goes through pickling and unpickling it) does not properly reset the underlying `fileobj` pointers, which makes multiple processes access the same underlying OS-level file handle, causing unwanted behaviour.
To sort this out, we discussed 2 possible additions to this package:
1. `DataLoader` should have a `reset()` method that resets all underlying `DataSource` opened files, so that they can be correctly copied across multiple processes (e.g. in the event of a `fork()`). It should be relatively easy to do a `reset()` operation across all inputs of a user algorithm, to ensure all data sources are properly reset before an eventual user-guided `fork()`.
2. The underlying `DataSource` should have its pickle/unpickle behaviour patched (via overwriting the `__setstate__` slot of `DataSource`, see reference below), so that unpickling a data source (e.g. indirectly via a data loader deep copy), will call `self.reset()` after its state is unpickled. This would allow a `DataLoader` object to be sent over current mechanisms for inter-process communication (e.g. MPI or `multiprocessing.Queue`), transparently.
References:
* Python fileobj handling: https://stackoverflow.com/questions/1834556/does-a-file-object-automatically-close-when-its-reference-count-hits-zero
* Pickle user guide (see in particular `__getstate__` and `__setstate__` on how to overwrite the pickle/unpickle actions): https://docs.python.org/3/library/pickle.html#object.__getstate__
* On sharing (opened) file pointers in a POSIX system after a `fork()` is issued: https://stackoverflow.com/questions/33899548/file-pointers-after-returning-from-a-forked-child-process
Samuel GAIST
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/33
Add multi-processing support to RemoteDataSource
2020-06-08T14:00:56Z
Samuel GAIST
Following #32
The current implementation of the processing block to database container connection uses a one-to-one connection through a ZMQ exclusive pair.
In order to also allow people to use multiprocessing with a dataset as input to a block, this must be changed to something different: essentially multiple clients to one server in request/response mode.
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/34
The database View class could be simplified
2020-06-29T12:45:33Z
Amir MOHAMMADI
I don't know the internals of BEAT exactly, but I wonder why the database View class needs to be this complicated.
There may be some technical reasons behind this, but I don't think they should be exposed to the users.
These Views are just containers as far as I can see. So, I wonder if having something like:
```python
import pandas as pd


class Train:
    """The training set"""

    def __init__(self, parameters=None, root_folder=None):
        self.parameters = parameters
        self.root_folder = root_folder
        # initialize the db interface here, e.g. load a csv file
        self.df = pd.read_csv(...)

    def __len__(self):
        # return the total number of items in here
        return len(self.df)

    def __getitem__(self, index):
        # return the nth row in here
        row = self.df.iloc[index]
        # load the data here if necessary
        row.image = load(row.filename)
        return row
```
would not satisfy the requirements.
Interactivity and Intuitive API
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/35
BEAT components are tied to a prefix
2022-03-03T17:35:06Z
Amir MOHAMMADI
Throughout all code and components of BEAT, a prefix is required, and this requirement makes it impossible to define and run BEAT experiments interactively.
Here is a tentative plan for refactoring the code:
1. [ ] Update BEAT component classes so that they can be created on the fly without pointing to a prefix
2. [ ] Implement a global config object to keep track of the user's config, such as where the prefix is or what the username is. This will help users provide less information when creating objects on the fly.
3. [ ] Dynamically create experiments/toolchains by running Python code in a kind of graph mode. This will be similar to how graphs are constructed in Python using TensorFlow or Dask.
4. [ ] We would also need a singleton class to hold the prefix objects in memory, to avoid passing around caches.
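The global config object mentioned above could be sketched as a singleton like this (the class and its attributes are assumptions for illustration):

```python
class Config:
    """Hypothetical global configuration singleton (a sketch, not a real API)."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # defaults; set once, shared by every Config() call afterwards
            cls._instance.prefix = None
            cls._instance.user = None
        return cls._instance
```

Every `Config()` call returns the same object, so components created on the fly can look up the prefix or username without it being passed around explicitly.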
https://gitlab.idiap.ch/beat/beat.backend.python/-/issues/36
Editing the json file of database views does not invalidate the cache.
2020-07-28T09:45:09Z
Amir MOHAMMADI
Steps to reproduce:
```
$ beat cache clear
$ beat exp pull amohammadi/tutorial/eigenface/1/atnt-eigenfaces-67-comp
$ beat exp run amohammadi/tutorial/eigenface/1/atnt-eigenfaces-67-comp
Index for database atnt/5 not found, building it
lib/python3.7/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
"Distutils was imported before Setuptools. This usage is discouraged "
Index for database atnt/5 not found, building it
Index for database atnt/5 not found, building it
Running `amohammadi/pca/1' for block `linear_machine_training'
Start the execution of 'amohammadi/pca/1'
Block did not execute properly - outputs were reset
Standard output:
Standard error:
Captured user error:
File "lib/python3.7/site-packages/bob/io/base/__init__.py", line 143, in load
return File(inputs, 'r').read()
RuntimeError: File - constructor: C++ exception caught: 'file '/path_to_db_folder/att_faces/s1/1.pgm' is not readable'
Captured system error:
Error: Error occured: returned value is 1
Removing cache files: No data written
```
* change the json file `prefix/databases/atnt/5.json` of the database and fix the `root_folder`
* run the experiment again:
```
$ beat exp run amohammadi/tutorial/eigenface/1/atnt-eigenfaces-67-comp
RuntimeError: File - constructor: C++ exception caught: 'file '/path_to_db_folder/att_faces/s1/1.pgm' is not readable'
```
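One possible direction (an illustrative sketch, not the current implementation, and the helper name is made up) is to key the index cache location on a hash of the database definition file, so that editing `5.json` automatically produces a new index path and forces a rebuild:

```python
import hashlib


def index_cache_path(cache_root, db_name, db_json_text):
    """Derive the index location from the database definition contents."""
    digest = hashlib.sha256(db_json_text.encode("utf-8")).hexdigest()[:16]
    return "{}/{}.{}.index".format(cache_root, db_name, digest)


path_before = index_cache_path("cache", "atnt/5", '{"root_folder": "/wrong"}')
path_after = index_cache_path("cache", "atnt/5", '{"root_folder": "/fixed"}')
# editing the json yields a different path, so the stale index is never reused
```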
Samuel GAIST