bob.io.base issues (https://gitlab.idiap.ch/bob/bob.io.base/-/issues)

Issue #16: Dictionary methods of HDF5File use absolute keys (https://gitlab.idiap.ch/bob/bob.io.base/-/issues/16), reported by Manuel Günther (siebenkopf@googlemail.com)

In the new dictionary methods included by @amohammadi some months ago via !18, the `HDF5File.keys` function is used with its default parameters: https://gitlab.idiap.ch/bob/bob.io.base/blob/d82af062412ea637d203556628c34fa65a389e28/bob/io/base/__init__.py#L42
which specify that the keys are returned as absolute paths:
https://www.idiap.ch/software/bob/docs/bob/bob.io.base/stable/py_api.html#bob.io.base.HDF5File.keys
Relatedly, the `sub_groups` function by default returns absolute paths and iterates recursively through the sub-groups:
https://www.idiap.ch/software/bob/docs/bob/bob.io.base/stable/py_api.html#bob.io.base.HDF5File.sub_groups
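To make the defaults concrete, here is a small illustration (the file name and layout are hypothetical; the `relative` flag is the one discussed in question 2 below):
```
import bob.io.base

# hypothetical file containing a single dataset "d" inside group "/g"
hdf5 = bob.io.base.HDF5File("example.hdf5", 'r')
hdf5.cd("g")
print(hdf5.keys())               # default: absolute paths, e.g. ['/g/d']
print(hdf5.keys(relative=True))  # relative to the current group: ['d']
```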
I never liked these default values. When building hierarchical HDF5 files (i.e., writing all sub-classes of a class into HDF5), I always use relative paths, e.g.: https://gitlab.idiap.ch/bob/bob.bio.video/blob/master/bob/bio/video/utils/FrameContainer.py#L39 (this is an example of `sub_groups`, but the `keys` function has a similar issue).
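For reference, a minimal sketch of that hierarchical pattern (file name, group names, and data are made up; it assumes the `create_group`/`cd` methods of `HDF5File` and the `relative`/`recursive` flags of `sub_groups` referenced in question 2 below):
```
import bob.io.base

# write one sub-group per frame (hypothetical layout)
hdf5 = bob.io.base.HDF5File("frames.hdf5", 'w')
for i in range(3):
    group = "Frame_%d" % i
    hdf5.create_group(group)
    hdf5.cd(group)
    hdf5.set("Data", range(10))
    hdf5.cd("..")
del hdf5  # close the file

# read back: the loop needs relative, non-recursive group names
hdf5 = bob.io.base.HDF5File("frames.hdf5", 'r')
for group in hdf5.sub_groups(relative=True, recursive=False):
    hdf5.cd(group)
    data = hdf5.get("Data")
    hdf5.cd("..")
```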
As another example, I often write dictionaries to HDF5. I hope the following code snippet will show the issue better:
```
import bob.io.base

# write a flat dictionary into an HDF5 file
hdf5 = bob.io.base.HDF5File("x.hdf5", 'w')
d = dict(key1=1, key2=2)
for k, v in d.items():
    hdf5[k] = v

# read it back through the dictionary interface
e = {k: v for k, v in hdf5.items()}
print(d.keys(), e.keys())
```
which will print ``['key2', 'key1'] ['/key1', '/key2']``. Hence, writing and reading work differently in this case.
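Until the defaults change, a possible workaround is to request relative keys explicitly; a minimal sketch, assuming the `relative` flag of `keys` behaves as documented and that the dictionary-style read access from !18 mirrors the write access:
```
import bob.io.base

# re-read the dictionary with relative keys so it matches what was written
hdf5 = bob.io.base.HDF5File("x.hdf5", 'r')
e = {k: hdf5[k] for k in hdf5.keys(relative=True)}
print(e.keys())  # should now match the written keys: ['key1', 'key2']
```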
So, my questions are:
1. Shall we use relative keys when we iterate over the file? If not, this functionality is of little practical use.
2. Shall we change the default values in `sub_groups` and `keys` to match the expected behavior, i.e., `relative=True` and `recursive=False`?
ping @bob

Issue #7: What is the use of PyBobIo_FilenameConverter (https://gitlab.idiap.ch/bob/bob.io.base/-/issues/7), reported by André Anjos

*Created by: siebenkopf*
I have recently seen that there is a filename converter that converts a Python ``str`` into a ``PyBytesObject*`` (in py3) and into a ``PyStringObject*`` (in py2).
https://github.com/bioidiap/bob.io.base/blob/master/bob/io/base/file.cpp#L70
Afterwards, the returned value is converted to ``char*``, as here:
https://github.com/bioidiap/bob.io.base/blob/master/bob/io/base/file.cpp#L119
where, once again, py2 and py3 have to be handled differently.
However, I don't see why we do not use the ``"s"`` format unit, which handles both ``str`` and ``unicode`` in py2:
https://docs.python.org/2/c-api/arg.html
and ``str`` (which is the same as ``unicode``) in py3:
https://docs.python.org/3.5/c-api/arg.html
```
char* filename = 0;
/* "s" accepts str in py3 (and both str and unicode in py2), yielding a char* directly */
if (!PyArg_ParseTupleAndKeywords(args, kwds, "s", kwlist, &filename)) return -1;
```
So, am I missing something, or is this converter simply useless?

Issue #20: HDF5 contents not available until file is actually closed (https://gitlab.idiap.ch/bob/bob.io.base/-/issues/20), reported by Manuel Günther (siebenkopf@googlemail.com)
This is more of a question than a bug report, but it might still result in an action.
I have a process that writes a large HDF5 file using `bob.io.base.HDF5File`. I want to check with an external tool (e.g., `h5ls`) which entries have already been written into the file. Unfortunately, `h5ls` does not work; it shows the error `unable to open file`, although the file on disk is already several gigabytes large.
I have written a small test script that demonstrates a possible solution, i.e., using the `flush` operation:
```
import bob.io.base
import subprocess

# write a first dataset, but do not flush yet
h = bob.io.base.HDF5File("test.hdf5", 'w')
h.set("Data", range(10))
print("Before flush:")
subprocess.call(['h5ls', 'test.hdf5'])

# flush makes the current contents visible to external tools
h.flush()
print("After flush:")
subprocess.call(['h5ls', 'test.hdf5'])

# data added after the flush is again invisible externally
h.set("Data2", range(10, 20))
print("After new data add:")
subprocess.call(['h5ls', 'test.hdf5'])

# deleting the object closes the file, which flushes everything
del h
print("After delete:")
subprocess.call(['h5ls', 'test.hdf5'])
```
The output is:
```
Before flush:
test.hdf5: unable to open file
After flush:
Data Dataset {10}
After new data add:
Data Dataset {10}
After delete:
Data Dataset {10}
Data2 Dataset {10}
```
Hence, to be able to see the contents of the file via `h5ls` (or similar), we need to `flush` the content. My question would be: should we automatically flush after we have added or changed the contents of the file? Is there any reason (for example, that the `flush` operation might be expensive) not to `flush` every time, @andre.anjos?
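For illustration, automatic flushing could look roughly like the following composition-based wrapper; the class name and interface are hypothetical and not part of bob.io.base:
```
import bob.io.base

class AutoFlushWriter(object):
    """Hypothetical helper: flushes after every write so that external
    tools such as h5ls always see the current contents."""

    def __init__(self, filename):
        self._hdf5 = bob.io.base.HDF5File(filename, 'w')

    def set(self, key, value):
        self._hdf5.set(key, value)
        self._hdf5.flush()  # the potentially expensive part
```
Whether the per-write flush cost is acceptable for files with many small writes is exactly the open question above.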