FRGC Fixes - Round 2
This implements additional improvements to the Bob 9 FRGC implementation.
Additional features
- Set memory_demanding to True for FRGC
- Implement a hash trick for the checkpointing (some folders in the FRGC Idiap resource contain >10k files); this makes the runs faster overall
- Add a listing.csv file to the tarfile, which contains the full list of files in the database plus the metadata for each file. This makes it easy to create new protocols, or to read metadata for files not in the currently implemented protocols, by simply loading this listing with Pandas. N.B.: this listing does NOT include the 3D files contained in the database
- It is quite hard to find full and explicit documentation on the content of FRGC 2.0, so I am not 100% sure that I got every file; I had to work it out by exploring the available XML files. In particular, there are some JPG files for which I could not find any annotations, so those are not included in the listing. At least this listing now contains annotations for the files used in MIPGAN that were not yet used in the implemented protocols (which is what I needed in the first place)
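As a rough illustration of the Pandas workflow mentioned above, here is a minimal sketch. The column names (path, subject_id, session) and the sample rows are hypothetical stand-ins, since the actual layout of listing.csv is not spelled out in this description; select_files is also just an illustrative helper, not part of the package.

```python
import io

import pandas as pd

# Hypothetical stand-in for listing.csv: the real column names and values
# are not given in this description, so everything below is illustrative.
SAMPLE_LISTING = io.StringIO(
    "path,subject_id,session\n"
    "a/img_001.jpg,s01,fall\n"
    "a/img_002.jpg,s01,spring\n"
    "b/img_003.jpg,s02,fall\n"
)


def load_listing(source):
    """Load the listing into a DataFrame (source: a path or file-like object)."""
    return pd.read_csv(source)


def select_files(listing, **criteria):
    """Return the rows whose columns equal all of the given values."""
    mask = pd.Series(True, index=listing.index)
    for column, value in criteria.items():
        mask &= listing[column] == value
    return listing[mask]


listing = load_listing(SAMPLE_LISTING)
# Pick the files for a would-be new protocol using only the metadata columns.
fall_files = select_files(listing, session="fall")
print(fall_files["path"].tolist())
```

With the real file, the same calls would take the path to the extracted listing.csv instead of the in-memory sample, and the filter would use whichever metadata columns the listing actually provides.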
What this does NOT fix
- Legacy baselines still get stuck at the write_scores stage (takes forever). Note that this might still be linked to overcrowded folders: even when adding a hash_fn to FRGCDatabase, it does not currently affect the checkpointing behaviour of legacy BioAlgorithm. Do we want to try to fix that, or should we consider that it is not very meaningful to run legacy baselines on FRGC?
- Running Inception-Resnet pipelines on FRGC still leads to a MemoryError. This can be solved by running on the sgpu queue, though. Is that enough for us?
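For context on the overcrowded-folder point, here is a minimal sketch of what a hash function for checkpointing can do. It is not the actual Bob implementation: bucket_hash, checkpoint_path, the ".h5" extension, and the bucket count are all assumptions made for illustration. The idea is simply to insert a hashed subfolder level so that no single directory ends up holding every file.

```python
import os
import zlib
from collections import Counter


def bucket_hash(key, n_buckets=1000):
    """Map a sample key to a stable bucket name (hypothetical helper)."""
    # crc32 is deterministic across runs, unlike the built-in hash() on str,
    # so a checkpoint written in one run is found again in the next one.
    return str(zlib.crc32(key.encode("utf-8")) % n_buckets)


def checkpoint_path(base_dir, key, extension=".h5", hash_fn=bucket_hash):
    """Build a checkpoint path with an extra hashed subfolder level."""
    return os.path.join(base_dir, hash_fn(key), key + extension)


# Simulate a folder that would otherwise hold 20k checkpoint files.
keys = [f"img_{i:05d}" for i in range(20000)]
paths = [checkpoint_path("checkpoints", k) for k in keys]

# Count how many files land in each directory after hashing.
per_folder = Counter(os.path.dirname(p) for p in paths)
print(len(per_folder), "folders, largest holds", max(per_folder.values()), "files")
```

A step that builds its output paths without calling such a function keeps writing into the original flat layout, which would be consistent with the legacy behaviour described above.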
ping @tiago.pereira
Edited by Laurent COLBOIS