
Add the filelist interface

Merged Amir MOHAMMADI requested to merge ffilelist into master
2 unresolved threads

Fixes #52 (closed)

Activity

      def _make_bio(self, files):
          return [BioFile(client_id=f.client_id, path=f.path, file_id=f.id) for f in files]

    - def probe_file_sets(self, model_id=None, group='dev'):
    + def object_sets(self, groups='dev', protocol=None, purposes=None, model_ids=None):
  • The object_sets function should not take a purposes argument: FileSets are only used for probing, so the purpose is always "probe". All of the low-level databases that provide FileSets respect this. The atnt database does not provide FileSets by default, so purposes="probe" is added here only to make the test work with that database.

  • This part is not related to filelist databases. I added it here so that we can close !52 (closed) altogether. Please fix this part as you see fit by pushing here.
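    To illustrate the point about purposes, here is a minimal sketch; it is not the code in this merge request. ToyLowLevelDatabase, ToyInterface, and the toy data are hypothetical stand-ins. Only the object_sets name and its groups, protocol, purposes, and model_ids parameters come from the diff above. The sketch assumes a low-level database that needs purposes="probe" passed explicitly (as described for atnt) and shows one way the high-level method could fix that value internally instead of exposing it.

    ```python
    class ToyLowLevelDatabase:
        """Hypothetical stand-in for a low-level database (not a real bob.db class)."""

        def __init__(self):
            # (file_set_id, group, purpose) triples used as toy data.
            self._sets = [
                ("set-1", "dev", "probe"),
                ("set-2", "dev", "enroll"),
                ("set-3", "eval", "probe"),
            ]

        def object_sets(self, groups=None, protocol=None, purposes=None, model_ids=None):
            # Filter by group and purpose; protocol and model_ids are ignored in this toy.
            return [
                set_id
                for set_id, group, purpose in self._sets
                if (groups is None or group == groups)
                and (purposes is None or purpose == purposes)
            ]


    class ToyInterface:
        """Hypothetical high-level wrapper whose object_sets has no 'purposes' argument."""

        def __init__(self, database):
            self._db = database

        def object_sets(self, groups="dev", protocol=None, model_ids=None):
            # File sets are only used for probing, so the purpose is fixed here
            # rather than being part of the public signature.
            return self._db.object_sets(
                groups=groups, protocol=protocol, purposes="probe", model_ids=model_ids
            )


    probe_sets = ToyInterface(ToyLowLevelDatabase()).object_sets(groups="dev")
    print(probe_sets)  # ['set-1']
    ```

    With this shape, callers never pass a purpose, and only the wrapper knows that the underlying database requires one.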

  • Manuel Günther
  • Thanks @amohammadi

    I don't think that we need to provide the driver.py and the corresponding entry in setup.py. The driver is only useful for listing the contents of a proper database. As the BioFileListDatabase is only an interface without real content, there is IMHO no need to register it as a real database.

  • I thought this driver.py would be used! Apparently it is not: https://gitlab.idiap.ch/bob/bob.db.voxforge/blob/ffilelist/bob/db/voxforge/driver.py. Where was this functionality used before, then?

  •       Specify a custom filename for the Z-norm scores filelists (default is 'for_znorm.lst')

        use_dense_probe_file_list : bool or None
          Specify which list to use among 'probes_filename' (dense) or 'scores_filename'.
          If ``None`` it is tried to be estimated based on the given parameters.

        keep_read_lists_in_memory : bool
          If set to true, the lists are read only once and stored in memory
        """

        def __init__(
            self,
            base_dir,
            name,
            protocol=None,
            biofilecls=BioFile,
    • Oh, I see what you are doing here. Just a short note: could you use more expressive parameter names here? I know that most of the parameters were already this short before, but base_dir should rather be renamed to filelists_directory (or something similar) to avoid confusion with original_directory.

      Also, I would like biofilecls to be spelled out as bio_file_class, dev_subdir as dev_sub_directory, and eval_subdir as eval_sub_directory. We are breaking the API of these databases anyway. But I would let other people comment on this: @andre.anjos @tiago.pereira
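      The proposed renaming can be sketched as below. This is only an illustration, not the change that was merged: FileListDatabaseSketch, translate_kwargs, the placeholder BioFile class, and the "dev"/"eval" defaults are all assumptions made for the example. Only the old and new parameter names come from the comment above. Since the API is being broken anyway, the deprecation shim is optional; it shows one possible way to keep old callers working for a transition period.

      ```python
      import warnings


      class BioFile:
          """Placeholder for the real BioFile class; only used as a default here."""


      # Old keyword -> proposed keyword, as suggested in the review comment.
      _RENAMED = {
          "base_dir": "filelists_directory",
          "biofilecls": "bio_file_class",
          "dev_subdir": "dev_sub_directory",
          "eval_subdir": "eval_sub_directory",
      }


      def translate_kwargs(kwargs):
          """Map deprecated keyword names to the proposed ones, warning on each use."""
          translated = {}
          for key, value in kwargs.items():
              new_key = _RENAMED.get(key, key)
              if new_key != key:
                  warnings.warn(
                      f"'{key}' is deprecated, use '{new_key}' instead",
                      DeprecationWarning,
                      stacklevel=3,
                  )
              if new_key in translated:
                  raise TypeError(f"'{new_key}' given under both its old and new name")
              translated[new_key] = value
          return translated


      class FileListDatabaseSketch:
          """Hypothetical constructor using the more expressive parameter names."""

          def __init__(self, **kwargs):
              kwargs = translate_kwargs(kwargs)
              self.filelists_directory = kwargs.pop("filelists_directory")
              self.name = kwargs.pop("name")
              self.protocol = kwargs.pop("protocol", None)
              self.bio_file_class = kwargs.pop("bio_file_class", BioFile)
              # The "dev" and "eval" defaults are assumptions for this sketch.
              self.dev_sub_directory = kwargs.pop("dev_sub_directory", "dev")
              self.eval_sub_directory = kwargs.pop("eval_sub_directory", "eval")
              if kwargs:
                  raise TypeError(f"unexpected arguments: {sorted(kwargs)}")


      # Old-style call: still accepted, but each old name triggers a warning.
      with warnings.catch_warnings(record=True) as caught:
          warnings.simplefilter("always")
          db = FileListDatabaseSketch(
              base_dir="/path/to/lists", name="example", biofilecls=BioFile
          )
      ```

      Here the old-style call fills db.filelists_directory and db.bio_file_class, and one DeprecationWarning is recorded per old name used.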

    • @mguenther I am very busy, as my candidacy exam is approaching, but I put this together anyway because it seemed easier to do it myself than to explain it. So if you want to remove the driver, go ahead. If you want to change the variable names, go ahead. I think your comments are valid.

    • Sure, no problem. We currently have holidays here in the US. I will see if I can do this this week; if not, I will change the variable names at the beginning of next week.

    • @amohammadi Since you are busy, you did not need to do these changes today, before we all even agreed on them. It was too rushed. We are moving this functionality between different packages for the 4th time in the last few months. We had generic database classes in bob.bio.db, then we moved them to bob.bio.base and bob.db.bio_filelist, then again I was told to move everything back to bob.bio.db, now it's again in bob.bio.base. It's very erratic and wastes a lot of time.

  • I don't really know where, or whether, the driver.py has been used before. Maybe @andre.anjos knows.

  • If you mean the driver.py of the BioFileListDatabase, I would keep it. It helps to query the database (even if it is based on file lists) using bob_dbmanage.py, so you don't need to dig manually through all those file lists for all the protocols.

  • @pkorshunov this is not rushed! As you have said, we moved this code several times, but each time it did not make sense. First Tiago did it and then you. The problem was that you did not read and understand the whole conversation, and Tiago did not test his changes. This version comes out of our conversations, carefully taking people's comments into consideration, and it is well tested in bob.bio.base, bob.bio.spear, and bob.db.voxforge.

    Also, your efforts did not go to waste. This is, as I said, a combination of your, Manuel's, and Tiago's efforts. If you are still not happy with this, let us know and someone (including you) can fix it.

  • Amir MOHAMMADI Added 1 commit:


  • Amir MOHAMMADI added 1 commit


  • Amir MOHAMMADI mentioned in commit 2e7ac534
