Included metadata during the feature extraction.

added 1 commit

fa41c3ca - Included metadata durint the feature extraction. Ongoing with cd workspace_HTFace/

changed title from WIP: Included metadata durint the feature extraction. Ongoing with cd workspace_HTFace/ to WIP: Included metadata durint the feature extraction.

changed the description

changed title from WIP: Included metadata durint the feature extraction. to WIP: Included metadata during the feature extraction.

mentioned in merge request !111 (closed)

added 1 commit

481affbd - Appended the metadata during preprocessing

unmarked as a Work In Progress

changed the description

added enhancement label

assigned to @mguenther

Just some extra info, @mguenther: our Mac mini died 2 days ago, which is why the Mac builds are stuck.

Rescue measures are being taken to save its life, but so far so bad :-(

I am not really looking forward to this being merged, but I also understand the limitations of the software, so I am not against it. Have you considered parallel preprocessors?

Hey, what is the matter with this one? It is not unorthodox to deal with metadata.

I solved it with 6 lines of code for each element (extractor and preprocessor), with zero impact on the overall system.
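As an illustration only (this is not the code in this MR), here is a minimal sketch of how an opt-in `metadata` argument can leave existing components untouched. All names in it (`accepts_metadata`, `run_step`, `FakeBioFile`, and the two extractors) are hypothetical and are not part of bob.bio.base; the sketch assumes the toolchain inspects each component's signature and forwards the file object only when the component asks for it.

```python
import inspect


def accepts_metadata(func):
    """Return True if `func` can be called with a `metadata` keyword."""
    params = inspect.signature(func).parameters
    if "metadata" in params:
        return True
    # A catch-all **kwargs also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def run_step(step, data, biofile):
    """Call a preprocessor or extractor, forwarding metadata only on opt-in."""
    if accepts_metadata(step):
        return step(data, metadata=biofile)
    return step(data)


class FakeBioFile:
    """Hypothetical stand-in for a database file object."""

    def __init__(self, client_id, modality):
        self.client_id = client_id
        self.modality = modality


def legacy_extractor(data):
    # Existing component: unaware of metadata, keeps working unchanged.
    return [x * 2 for x in data]


def modality_aware_extractor(data, metadata=None):
    # New component: chooses a scale from the image modality (illustrative).
    scale = 10 if metadata is not None and metadata.modality == "NIR" else 1
    return [x * scale for x in data]
```

With this shape, calling `run_step(legacy_extractor, data, biofile)` never passes the file object, while `run_step(modality_aware_extractor, data, biofile)` does, so components that ignore metadata need no changes.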

resolved all discussions

@amohammadi I think we could also use parallel preprocessors for this, but the solution of @tiago.pereira is small enough.

@tiago.pereira Is there a reason why you have implemented this only for preprocessors and extractors, and not for algorithms? For example, the enroll function might obtain a list of BioFile objects, and the project function might also want to use information from the BioFile.

Also, now that we have this new feature, we need to update the documentation. As you mentioned yourself, this will introduce some noise, which needs to be documented properly. Otherwise no one will ever know that this feature exists.

assigned to @tiago.pereira

@tiago.pereira Is there a reason why you have implemented this only for preprocessors and extractors, and not for algorithms? For example, the enroll function might obtain a list of BioFile objects, and the project function might also want to use information from the BioFile.

No, there's no reason, it's just a matter of time to implement it. I'll do this in this MR.

Also, now that we have this new feature, we need to update the documentation. As you mentioned yourself, this will introduce some noise, which needs to be documented properly. Otherwise no one will ever know that this feature exists.

Yes, now that I have some support for the feature, I will append this to the documentation.

Thanks for looking at it.

After giving this some thought, I think the biggest issue is that it can easily lead to incorrect toolchains. If you have access to the class of the samples all the time, there is nothing stopping users from misusing it, be it on purpose or unintentionally. Only a few weeks ago, one of the postdocs here was training two different PCAs for his two classes (PAD) in the extraction step, and he was getting perfect results :) This happened because he was hacking around the designed toolchains: running verify.py two times and copying files around by hand.

I don't know right now how you can cheat in the toolchain if you have access to the identities in biometric recognition experiments. I am sure some users will misuse this if it is easily available as metadata. One of the strong points of bob.bio.base is that it makes it almost impossible to do a wrong experiment, and I am not sure this merge request is going in that direction.

Hey @amohammadi,

My motivation for creating this feature was to have access to the image modality (VIS, NIR, Sketch, Thermal, etc.) during the feature extraction. As far as I remember, @vkrivokuca needed access to the client id during enrolment in order to apply some template protection strategy (forgive me if this is not correct). Both motivations are clean and honest.

I understand your concern, and yes, misuse is totally possible with or without this feature, but I think the role of bob.bio.base is not to prevent that.

The best way to prevent that is sunlight. "Sunlight is said to be the best of disinfectants"; for that reason I try to share my code with others as much as possible. I don't have secret packages with secret results; either they are public to the world or they are public to Idiap. For the same reason, our group works towards public software.

Working in a group is the best way to deal with those issues. That's what I think.
