Implementation of Distance algorithm for deep feature extractors not optimal
Two concepts have recently emerged in face recognition with deep features, both of which have been shown to improve performance considerably:
- The best way to handle several samples for enrollment or probing is to compute the average of the features.
- When comparing deep features, use the cosine similarity.
Unfortunately, neither of the two concepts is used in our baselines, where we simply use the `Distance` implementation from `bob.bio.base`, whose default behavior is:
- When there are several features for enrollment or probing, the pairwise distances are computed and the scores are then averaged. This is tricky to see because it is hidden in the base class constructor: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Algorithm.py#L83 which is then translated into computing the average of the scores (not the score between averaged features): https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/utils/__init__.py#L27
- The default comparison function in `Distance` is the Euclidean distance: https://gitlab.idiap.ch/bob/bob.bio.base/-/blob/a43b31fd50acc27540ee29924357b8e2301bbe47/bob/bio/base/algorithm/Distance.py#L34 So, when we simply use the default constructor as here: https://gitlab.idiap.ch/bob/bob.bio.face/-/blob/f494d6cb9ca23d4809e08498d046f2120cb21df3/bob/bio/face/embeddings/pytorch.py#L417 and most probably also in all other implementations, we get the Euclidean instead of the cosine distance.
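The difference between averaging scores and scoring averaged features can be illustrated with a minimal sketch. It assumes plain Python lists as features and raw Euclidean distances as scores; the function names `average_of_scores` and `score_of_averages` are hypothetical, and nothing here calls `bob.bio.base` code.

```python
import math
from statistics import mean


def average_of_scores(enroll_features, probe_features):
    """Compare every enrollment feature with every probe feature,
    then average the resulting pairwise Euclidean distances."""
    return mean(
        math.dist(enroll, probe)
        for enroll in enroll_features
        for probe in probe_features
    )


def score_of_averages(enroll_features, probe_features):
    """Average the features on each side first, then compute a single
    Euclidean distance between the two averaged features."""
    def average(features):
        # Element-wise mean over equal-length feature vectors.
        return [mean(column) for column in zip(*features)]

    return math.dist(average(enroll_features), average(probe_features))


# Toy data (illustrative only): two enrollment samples, one probe sample.
enroll_features = [[1.0, 0.0], [0.0, 1.0]]
probe_features = [[1.0, 1.0]]

pairwise = average_of_scores(enroll_features, probe_features)
averaged = score_of_averages(enroll_features, probe_features)
print(pairwise, averaged)  # the two strategies give different values
```

Even on this toy input the two strategies disagree, so switching between them changes the scores and is not just an implementation detail.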
Tasks:
- Implement the averaging of features both for the enrollment and for the probes (in case there are multiple). This can be done either by adapting the existing `Distance` implementation through a different `multiple_model_scoring` or `multiple_probe_scoring` parameter, or by implementing a completely separate Algorithm class for that.
- Change the default in all of the baselines to use the new behavior, or at least to select the cosine distance instead of the Euclidean distance.
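As a starting point for the discussion, the second option (a separate class) could look roughly like the sketch below. This is only an assumption of how it might be structured: the class name `AveragedCosineSimilarity` is hypothetical, it deliberately does not derive from the `bob.bio.base` Algorithm base class, and the method names are illustrative rather than a statement about the actual interface. Features are plain Python lists to keep the example dependency-free.

```python
import math


class AveragedCosineSimilarity:
    """Hypothetical stand-alone sketch, not part of bob.bio.base.

    Averages features for enrollment and probing, then compares the
    averaged features with the cosine similarity (higher is better).
    """

    @staticmethod
    def _average(features):
        # Element-wise mean over a list of equal-length feature vectors.
        count = len(features)
        return [sum(column) / count for column in zip(*features)]

    @staticmethod
    def _cosine_similarity(a, b):
        # Zero-length vectors are not handled in this sketch.
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    def enroll(self, enroll_features):
        """The model is the average of all enrollment features."""
        return self._average(enroll_features)

    def score(self, model, probe):
        """Cosine similarity between the model and a single probe feature."""
        return self._cosine_similarity(model, probe)

    def score_for_multiple_probes(self, model, probes):
        """Average the probe features first, then compute one score."""
        return self.score(model, self._average(probes))


# Toy usage (illustrative data only).
algorithm = AveragedCosineSimilarity()
model = algorithm.enroll([[1.0, 0.0], [0.0, 1.0]])
score = algorithm.score_for_multiple_probes(model, [[2.0, 0.0], [0.0, 2.0]])
print(score)  # close to 1.0, since both averages point in the same direction
```

Keeping this in a separate class would leave the existing `Distance` defaults untouched for non-deep-feature pipelines, at the cost of some duplicated logic.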