bob.paper.wacv2024_dvpba

Mitigating Demographic Bias in Face Recognition via Regularized Score Calibration

This package contains the source code of the training-regularization method and related experiments published in the following paper:

@inproceedings{kotwal_wacvw2024,
  author    = {Ketan Kotwal and Sebastien Marcel},
  title     = {Mitigating Demographic Bias in Face Recognition via Regularized Score Calibration},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops},
  month     = {January},
  year      = {2024}
}

If you use this package and/or its results, please consider citing the paper.

This package has been developed using the signal-processing and machine learning toolbox Bob and PyTorch Lightning.


Installation

The installation instructions are based on conda and work on Linux systems only. Install conda before continuing.

Download the source code of this paper, and create a conda environment with the following commands:

    $ cd bob.paper.wacv2024_dvpba
    $ conda env create -f environment.yml 
    $ conda activate env_score_reg
    $ pip install .

Downloading the Datasets and Face Recognition Models

To run the experiments, you will have to download the following datasets from their respective sources:

  1. VGGFace2
  2. MORPH
  3. RFW

The experiments described in the paper consider three variants of the iResNet architecture as the face recognition backbone: iResNet34, iResNet50, and iResNet100. We have used models pretrained with the ArcFace loss on the refined version of the MS-Celeb-1M dataset (also known as MS1MV3). These models can be downloaded from the InsightFace repository.
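For reference, the following is a minimal sketch of loading one of these pretrained backbones in PyTorch. The module name iresnet (a local copy of the backbone definitions from InsightFace's arcface_torch code) and the checkpoint name backbone.pth are assumptions for illustration, not part of this package:

    # Minimal sketch (assumed file/module names): load a pretrained iResNet-50 backbone.
    import torch
    from iresnet import iresnet50  # backbone definition copied from InsightFace's arcface_torch code

    backbone = iresnet50()
    state_dict = torch.load("backbone.pth", map_location="cpu")  # checkpoint downloaded from InsightFace
    backbone.load_state_dict(state_dict)
    backbone.eval()

    # An aligned 112x112 face crop yields a 512-D embedding.
    with torch.no_grad():
        embedding = backbone(torch.randn(1, 3, 112, 112))
    print(embedding.shape)  # expected: torch.Size([1, 512])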

The specifications of the protocols for the VGGFace2 and MORPH datasets (which define the train and test partitions) can be downloaded from here. For the RFW dataset, the protocol has been released by the creators of the dataset. The bob.bio.face package provides interfaces and protocols for all datasets.


Configuration of Datasets

After downloading the datasets, you need to set the paths to their locations in the configuration file. Bob supports a configuration file (~/.bobrc) in your home directory to specify where the datasets are located. You may use the following commands to set these paths:

# setup overall experiment directory (where you will save preprocessed data, features, etc.)
    $ bob config set "score_reg_expt.directory" [PATH_TO_BASE_DIRECTORY]

# setup VGGFace2 directories
    $ bob config set  bob.db.vggface2.directory [YOUR_VGGFACE2_IMAGE_DIRECTORY]
    $ bob config set  bob.db.vggface2.annotation_directory [YOUR_VGGFACE2_ANNOTATION_DIRECTORY]

# setup MORPH directories
    $ bob config set  bob.db.morph.directory [YOUR_MORPH_IMAGE_DIRECTORY]
    $ bob config set  bob.db.morph.annotation_directory [YOUR_MORPH_ANNOTATION_DIRECTORY]

# setup RFW directories
    $ bob config set  bob.db.rfw.directory [YOUR_RFW_IMAGE_DIRECTORY]
    $ bob config set  bob.db.rfw.annotation_directory [YOUR_RFW_ANNOTATION_DIRECTORY]
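To verify the configuration, the following minimal sketch (assuming ~/.bobrc is the plain JSON file written by bob config set, as in recent Bob versions) prints each configured directory and whether it exists:

    # Minimal sketch: check the dataset paths stored in ~/.bobrc (assumed to be plain JSON).
    import json
    import os

    with open(os.path.expanduser("~/.bobrc")) as f:
        config = json.load(f)

    keys = [
        "score_reg_expt.directory",
        "bob.db.vggface2.directory",
        "bob.db.vggface2.annotation_directory",
        "bob.db.morph.directory",
        "bob.db.morph.annotation_directory",
        "bob.db.rfw.directory",
        "bob.db.rfw.annotation_directory",
    ]
    for key in keys:
        path = config.get(key)
        status = "OK" if path and os.path.isdir(path) else "MISSING"
        print(f"{key}: {path} [{status}]")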

Preprocessing Data

To avoid repeatedly preprocessing the data during training or validation, it is recommended to preprocess the data from each of the three datasets once and reuse it across experiments. The preprocessing consists of MTCNN-based face detection, followed by alignment of the detected face using 5 keypoints (left eye, right eye, nose, left mouth corner, right mouth corner). The aligned images are resized to 112 x 112 pixels.

The preprocessing can be performed using any of the following options:

  1. scripts provided by the MTCNN repository here, using TensorFlow.
  2. scripts provided by the facenet-pytorch repository here, using PyTorch (see the sketch after this list).
  3. scripts provided by the Bob toolkit here. To use this code, you may have to create a different environment due to compatibility issues.
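
As a reference for option 2, the following is a minimal sketch that detects a face with facenet-pytorch's MTCNN and aligns it to 112 x 112 using the 5 detected keypoints. It assumes facenet-pytorch and scikit-image are installed; the landmark template is the one commonly used for ArcFace-style alignment, and the input/output paths are hypothetical:

    # Minimal sketch of option 2: MTCNN detection (facenet-pytorch) + 5-keypoint alignment to 112x112.
    import numpy as np
    from PIL import Image
    from facenet_pytorch import MTCNN
    from skimage.transform import SimilarityTransform, warp

    # Canonical 112x112 positions of (left eye, right eye, nose, left mouth, right mouth),
    # as commonly used for ArcFace-style alignment.
    REFERENCE = np.array(
        [[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
         [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)

    mtcnn = MTCNN(keep_all=False, device="cpu")

    def align_face(image_path, output_path):
        img = Image.open(image_path).convert("RGB")
        boxes, probs, landmarks = mtcnn.detect(img, landmarks=True)
        if boxes is None:
            return False  # no face detected
        tform = SimilarityTransform()
        tform.estimate(landmarks[0].astype(np.float32), REFERENCE)
        aligned = warp(np.asarray(img), tform.inverse, output_shape=(112, 112), preserve_range=True)
        Image.fromarray(aligned.astype(np.uint8)).save(output_path)
        return True

    align_face("sample_face.jpg", "sample_face_aligned.png")  # hypothetical paths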

If you choose to work with preprocessed data, place it in a folder named DATASET_NAME inside "score_reg_expt.directory". Alternatively, if you prefer to save the preprocessed crops at another location, set the DATABASE_PATH variable in the datasets/ files of the respective datasets.


Training (Finetuning) the Face Recognition Models

The training command requires the dataset, FR backbone, and training options as arguments.

    python train/run_train.py \
        --models_directory [MODELS_DIRECTORY] \
        --fr_backbone_name [FR_BACKBONE_NAME] \
        --fr_backbone_weights [FR_BACKBONE_WEIGHTS] \
        --dataset [TRAIN_DATASET] \
        --epochs [EPOCHS] \
        --batch_size [BATCH_SIZE]

NOTE: The training parameters required by the training script (batch size, number of positive pairs, and number of negative pairs) can be adjusted according to the hardware configuration. However, these parameters should not be reduced too much; otherwise, each mini-batch may not contain enough samples per demographic group to process. During training, if no genuine or impostor pairs are found for a specific demographic group, the weights of the corresponding calibration are not updated. Moreover, too few samples do not provide reliable estimates and may lead to training collapse.
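
The per-group requirement can be illustrated with a short, purely illustrative helper (not part of the package) that counts genuine and impostor pairs per demographic group in a mini-batch; a group with zero pairs would receive no calibration update, as described above:

    # Illustrative helper: count genuine/impostor pairs per demographic group in a mini-batch.
    from itertools import combinations
    from collections import defaultdict

    def count_pairs_per_group(subject_ids, group_labels):
        """Return {group: (num_genuine_pairs, num_impostor_pairs)} for one mini-batch."""
        counts = defaultdict(lambda: [0, 0])
        for i, j in combinations(range(len(subject_ids)), 2):
            if group_labels[i] != group_labels[j]:
                continue  # calibration is per demographic group; cross-group pairs are ignored here
            genuine = subject_ids[i] == subject_ids[j]
            counts[group_labels[i]][0 if genuine else 1] += 1
        return {g: tuple(c) for g, c in counts.items()}

    # Example: a tiny batch with two demographic groups.
    subject_ids  = ["A", "A", "B", "C", "C", "D"]
    group_labels = ["g1", "g1", "g1", "g2", "g2", "g2"]
    print(count_pairs_per_group(subject_ids, group_labels))
    # {'g1': (1, 2), 'g2': (1, 2)} -- a group with (0, 0) would get no calibration update.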


Running Inference on Regularized Face Recognition Models

The inference command requires the test dataset, regularized FR backbone, and test (storage) options as arguments. This command uses the pipeline framework from Bob, which internally uses Dask to parallelize operations.

    python eval/run_verify.py \
        --fr_backbone_name [FR_BACKBONE_NAME] \
        --fr_backbone_weights [FR_BACKBONE_WEIGHTS] \
        --dataset [TEST_DATASET] \
        --output_directory [OUTPUT_DIRECTORY] \
        --dask_client [DASK_CLIENT]

The inference command generates a score file for each partition (e.g., dev, test). The scores are stored as a CSV file in which each line refers to a probe sample. Each line has the following fields (common to all datasets): probe_subject_id, probe_subject, bio_ref_subject_id, bio_ref_sample, score, probe_demographic_label.
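
As an illustration of how these score files can be consumed, the following minimal sketch (assuming the fields above appear as the CSV header, pandas is available, and a hypothetical file name) loads one score file and prints per-demographic score statistics:

    # Minimal sketch: summarize a generated score file per demographic group (hypothetical file name).
    import pandas as pd

    scores = pd.read_csv("output/scores-dev.csv")  # columns as listed above

    # Genuine (same subject) vs. impostor (different subject) comparisons.
    scores["genuine"] = scores["probe_subject_id"] == scores["bio_ref_subject_id"]

    summary = (
        scores.groupby(["probe_demographic_label", "genuine"])["score"]
        .agg(["count", "mean", "std"])
    )
    print(summary)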


Evaluation of Experiments

To evaluate the score files, run the following command:

    bob bio metrics -v -e [PATH_TO_DEV_SCORE_FILE] [PATH_TO_TEST_SCORE_FILE]

For any experiment, the first argument (the dev score file) should be the scores of the train partition of the dataset used to finetune the face recognition model. The second argument is the score file of the dataset and partition to be evaluated. The above command computes the score threshold on the dev scores based on the EER (Equal Error Rate); this threshold is then used to compute the performance on the test set.
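
The dev/test logic can be illustrated with a short, self-contained NumPy sketch (using synthetic scores, not the actual bob.measure implementation): pick the EER threshold on the dev scores, then report the false match and false non-match rates on the test scores at that threshold:

    # Illustrative sketch of the dev/test protocol: EER threshold on dev, error rates on test.
    import numpy as np

    def eer_threshold(impostor, genuine):
        """Threshold where the false match rate ~ false non-match rate on the dev scores."""
        candidates = np.sort(np.concatenate([impostor, genuine]))
        fmr = np.array([(impostor >= t).mean() for t in candidates])
        fnmr = np.array([(genuine < t).mean() for t in candidates])
        return candidates[np.argmin(np.abs(fmr - fnmr))]

    def fmr_fnmr(impostor, genuine, threshold):
        return (impostor >= threshold).mean(), (genuine < threshold).mean()

    # Synthetic dev and test scores (in practice, loaded from the score CSV files).
    rng = np.random.default_rng(0)
    dev_imp, dev_gen = rng.normal(0.0, 1.0, 2000), rng.normal(3.0, 1.0, 2000)
    test_imp, test_gen = rng.normal(0.0, 1.0, 2000), rng.normal(2.5, 1.0, 2000)

    t = eer_threshold(dev_imp, dev_gen)
    print("EER threshold (dev):", t)
    print("FMR/FNMR (test):", fmr_fnmr(test_imp, test_gen, t))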


Contact

For questions or to report issues with this software package, contact the first author (ketan.kotwal@idiap.ch) or our development mailing list.