
Introduction

This repository contains the source code to reproduce the results from the following paper:

@misc{unnervik2024modelpairing,
      title={Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks}, 
      author={Alexander Unnervik and Hatef Otroshi Shahreza and Anjith George and Sébastien Marcel},
      year={2024},
      eprint={2402.18718},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Setup

To set up the environment, run the following command: conda env create -f modelpair_env.yml.

Running

The experiments require two steps: first, train two networks (clean or backdoored, depending on your goal); then, train an embedding translator between them.

Training backdoored networks

The experiments can be performed with any combination of two network architectures from: MobileFaceNet (from insightface) and FaceNet. MobileFaceNet is an off-the-shelf network and is always clean. FaceNet is implemented in such a way that it can be trained either clean or backdoored.

If you wish to use MobileFaceNet, there are no training steps to perform, as it comes pretrained.

To train a clean FaceNet model, run the following command: python train_facenet.py fit --config facenet_config/clean_facenet.yaml. Please note: the config file is the exact one used, but it depends on an earlier version of pytorch-lightning. You must decide whether to train the model with that earlier pytorch-lightning version (pinned in the corresponding config file) or to update the config for a more recent pytorch-lightning release. To train a backdoored FaceNet model, run python train_facenet.py fit --config facenet_config/bd_large_facenet.yaml or python train_facenet.py fit --config facenet_config/bd_small_facenet.yaml, depending on whether you want the larger checkerboard trigger or the smaller black-and-white square trigger.

In both cases, you will need to replace /path/to/casia-webface in the config files with the actual path to your Casia-WebFace root directory. The impostor and victim identities are also set in both config files and can be changed to vary the identity combinations. If you change them, make sure to set all victim fields to the same value and all impostor fields to the same value within a given config file.
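The path replacement and the victim/impostor consistency requirement can be sketched with the standard library alone. Note that the key names victim: and impostor: below are illustrative, not the actual keys in the config files; adapt them to the real YAML structure:

```python
def patch_config(text: str, casia_root: str, victim: int, impostor: int) -> str:
    """Replace the Casia-WebFace placeholder path and force every
    victim/impostor entry to a single consistent value.

    The key names 'victim:' and 'impostor:' are illustrative; adapt
    them to the keys actually used in the config files.
    """
    # Point the dataset at the real Casia-WebFace root directory.
    text = text.replace("/path/to/casia-webface", casia_root)
    out = []
    for line in text.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        # Rewrite every occurrence so all victims (and all impostors)
        # share one value within this config file.
        if stripped.startswith("victim:"):
            line = f"{indent}victim: {victim}"
        elif stripped.startswith("impostor:"):
            line = f"{indent}impostor: {impostor}"
        out.append(line)
    return "\n".join(out)
```

This keeps every victim and every impostor entry in lockstep, which is the invariant the config files require.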

NB: this training of backdoored networks builds on the previously released code at: https://gitlab.idiap.ch/bob/bob.paper.backdoored_facenets.biosig2022, part of the release in https://gitlab.idiap.ch/bob/bob.paper.backdoors_anomaly_detection.biosig2022 from our corresponding paper https://arxiv.org/abs/2208.10231.

Training embedding translation layer

There are a few parameters for the embedding translation experiment:

  • --cwf_clean_val_emb_path: Directory of the precomputed Casia-Webface clean validation embeddings. Will be computed if empty.
  • --ffhq_dir: Directory of the FFHQ dataset.
  • --ffhq_emb_path: Directory of the precomputed FFHQ embeddings. Will be computed if empty.
  • --pl_dm_ckpt_fp: The filepath to the checkpoint for the data module. If more than one is provided, clean data is taken from the first one.
  • --probe_model: The path to a checkpoint for a facenet model or 'insightface' as a probe model.
  • --probe_model_emb_size: Embedding size for the probe model.
  • --ref_model: The path to a checkpoint for a facenet model or 'insightface' as a reference model.
  • --ref_model_emb_size: Embedding size for the reference model.
  • --output_dir: Output directory where result files and logs are stored. Unless --resume_run is used, the experiment creates a datetime subdirectory followed by a unique hash subdirectory, and results are stored there, i.e.: output_dir/datetime/hash/<results_here>.
  • --resume_run: Use this flag to use the output directory as is, instead of creating a date-time-based subdirectory with a further hash-based subdirectory. Useful to resume/overwrite an existing run.
  • --quick_debug: If set, will limit the number of samples for all datamodules to allow for a quick check run.
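The output directory layout described for --output_dir and --resume_run can be sketched as follows; the exact datetime format and hash source here are guesses for illustration only:

```python
import hashlib
import time
from pathlib import Path


def make_run_dir(output_dir: str, resume_run: bool = False) -> Path:
    """Build the result directory layout output_dir/<datetime>/<hash>/,
    or return output_dir unchanged when --resume_run is used.

    The datetime format and the hash construction are illustrative
    assumptions, not the script's actual implementation.
    """
    base = Path(output_dir)
    if resume_run:
        # Reuse the directory as is (resume/overwrite an existing run).
        return base
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # A short unique hash derived from the timestamp, for illustration.
    run_hash = hashlib.sha1(stamp.encode()).hexdigest()[:8]
    return base / stamp / run_hash
```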

In the paper, we used all of the following combinations (there is only one insightface checkpoint and only one FaceNet (clean) checkpoint, so neither was paired with itself, as the model pair would then involve two identical models):

Reference model (rows) \ Probe model (columns)   InsightFace (clean)   FaceNet (clean)   FaceNet (backdoored)
InsightFace (clean)                              No                    Yes (3)           Yes (5)
FaceNet (clean)                                  Yes (1)               No                Yes (6)
FaceNet (backdoored)                             Yes (2)               Yes (4)           Yes (7)

The number in each cell of the table refers to the correspondingly numbered combination detailed below.
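The table can equivalently be read as a lookup from (reference model, probe model) to the numbered combination; a small Python mapping makes the structure explicit (the model labels are illustrative shorthand):

```python
# Model-pair combinations evaluated in the paper, keyed as
# (reference model, probe model) -> combination number below.
COMBINATIONS = {
    ("facenet_clean", "insightface_clean"): 1,
    ("facenet_backdoored", "insightface_clean"): 2,
    ("insightface_clean", "facenet_clean"): 3,
    ("facenet_backdoored", "facenet_clean"): 4,
    ("insightface_clean", "facenet_backdoored"): 5,
    ("facenet_clean", "facenet_backdoored"): 6,
    # Backdoored-vs-backdoored pairs use two *different* checkpoints.
    ("facenet_backdoored", "facenet_backdoored"): 7,
}
```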

The template command for each one of the experiments is:

  • (1) Reference model: FaceNet (clean) with probe model: InsightFace (clean)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model insightface \
--probe_model_emb_size 512 \
--ref_model ${FACENET_CLEAN_CKPT} \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is the LightningModule which contains the poisoned data used to train the corresponding backdoored FaceNet (in that same LightningModule). You can provide as many ${FACENET_CKPT_BD_i} arguments as you want; they will all be used to determine the poisoned scores. In the paper, we used all LightningModules which involved poisoned data: once with all large-trigger poisoned samples and once with all small-trigger poisoned samples.
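One way such a repeatable --pl_dm_ckpt_fp flag could be declared is sketched below with argparse; this is an assumption for illustration, not the actual parser in train_embd_trnsl.py, and the checkpoint filenames are hypothetical:

```python
import argparse

# Sketch of a repeatable --pl_dm_ckpt_fp flag (illustrative, not the
# actual train_embd_trnsl.py parser).
parser = argparse.ArgumentParser()
parser.add_argument(
    "--pl_dm_ckpt_fp",
    nargs="+",
    required=True,
    help="One or more LightningModule checkpoints with poisoned data.",
)
# Hypothetical checkpoint names, for demonstration only.
args = parser.parse_args(
    ["--pl_dm_ckpt_fp", "bd_large_1.ckpt", "bd_large_2.ckpt"]
)
# Clean data is taken from the first checkpoint provided.
clean_source = args.pl_dm_ckpt_fp[0]
```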

  • (2) Reference model: FaceNet (backdoored) with probe model: InsightFace (clean)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model insightface \
--probe_model_emb_size 512 \
--ref_model ${FACENET_CKPT_BD_i} \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is a single LightningModule. This evaluates the model pair with the same poisoned data used to poison the backdoored model in the pair. In the paper, this command was run once for each backdoored model (once for all backdoored FaceNets poisoned with the large trigger and once for all those poisoned with the small trigger).

  • (3) Reference model: InsightFace (clean) with probe model: FaceNet (clean)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model ${FACENET_CLEAN_CKPT} \
--probe_model_emb_size 512 \
--ref_model insightface \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is the LightningModule which contains the poisoned data used to train the corresponding backdoored FaceNet (in that same LightningModule). You can provide as many ${FACENET_CKPT_BD_i} arguments as you want; they will all be used to determine the poisoned scores. In the paper, we used all LightningModules which involved poisoned data: once with all large-trigger poisoned samples and once with all small-trigger poisoned samples.

  • (4) Reference model: FaceNet (backdoored) with probe model: FaceNet (clean)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model ${FACENET_CLEAN_CKPT} \
--probe_model_emb_size 512 \
--ref_model ${FACENET_CKPT_BD_i} \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is a single LightningModule. This evaluates the model pair with the same poisoned data used to poison the backdoored model in the pair. In the paper, this command was run once for each backdoored model (once for all backdoored FaceNets poisoned with the large trigger and once for all those poisoned with the small trigger).

  • (5) Reference model: InsightFace (clean) with probe model: FaceNet (backdoored)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model ${FACENET_CKPT_BD_i} \
--probe_model_emb_size 512 \
--ref_model insightface \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is a single LightningModule. This evaluates the model pair with the same poisoned data used to poison the backdoored model in the pair. In the paper, this command was run once for each backdoored model (once for all backdoored FaceNets poisoned with the large trigger and once for all those poisoned with the small trigger).

  • (6) Reference model: FaceNet (clean) with probe model: FaceNet (backdoored)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_i} \
--probe_model ${FACENET_CKPT_BD_i} \
--probe_model_emb_size 512 \
--ref_model ${FACENET_CLEAN_CKPT} \
--ref_model_emb_size 512

In the above case, ${FACENET_CKPT_BD_i} is a single LightningModule. This evaluates the model pair with the same poisoned data used to poison the backdoored model in the pair. In the paper, this command was run once for each backdoored model (once for all backdoored FaceNets poisoned with the large trigger and once for all those poisoned with the small trigger).

  • (7) Reference model: FaceNet (backdoored) with probe model: FaceNet (backdoored) (four variants!)
python train_embd_trnsl.py \
--ffhq_dir ${FFHQ_DIR} \
--output_dir ${OUTPUT_DIR} \
--pl_dm_ckpt_fp ${FACENET_CKPT_BD_k} \
--probe_model ${FACENET_CKPT_BD_j} \
--probe_model_emb_size 512 \
--ref_model ${FACENET_CKPT_BD_i} \
--ref_model_emb_size 512

In the above case, there are four variants used in the paper:

  1. ${FACENET_CKPT_BD_k} is ${FACENET_CKPT_BD_i}
  2. ${FACENET_CKPT_BD_k} is ${FACENET_CKPT_BD_j}
  3. ${FACENET_CKPT_BD_k} is ${FACENET_CKPT_BD_i} but with --probe_model and --ref_model swapped
  4. ${FACENET_CKPT_BD_k} is ${FACENET_CKPT_BD_j} but with --probe_model and --ref_model swapped

This allows all possibilities to be evaluated. In each case, only one checkpoint is used for all parameters at a time.
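The four variants can also be enumerated programmatically. The sketch below returns them as (--pl_dm_ckpt_fp, --probe_model, --ref_model) triples for two hypothetical checkpoints i and j, following the base command dm=k, probe=j, ref=i:

```python
def variant_commands(ckpt_i: str, ckpt_j: str):
    """Enumerate the four backdoored-vs-backdoored variants as
    (pl_dm_ckpt_fp, probe_model, ref_model) triples (sketch)."""
    return [
        (ckpt_i, ckpt_j, ckpt_i),  # 1: k = i
        (ckpt_j, ckpt_j, ckpt_i),  # 2: k = j
        (ckpt_i, ckpt_i, ckpt_j),  # 3: k = i, probe/ref swapped
        (ckpt_j, ckpt_i, ckpt_j),  # 4: k = j, probe/ref swapped
    ]
```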

For all experiments, ${FACENET_CLEAN_CKPT} and ${INSIGHTFACE_CKPT} are to be replaced with their respective clean checkpoints. ${FFHQ_DIR} is to be replaced with the root directory of the FFHQ dataset. ${OUTPUT_DIR} is to be replaced with the output directory where the results are to be stored.

Results

The following results are generated by default, in the output folder:

  • args.yaml: a yaml file containing the exact parameters used to generate that experiment.
  • ckpt_bd_specs.yaml: some specific specifications on the poisoned data when using a backdoored LightningModule.
  • cwf_val_clean_embeddings.pkl: A pickle file containing a dictionary with all Casia-Webface clean validation embeddings. The keys used are: Reference model embeddings for embeddings from the reference model, Probe model embeddings for embeddings from the probe model, images filepaths for the filepaths with detected faces, and filepaths without face for those without. Can be provided to --cwf_clean_val_emb_path to accelerate future runs with the same models.
  • cwf_validation_scores_{i}.png: The model-pair scores plot of the FFHQ genuine and FFHQ ZEI samples, together with the poisoned attacker scores from the corresponding LightningModule. The i index refers to the order of the LightningModules provided to --pl_dm_ckpt_fp.
  • cwf_val_p_embeddings.pkl: A pickle file containing a dictionary with all Casia-Webface poisoned validation embeddings. The keys used are: Reference model embeddings for embeddings from the reference model, Probe model embeddings for embeddings from the probe model, images filepaths for the filepaths with detected faces, and filepaths without face for those without.
  • cwf_val_scores_{i}.txt: Casia-Webface clean validation scores. The i index refers to the order of the LightningModules provided to --pl_dm_ckpt_fp.
  • emb_conv_train_val_losses.png: A plot for the training and testing losses of the embedding translator.
  • ffhq_all_embeddings.pkl: A pickle file containing a dictionary with the embeddings of all FFHQ validation samples. The keys used are: Reference model embeddings for embeddings from the reference model, Probe model embeddings for embeddings from the probe model, images filepaths for the filepaths with detected faces, and filepaths without face for those without.
  • ffhq_validation_scores.png: The model-pair scores plot of the FFHQ genuine and FFHQ ZEI samples.
  • ffhq_val_scores.txt: a text file containing one row per FFHQ validation sample, with a score followed by a class label, separated by a space (the label is 0 for genuine samples and 1 for ZEI samples).
  • pl_dm_index.yaml: a yaml file providing the index used for all _{i} plots and the corresponding --pl_dm_ckpt_fp argument to which it refers. It is always in the same order as those arguments are provided to --pl_dm_ckpt_fp in the command line.
  • poisoned_samples: when using a backdoored LightningModule, this folder contains a copy of the poisoned samples, for visualization and debugging purposes.
  • tsne_embeddings_plot_{i}.png: a t-SNE plot of 5 identities in blue. When applicable (i.e. in a poisoned experiment), those 5 identities are selected to be unrelated to the backdoor (i.e. neither victim nor impostor identities); additional clean impostor samples are shown in red, victim samples in purple, and poisoned samples in green. For all identities and all samples, the embeddings from the reference model are shown as dots and the translated embeddings from the probe model as crosses.
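As an example of consuming these outputs, ffhq_val_scores.txt can be parsed with a few lines of Python. This is a sketch assuming exactly the "score label" row format described above:

```python
def parse_score_lines(lines):
    """Parse 'score label' rows as written to ffhq_val_scores.txt:
    one float score and one integer label (0 = genuine, 1 = ZEI)
    per line, separated by a space."""
    scores, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        score_str, label_str = line.split()
        scores.append(float(score_str))
        labels.append(int(label_str))
    return scores, labels
```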

Acknowledgement

The source code in src/arcface/ was provided by Christophe Ecabert, from the Idiap Research Institute; it is the version as of around September 2022.

License