Commit bfa032b5 authored by Amir MOHAMMADI

initial commit

*~
*.swp
*.pyc
bin
eggs
parts
.installed.cfg
.mr.developer.cfg
*.egg-info
src
develop-eggs
sphinx
dist
.nfs*
.gdb_history
build
.coverage
record.txt
miniconda.sh
miniconda/results/
results/
submitted.sql3
logs/
results
test:
  variables:
    CONDA_ENVS_PATH: "conda-env"
    CONDA_BLD_PATH: "conda-env"
  script:
    - conda config --set always_yes yes --set changeps1 no
    - conda init bash
    - hash -r
    - conda info -a
    - conda create -n pruning --file package-list.txt
    - conda activate pruning
    - pip install --editable .
    - python ./download_all.py
  cache:
    key: "$CI_BUILD_NAME"
    paths:
      - conda-env/.pkgs/*.tar.bz2
      - conda-env/.pkgs/urls.txt
  image: continuumio/miniconda
  tags:
    - docker
include README.rst COPYING version.txt requirements.txt
recursive-include bob *.csv
.. -*- coding: utf-8 -*-
==================================================================
Domain Guided Pruning of Neural Networks: Application on Face PAD
==================================================================
This package is part of the signal-processing and machine learning toolbox Bob_.
It contains the instructions to reproduce the following paper::

    A. Mohammadi, S. Bhattacharjee, and S. Marcel,
    “Domain Adaptation For Generalization Of Face Presentation Attack Detection
    In Mobile Settings With Minimal Information,” presented at ICASSP 2020.

Installation
------------
The experiments can only be executed on a 64-bit Linux machine.
Install conda_ and run the steps below::

    $ git clone https://gitlab.idiap.ch/bob/bob.paper.icassp2020_domain_guided_pruning.git
    $ cd bob.paper.icassp2020_domain_guided_pruning
    $ conda create -n pruning --file package-list.txt
    $ conda activate pruning
    $ pip install --editable .

Preparing the Data
------------------
Download the SWAN_, WMCA_ (referred to as the BATL dataset in the code),
Replay-Mobile_, OULU-NPU_, and IJB-C_ datasets.
Tell Bob where the files are located using the commands below.
Make sure to replace the paths with your actual paths before running the commands::

    $ bob config set bob.db.oulunpu.directory /path/to/oulunpu/directory
    $ bob config set bob.db.replaymobile.directory /path/to/replaymobile/directory
    $ bob config set bob.db.swan.directory /path/to/swan/directory
    $ bob config set bob.db.batl.directory /path/to/batl/directory

Download the annotations and the models that are used for training with the command below::

    $ python ./download_all.py

Then, similarly, configure Bob with their locations::

    $ bob config set bob.db.oulunpu.annotation_dir "`pwd`/downloads/oulunpu-mtcnn-annotations"
    $ bob config set bob.db.replaymobile.annotation_dir "`pwd`/downloads/replaymobile-mtcnn-annotations"
    $ bob config set bob.db.swan.annotation_dir "`pwd`/downloads/swan-mtcnn-annotations"
    $ bob config set bob.db.batl.annotations_50 "`pwd`/downloads/WMCA_annotations_50_frames"
    $ bob config set bob.learn.tensorflow.densenet161 "`pwd`/downloads/densenet-161-imagenet"

Finally, tell Bob where to put large temporary files::

    $ bob config set temp /path/to/temporary/folder

Prepare the IJB-C images
------------------------
In the experiments, we used a subset of IJB-C images for pruning.
These images were selected by hand based on their quality.
We provide the list of images in this package; run the following script to
prepare those images for the experiments::

    $ bob prepare-ijbc-images /path/to/IJB-C_database/IJB/IJB-C/images/img /temp_folder/ijbc-cleaned/faces-224

Note that you must provide the path to the ``images/img`` folder from the IJB-C
dataset to the script.
Once that is done, configure Bob with the location of the processed images::

    $ bob config set bob.paper.icassp2020_domain_guided_pruning.ijbc_cleaned /temp_folder/ijbc-cleaned/faces-224

Experiments
-----------
The experiments are divided into three parts:

1. Preparing the data and training the DeepPixBiS model on OULU-NPU.
2. Computing the feature divergences between OULU-NPU and the other four
   datasets, and identifying the 20% of filters to prune.
3. Re-training the DeepPixBiS model on OULU-NPU with some of its filters pruned.

Part 1
======
Run ``./run_part1.sh`` to generate the list of jobs. Then run::

    $ jman --local list --print-dependencies  # inspect the job list
    $ jman --local run-scheduler --die-when-finished --parallel 2

You must run the training jobs with at least 2 parallel jobs at a time.
Inspect the ``./run_part1.sh`` file to see how the experiments are executed.

Part 2
======
Compute the feature divergences using the commands below::

    $ bob feature-divergence -v \
        -s bob.paper.icassp2020_domain_guided_pruning.oulunpu \
        -t bob.paper.icassp2020_domain_guided_pruning.replaymobile \
        -m bob.paper.icassp2020_domain_guided_pruning.deep_pix_bis_features \
        --output results/features_divergence/oulunpu_vs_replaymobile.npy
    $ bob feature-divergence -v \
        -s bob.paper.icassp2020_domain_guided_pruning.oulunpu \
        -t bob.paper.icassp2020_domain_guided_pruning.swan \
        -m bob.paper.icassp2020_domain_guided_pruning.deep_pix_bis_features \
        --output results/features_divergence/oulunpu_vs_swan.npy
    $ bob feature-divergence -v \
        -s bob.paper.icassp2020_domain_guided_pruning.oulunpu \
        -t bob.paper.icassp2020_domain_guided_pruning.batl \
        -m bob.paper.icassp2020_domain_guided_pruning.deep_pix_bis_features \
        --output results/features_divergence/oulunpu_vs_batl.npy
    $ bob feature-divergence -v \
        -s bob.paper.icassp2020_domain_guided_pruning.oulunpu \
        -t bob.paper.icassp2020_domain_guided_pruning.ijbc \
        -m bob.paper.icassp2020_domain_guided_pruning.deep_pix_bis_features \
        --output results/features_divergence/oulunpu_vs_ijbc.npy

Then, find the top 20% of filters that contribute most to the feature divergences::

    $ bob find-filters results/{features_divergence,filters}/oulunpu_vs_replaymobile.npy
    $ bob find-filters results/{features_divergence,filters}/oulunpu_vs_swan.npy
    $ bob find-filters results/{features_divergence,filters}/oulunpu_vs_batl.npy
    $ bob find-filters results/{features_divergence,filters}/oulunpu_vs_ijbc.npy

The commands for this part are also available in ``./run_part2.sh``, which adds the jobs
to jman. If you use that script, run the command below to execute the jobs with jman::

    $ jman --local run-scheduler --die-when-finished --parallel 2

Part 3
======
Train the DeepPixBiS model again with its filters pruned and its initial layers frozen::

    $ ./run_part3.sh
    $ jman --local list --print-dependencies  # inspect the job list
    $ jman --local run-scheduler --die-when-finished --parallel 2

You must run the training jobs with at least 2 parallel jobs at a time.
Inspect the ``./run_part3.sh`` file to see how the experiments are executed.

Evaluation
----------
Once the experiments are finished, the models can be evaluated using::

    $ ./evaluate.sh

If you cannot run the experiments, you may extract the score files provided in
``results.tar.xz`` and run the evaluation on those scores.
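
For reference, the archive can also be extracted with a few lines of Python (a minimal
sketch; it assumes ``results.tar.xz`` sits in the current working directory)::

    import tarfile

    # extract the provided score files into the current directory
    with tarfile.open("results.tar.xz", "r:xz") as archive:
        archive.extractall()
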
Contact
-------
For questions or to report issues with this software package, contact our
development `mailing list`_.

.. Place your references here:
.. _bob: https://www.idiap.ch/software/bob
.. _conda: https://conda.io
.. _mailing list: https://www.idiap.ch/software/bob/discuss
.. _swan: https://www.idiap.ch/dataset/swan
.. _replay-mobile: https://www.idiap.ch/dataset/replay-mobile
.. _wmca: https://www.idiap.ch/dataset/wmca
.. _oulu-npu: https://sites.google.com/site/oulunpudatabase/
.. _ijb-c: https://www.nist.gov/programs-projects/face-challenges
# see https://docs.python.org/3/library/pkgutil.html
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
# see https://docs.python.org/3/library/pkgutil.html
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
protocol = "Protocol_1"
database.protocol = protocol
"""
BATL Db is a database for face PAD experiments.
"""
from bob.pad.face.database import BatlPadDatabase
from bob.extension import rc
database = BatlPadDatabase(
    protocol="grandtest-color-50-PrintReplay",
    original_directory=rc["bob.db.batl.directory"],
    annotations_temp_dir=rc["bob.db.batl.annotations_50"],
    landmark_detect_method="mtcnn",
    training_depends_on_protocol=True,
)
from bob.extension import rc

# output directory of the baseline (un-pruned) DeepPixBiS model
model_dir = "results/deep_pix_bis"
import tensorflow as tf
from bob.learn.tensorflow.models.densenet import DeepPixBiS

# directory containing the checkpoints of the trained DeepPixBiS model
model_dir = "results/deep_pix_bis/eval"

model = DeepPixBiS(name="DenseNet")
# construct the weights
model(tf.zeros((2, 224, 224, 3)))

# keep only the layers up to (and including) transition_block_1; its feature maps
# are what the feature-divergence command compares across datasets
index = model.layers.index(model.get_layer("transition_block_1"))
model = tf.keras.Sequential([tf.keras.Input((224, 224, 3))] + model.layers[: index + 1])

# restore the weights of these layers from the latest checkpoint
path = tf.train.get_checkpoint_state(model_dir).model_checkpoint_path
saver = tf.train.Saver(model.variables)
saver.restore(tf.keras.backend.get_session(), path)
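
# --- Illustrative usage (not part of the original file) ----------------------
# A minimal sketch of how this truncated model can be used to extract
# transition_block_1 feature maps; the random input below is only a placeholder
# for real cropped 224x224 RGB faces.
import numpy as np

dummy_faces = np.random.rand(4, 224, 224, 3).astype("float32")
features = model.predict(dummy_faces)
print(features.shape)  # (4, height, width, n_filters)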
try:
    groups
except NameError:
    groups = []
groups += ["dev"]
from bob.extension import rc
from bob.learn.tensorflow.models.densenet import DeepPixBiS
from bob.learn.tensorflow.utils.reproducible import set_seed
from bob.learn.tensorflow.loss import PixelWise
import tensorflow as tf
# variables from other config files
DEFAULTS = {
    "filters_multiply": None,
}
DEFAULTS.update(globals())
model_dir = DEFAULTS["model_dir"]
# parameters for train and eval scripts
# every 2000 steps is one epoch on OULU-NPU
max_steps = globals().get("max_steps", 70 * 1000)
sort_by = "accuracy"
DEEP_PIX_BIS_PIXELS = 196  # number of pixels in the 14x14 pixel-wise output map of DeepPixBiS
run_config = set_seed(2018, 2018)[1]
run_config = run_config.replace(keep_checkpoint_max=300)
run_config = run_config.replace(save_summary_steps=500)
run_config = run_config.replace(save_checkpoints_steps=1000)
def loss_op(logits, end_points, labels, data, mode):
    losses = []
    pixel_wise = PixelWise(balance_weights=False)
    loss_pixel_wise = pixel_wise(labels, logits)
    losses.append(loss_pixel_wise)
    losses.append(end_points["loss_regularization"])
    loss_total = tf.add_n(losses, name="loss_total")
    losses_collection = tf.get_collection(tf.GraphKeys.LOSSES)
    for loss in losses:
        if loss not in losses_collection:
            tf.add_to_collection(tf.GraphKeys.LOSSES, loss)
    return loss_total
def model_fn(features, labels, mode, config):
    data = features["data"]
    key = features["key"]
    end_points = {}

    inputs = tf.keras.Input(tensor=data)
    model = DeepPixBiS(weight_decay=1e-7, name="DenseNet")
    model_folder = rc["bob.learn.tensorflow.densenet161"]

    if DEFAULTS["filters_multiply"] is not None:
        # prune filters
        layers = model.layers
        filter_killer_layer = tf.keras.layers.Lambda(
            lambda net: net
            * tf.convert_to_tensor(DEFAULTS["filters_multiply"], dtype=data.dtype)
        )
        which_layer = layers.index(model.get_layer("transition_block_1"))
        layers.insert(which_layer + 1, filter_killer_layer)
        with tf.name_scope("DenseNet"):
            model = tf.keras.Sequential([inputs] + layers, name="DenseNet")

        # freeze initial layers
        which_layer = model.layers.index(model.get_layer("transition_block_1"))
        for i, layer in enumerate(model.layers):
            layer.trainable = False
            if i == which_layer:
                break

        # restore variables from DeepPixBiS checkpoint
        model_folder = "results/deep_pix_bis/eval"

    activations = model(inputs, training=mode == tf.estimator.ModeKeys.TRAIN)
    logits = model.get_layer("Pixel_Logits_Flatten").output
    end_points["probabilities"] = tf.reduce_mean(activations, axis=1)
    end_points["classes"] = tf.cast(end_points["probabilities"] >= 0.5, "int32")

    # Add batch norm updates to the graph
    for update_op in model.get_updates_for(None) + model.get_updates_for(inputs):
        tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_op)

    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": end_points["classes"],
        # Add `softmax_tensor` to the graph. It is used for PREDICT
        # and by the `logging_hook`.
        "probabilities": end_points["probabilities"],
        "key": key,
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Compute L2 losses
    end_points["loss_regularization"] = tf.add_n(
        model.get_losses_for(None) + model.get_losses_for(inputs),
        name="regularization_losses",
    )
    tf.summary.scalar("loss_regularization", end_points["loss_regularization"])

    loss = loss_op(logits, end_points, labels, data, mode)

    accuracy = tf.metrics.accuracy(
        labels=labels, predictions=predictions["classes"], name="metric_accuracy"
    )
    metrics = {"accuracy": accuracy}

    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions=predictions,
            loss=loss,
            train_op=None,
            eval_metric_ops=metrics,
        )

    # restore densenet layers from imagenet checkpoint
    index = model.layers.index(model.get_layer("dec"))
    assignment_map = {
        v.name.split(":")[0]: v
        for layer in model.layers[:index]
        for v in layer.variables
    }
    tf.train.init_from_checkpoint(
        ckpt_dir_or_file=model_folder, assignment_map=assignment_map
    )

    # create optimizer
    global_step = tf.train.get_or_create_global_step()
    optimizer = tf.train.AdamOptimizer(0.0001)
    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=global_step,
        optimizer=optimizer,
        learning_rate=None,
        variables=model.trainable_variables,
    )

    # Log accuracy and loss
    tf.summary.scalar("accuracy", accuracy[1])
    # add histograms summaries
    for v in tf.trainable_variables():
        tf.summary.histogram(v.name, v)

    return tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train_op,
        eval_metric_ops=metrics,
    )
estimator = tf.estimator.Estimator(model_fn, model_dir, run_config)
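
# --- Illustrative sketch (an assumption, not the packaged ``bob find-filters``
# implementation) of how a ``filters_multiply`` mask could be built from the saved
# per-filter divergences: zero out the 20% most divergent filters and keep the rest.
import numpy as np

divergences = np.load("results/features_divergence/oulunpu_vs_batl.npy")  # shape: (n_filters,)
n_prune = int(0.2 * divergences.size)
most_divergent = np.argsort(divergences)[-n_prune:]  # indices of the most divergent filters
mask = np.ones_like(divergences)
mask[most_divergent] = 0.0
np.save("results/filters/oulunpu_vs_batl.npy", mask)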
try:
    groups
except NameError:
    groups = []
groups += ["eval"]
from bob.pad.face.utils import min_face_size_normalizer
from functools import partial
# drop faces smaller than 120x120 pixels
normalizer = partial(min_face_size_normalizer, max_age=5, min_face_size=(120, 120))
from bob.bio.face.preprocessor import FaceCrop
from bob.bio.video.preprocessor import Wrapper
from bob.bio.video import FrameSelector
CROPPED_IMAGE_HEIGHT = 224
CROPPED_IMAGE_WIDTH = 224
# eye positions for frontal images
RIGHT_EYE_POS = (56, 55.5)
LEFT_EYE_POS = (56, 168)
cropper = FaceCrop(
    cropped_image_size=(CROPPED_IMAGE_HEIGHT, CROPPED_IMAGE_WIDTH),
    cropped_positions={"leye": LEFT_EYE_POS, "reye": RIGHT_EYE_POS},
    color_channel="rgb",
    dtype="uint8",
)

preprocessor = Wrapper(
    preprocessor=cropper, frame_selector=FrameSelector(selection_style="all")
)
import tensorflow as tf
import numpy as np
import os
import click
from bob.extension.scripts.click_helper import (
    ConfigCommand,
    ResourceOption,
    verbosity_option,
)
from bob.learn.tensorflow.dataset.generator import dataset_using_generator
from .transforms import deep_pix_pre_transform, deep_pix_post_transform
from .face_video_224 import cropper
from multiprocessing import cpu_count
from bob.bio.video.preprocessor import Wrapper
from bob.bio.video import FrameSelector
def normal_distribution(features):
    features = tf.convert_to_tensor(features)
    mean, var = tf.nn.moments(features, axes=[0])
    var = tf.math.maximum(var, 1e-8)
    dist = tf.contrib.distributions.Normal(mean, tf.sqrt(var), validate_args=True)
    return dist


def feature_divergence(features_a, features_b, average=False):
    dist_a = normal_distribution(features_a)
    dist_b = normal_distribution(features_b)
    klab = tf.contrib.distributions.kl_divergence(dist_a, dist_b)
    klba = tf.contrib.distributions.kl_divergence(dist_b, dist_a)
    divergence = klab + klba
    if average:
        return tf.reduce_mean(divergence)
    return divergence
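
# --- Reference check (not part of the original file) --------------------------
# For two diagonal Gaussians fitted per feature dimension, the symmetric KL
# divergence computed above has the closed form below; this NumPy sketch mirrors
# what ``feature_divergence`` returns before any averaging.
import numpy as np


def symmetric_kl_numpy(features_a, features_b, eps=1e-8):
    mu_a, var_a = features_a.mean(axis=0), np.maximum(features_a.var(axis=0), eps)
    mu_b, var_b = features_b.mean(axis=0), np.maximum(features_b.var(axis=0), eps)
    kl_ab = 0.5 * np.log(var_b / var_a) + (var_a + (mu_a - mu_b) ** 2) / (2 * var_b) - 0.5
    kl_ba = 0.5 * np.log(var_a / var_b) + (var_b + (mu_b - mu_a) ** 2) / (2 * var_a) - 0.5
    return kl_ab + kl_ba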
def get_dataset(database, klass, batch_size):
    preprocessor = Wrapper(
        preprocessor=cropper,
        frame_selector=FrameSelector(max_number_of_frames=20, selection_style="spread"),
    )

    def reader(f):
        data = preprocessor.read_original_data(
            f, database.original_directory, database.original_extension
        )
        data = preprocessor(data, database.annotations(f)).as_array()
        for d in data:
            yield d

    samples, pa_files = database.all_files(groups="train", flat=False)
    if klass == "pa":
        samples = pa_files

    def transform(image):
        return deep_pix_post_transform(deep_pix_pre_transform(image))

    ds = (
        dataset_using_generator(samples, reader, multiple_samples=True)
        .map(transform, num_parallel_calls=cpu_count())
        .batch(batch_size)
    )
    # 20 frames are selected per video by the FrameSelector above
    return ds, len(samples) * 20 // batch_size
@click.command("feature-divergence", cls=ConfigCommand)
@click.option(
    "-s", "--source-database", cls=ResourceOption, entry_point_group="bob.pad.database"
)
@click.option(
    "-t", "--target-database", cls=ResourceOption, entry_point_group="bob.pad.database"
)
@click.option("-m", "--model", cls=ResourceOption, entry_point_group="keras.model")
@click.option("-o", "--output")
@click.option(
    "-c",
    "--class",
    "klass",
    default="bf",
    show_default=True,
    type=click.Choice(["bf", "pa"]),
)
@click.option("-b", "--batch-size", default=256, show_default=True, type=click.INT)
@verbosity_option()
def feature_divergences_command(
    source_database,
    target_database,
    model,
    output,
    klass,
    batch_size,
    verbose,
    **kwargs
):
    src_ds, steps = get_dataset(source_database, klass, batch_size)
    src_predictions = model.predict(src_ds, verbose=np.clip(verbose, 0, 1), steps=steps)

    tar_ds, steps = get_dataset(target_database, klass, batch_size)
    tar_predictions = model.predict(tar_ds, verbose=np.clip(verbose, 0, 1), steps=steps)

    a = tf.placeholder(src_predictions.dtype, src_predictions.shape)
    b = tf.placeholder(tar_predictions.dtype, tar_predictions.shape)
    d = feature_divergence(a, b)
    d = tf.reduce_mean(d, [0, 1])
    d = tf.keras.backend.get_session().run(
        d, feed_dict={a: src_predictions, b: tar_predictions}
    )

    # save the feature divergences
    os.makedirs(os.path.dirname(output), exist_ok=True)
    np.save(output, d)
import numpy as np
filters_multiply = np.load("results/filters/oulunpu_vs_batl.npy")
model_dir = "results/deep_pix_bis_pruned_by_batl"
import numpy as np
filters_multiply = np.load("results/filters/oulunpu_vs_ijbc.npy")
model_dir = "results/deep_pix_bis_pruned_by_ijbc"