.. vim: set fileencoding=utf-8 :


===========
 User guide
===========

This package builds on top of tensorflow_. You are expected to have some
familiarity with it before continuing. We recommend reading at least the
following pages:

* https://www.tensorflow.org/get_started
* https://www.tensorflow.org/guide/
* https://www.tensorflow.org/guide/estimators
* https://www.tensorflow.org/guide/datasets

The best way to use tensorflow_ is through its ``tf.estimator`` and ``tf.data``
APIs. Estimators are an abstraction API for machine learning models, and the
data API helps you build complex and efficient input pipelines for your model.
Using the estimator and dataset APIs of tensorflow makes your code more
complex, but in return you gain efficiency and avoid code redundancy.
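
To give a feeling for this pattern before diving into the full example, here is
a minimal, generic sketch of an estimator driven by an ``input_fn`` (toy data
and a toy model, not this package's API):

.. code-block:: python

    import numpy as np
    import tensorflow as tf

    def toy_input_fn():
        # wrap in-memory numpy arrays in a tf.data.Dataset and batch them
        data = np.random.rand(32, 4).astype('float32')
        labels = np.random.randint(0, 2, size=(32,), dtype='int64')
        dataset = tf.data.Dataset.from_tensor_slices(({'x': data}, labels))
        dataset = dataset.shuffle(32).batch(8)
        return dataset.make_one_shot_iterator().get_next()

    def toy_model_fn(features, labels, mode):
        # the estimator calls this function with the tensors made by input_fn
        logits = tf.layers.dense(features['x'], 2)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
            loss, global_step=tf.train.get_or_create_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    estimator = tf.estimator.Estimator(model_fn=toy_model_fn)
    estimator.train(toy_input_fn, max_steps=10)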


Face recognition example using bob.db databases
===============================================


Let's take a look at a complete example of using a convolutional neural network
(CNN) for recognizing faces from the ATNT database. At the end, we will explain
the data pipeline in more detail.

1. Let's do some imports:
*************************

.. testsetup::

    import tempfile
    temp_dir = model_dir = tempfile.mkdtemp()

.. doctest::

    >>> from bob.learn.tensorflow.dataset.bio import BioGenerator
    >>> from bob.learn.tensorflow.utils import to_channels_last
    >>> from bob.learn.tensorflow.estimators import Logits
    >>> import bob.db.atnt
    >>> import tensorflow as tf
    >>> import tensorflow.contrib.slim as slim

2. Define the inputs:
*********************

.. _input_fn:

.. doctest::

    >>> def input_fn(mode):
    ...     db = bob.db.atnt.Database()
    ...
    ...     if mode == tf.estimator.ModeKeys.TRAIN:
    ...         groups = 'world'
    ...     elif mode == tf.estimator.ModeKeys.EVAL:
    ...         groups = 'dev'
    ...
    ...     files = db.objects(groups=groups)
    ...
    ...     # construct integer labels for each identity in the database
    ...     CLIENT_IDS = (str(f.client_id) for f in files)
    ...     CLIENT_IDS = list(set(CLIENT_IDS))
    ...     CLIENT_IDS = dict(zip(CLIENT_IDS, range(len(CLIENT_IDS))))
    ...
    ...     def biofile_to_label(f):
    ...         return CLIENT_IDS[str(f.client_id)]
    ...
    ...     def load_data(database, f):
    ...         img = f.load(database.original_directory, database.original_extension)
    ...         # make a channels_first image (bob format) with 1 channel
    ...         img = img.reshape(1, 112, 92)
    ...         return img
    ...
    ...     generator = BioGenerator(db, files, load_data, biofile_to_label)
    ...
    ...     dataset = tf.data.Dataset.from_generator(
    ...         generator, generator.output_types, generator.output_shapes)
    ...
    ...     def transform(image, label, key):
    ...         # convert to channels last
    ...         image = to_channels_last(image)
    ...
    ...         # per_image_standardization
    ...         image = tf.image.per_image_standardization(image)
    ...         return (image, label, key)
    ...
    ...     dataset = dataset.map(transform)
    ...     dataset = dataset.cache(temp_dir)
    ...     if mode == tf.estimator.ModeKeys.TRAIN:
    ...         dataset = dataset.repeat(1)
    ...     dataset = dataset.batch(8)
    ...
    ...     data, label, key = dataset.make_one_shot_iterator().get_next()
    ...     return {'data': data, 'key': key}, label
    ...
    ...
    >>> def train_input_fn():
    ...     return input_fn(tf.estimator.ModeKeys.TRAIN)
    ...
    ...
    >>> def eval_input_fn():
    ...     return input_fn(tf.estimator.ModeKeys.EVAL)
    ...
    ...
    >>> # supply this hook for debugging
    >>> # from tensorflow.python import debug as tf_debug
    >>> # hooks = [tf_debug.LocalCLIDebugHook()]
    >>> hooks = None
    ...
    >>> train_spec = tf.estimator.TrainSpec(
    ...     input_fn=train_input_fn, max_steps=50, hooks=hooks)
    >>> eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)
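
In a real training run you would typically also shuffle and prefetch the data.
A sketch of the extra ``tf.data`` calls you could add inside ``input_fn`` (the
buffer sizes below are arbitrary, not values recommended by this package):

.. code-block:: python

    # inside input_fn, before batching (sizes are illustrative only)
    if mode == tf.estimator.ModeKeys.TRAIN:
        dataset = dataset.shuffle(buffer_size=200)
    dataset = dataset.batch(8)
    dataset = dataset.prefetch(1)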

3. Define the architecture:
***************************

.. doctest::

    >>> def architecture(data, mode, **kwargs):
    ...     endpoints = {}
    ...     training = mode == tf.estimator.ModeKeys.TRAIN
    ...
    ...     with tf.variable_scope('CNN'):
    ...
    ...         name = 'conv'
    ...         net = slim.conv2d(data, 32, kernel_size=(
    ...             5, 5), stride=2, padding='SAME', activation_fn=tf.nn.relu, scope=name)
    ...         endpoints[name] = net
    ...
    ...         name = 'pool'
    ...         net = slim.max_pool2d(net, (2, 2),
    ...             stride=1, padding='SAME', scope=name)
    ...         endpoints[name] = net
    ...
    ...         name = 'pool-flat'
    ...         net = slim.flatten(net, scope=name)
    ...         endpoints[name] = net
    ...
    ...         name = 'dense'
    ...         net = slim.fully_connected(net, 128, scope=name)
    ...         endpoints[name] = net
    ...
    ...         name = 'dropout'
    ...         net = slim.dropout(
    ...             inputs=net, keep_prob=0.4, is_training=training)
    ...         endpoints[name] = net
    ...
    ...     return net, endpoints
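
If you want to sanity-check the architecture outside of an estimator, a quick
sketch (the placeholder shape matches the channels-last ATNT images used above)
could look like this:

.. code-block:: python

    # build the graph on a dummy input and inspect the collected endpoints
    data = tf.placeholder(tf.float32, shape=(None, 112, 92, 1))
    net, endpoints = architecture(data, tf.estimator.ModeKeys.PREDICT)
    print(sorted(endpoints.keys()))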


.. important::

    Practical advice: use ``tf.contrib.slim`` to craft your CNNs. Although
    Tensorflow's documentation recommends using ``tf.layers`` and
    ``tf.keras``, in our experience ``slim`` has better defaults and is more
    integrated with tensorflow's framework (compared to ``tf.keras``),
    probably because it is used more often internally at Google.
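
For comparison, the first convolution and pooling layers above could be written
with ``tf.layers`` roughly as in the sketch below (parameter names differ
slightly between the two APIs):

.. code-block:: python

    # rough tf.layers equivalent of the first two slim layers above
    net = tf.layers.conv2d(data, filters=32, kernel_size=(5, 5), strides=2,
                           padding='same', activation=tf.nn.relu, name='conv')
    net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=1,
                                  padding='same', name='pool')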


4. Estimator:
*************

Explicitly triggering the estimator
...................................

.. doctest::

    >>> estimator = Logits(
    ...     architecture,
    ...     optimizer=tf.train.GradientDescentOptimizer(1e-4),
    ...     loss_op=tf.losses.sparse_softmax_cross_entropy,
    ...     n_classes=20,  # the number of identities in the world set of ATNT database
    ...     embedding_validation=True,
    ...     validation_batch_size=8,
    ...     model_dir=model_dir,
    ... )
    >>> tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) # doctest: +SKIP
    ({'accuracy':...


Triggering the estimator via command line
..........................................

In the example above, we explicitly triggered the training and evaluation via
``tf.estimator.train_and_evaluate``. We also provide command line scripts that
do this for you.

Check the command below for training::

 $ bob tf train --help

and to evaluate::

 $ bob tf eval --help


Data pipeline
=============

There are several ways to provide data to Tensorflow graphs. In this section we
provide some examples on how to bridge `bob.db` databases and a tensorflow
`input_fn`.

The BioGenerator input pipeline
*******************************

The :any:`bob.learn.tensorflow.dataset.bio.BioGenerator` class can be used to
convert any bob database (not just ``bob.bio.base`` databases) to a
``tf.data.Dataset`` instance.

While building the input pipeline, you can manipulate your data in two
sections:

* In the ``load_data`` function where everything is a numpy array.
* In the ``transform`` function where the data are tensorflow tensors.

For example, you can annotate, crop to bounding box, and scale your images in
the ``load_data`` function and apply transformations on images (e.g. random
crop, mean normalization, random flip, ...) in the ``transform`` function.

Once these transformations are applied on your data, you can easily cache them
to disk (using ``tf.data.Dataset.cache``) for faster reading during training.
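
As an illustration, a ``transform`` function that adds one of the random
augmentations mentioned above (a sketch only, building on the example earlier
in this guide) could look like this:

.. code-block:: python

    def transform(image, label, key):
        # image is a channels-last tensor at this point
        image = tf.image.random_flip_left_right(image)
        # per-image mean/variance normalization
        image = tf.image.per_image_standardization(image)
        return (image, label, key)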


Input pipeline with TFRecords
*****************************

An optimized way to provide data to Tensorflow graphs is to use TFRecords. This
`guide <http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/21/tfrecords-guide/>`_
gives a very nice explanation of how TFRecords work.
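
In short, every entry in a TFRecord file is a serialized ``tf.train.Example``
protocol buffer. A minimal sketch of writing and reading one record with plain
TensorFlow (not this package's converter) could look like this:

.. code-block:: python

    import numpy as np
    import tensorflow as tf

    # a fake grayscale image and its label, packed into one tf.train.Example
    img = np.zeros((112, 92), dtype='uint8')
    example = tf.train.Example(features=tf.train.Features(feature={
        'data': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img.tobytes()])),
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    }))

    # write the serialized record to disk (TF 1.x API)
    with tf.python_io.TFRecordWriter('example.tfrecord') as writer:
        writer.write(example.SerializeToString())

    # read it back as a tf.data pipeline and parse each record
    def parse(serialized):
        features = {'data': tf.FixedLenFeature([], tf.string),
                    'label': tf.FixedLenFeature([], tf.int64)}
        return tf.parse_single_example(serialized, features)

    dataset = tf.data.TFRecordDataset(['example.tfrecord']).map(parse)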

In `bob.learn.tensorflow` we provide a command line interface,
``bob tf db_to_tfrecords``, which converts ``bob.db`` databases to TFRecords.
Type the command below for help::

  $ bob tf db_to_tfrecords --help


To generate a TFRecord file for the
`Face recognition example using bob.db databases`_ above, use the following
snippet.

.. doctest::

    >>> from bob.bio.base.utils import read_original_data
    >>> from bob.bio.base.test.dummy.database import database # this is based on bob.db.atnt

    >>> groups = 'dev'

    >>> samples = database.all_files(groups=groups)

    >>> CLIENT_IDS = (str(f.client_id) for f in database.objects(groups=groups))
    >>> CLIENT_IDS = set(CLIENT_IDS)
    >>> CLIENT_IDS = dict(zip(CLIENT_IDS, range(len(CLIENT_IDS))))

    >>> def file_to_label(f):
    ...     return CLIENT_IDS[str(f.client_id)]

    >>> def reader(biofile):
    ...     data = read_original_data(biofile, database.original_directory, database.original_extension)
    ...     label = file_to_label(biofile)
    ...     key = biofile.path
    ...     return (data, label, key)


After saving this snippet in a Python file (let's say `tfrec.py`), run the
following command::

    $ bob tf db_to_tfrecords tfrec.py -o atnt.tfrecord

Once this is done, you can replace the `input_fn`_ defined above with the
snippet below.

.. doctest::

    >>>
    >>> from bob.learn.tensorflow.dataset.tfrecords import shuffle_data_and_labels_image_augmentation
    >>>
    >>> tfrecords_filename = ['/path/to/atnt.tfrecord']
    >>> data_shape = (112, 92, 3)
    >>> data_type = tf.uint8
    >>> batch_size = 16
    >>> epochs = 1
    >>>
    >>> def train_input_fn():
    ...     return shuffle_data_and_labels_image_augmentation(
    ...                tfrecords_filename,
    ...                data_shape,
    ...                data_type,
    ...                batch_size,
    ...                epochs=epochs)
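
Assuming the ``estimator`` and ``eval_spec`` defined earlier in this guide are
still available, the new ``train_input_fn`` plugs into the same training call
as before:

.. code-block:: python

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=50)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)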

.. testcleanup::

    import shutil
    shutil.rmtree(model_dir, True)

The Estimator
=============

In this package, we have crafted four types of estimators:

   - Logits: `Cross entropy loss
     <https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits>`_
     computed on one-hot encoded labels:
     :py:class:`bob.learn.tensorflow.estimators.Logits`
   - LogitsCenterLoss: `Cross entropy loss
     <https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits>`_
     plus the `center loss <https://ydwen.github.io/papers/WenECCV16.pdf>`_,
     computed on one-hot encoded labels:
     :py:class:`bob.learn.tensorflow.estimators.LogitsCenterLoss`
   - Siamese: Siamese network estimator:
     :py:class:`bob.learn.tensorflow.estimators.Siamese`
   - Triplet: Triplet network estimator:
     :py:class:`bob.learn.tensorflow.estimators.Triplet`
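
For reference, the center loss of the linked paper adds to the softmax cross
entropy a term that pulls each embedding towards the (learned) centre of its
class:

.. math::

   \mathcal{L} = \mathcal{L}_{\text{softmax}} +
   \frac{\lambda}{2} \sum_{i=1}^{m} \left\| x_i - c_{y_i} \right\|_2^2

where :math:`x_i` is the embedding of sample :math:`i`, :math:`c_{y_i}` is the
centre of its class and :math:`\lambda` balances the two terms.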

.. _tensorflow: https://www.tensorflow.org/