bob.learn.tensorflow issues
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues
2016-11-28T13:11:34Z
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/11
CUDA control devices
2016-11-28T13:11:34Z
Tiago de Freitas Pereira
Think about how to organize the visible GPU devices using:
`os.environ["CUDA_VISIBLE_DEVICES"] = "0, 1, ..."`
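One way this could look is a tiny helper that builds the variable from a list of device indices. This is only a sketch: `set_visible_gpus` is a hypothetical name, not something that exists in this package, and it assumes the variable is set before TensorFlow initialises CUDA (i.e. before the first session is created).

```python
import os


def set_visible_gpus(device_ids):
    """Restrict CUDA to the given GPU indices (hypothetical helper).

    Must run before TensorFlow initialises CUDA, otherwise the
    setting is ignored for the current process.
    """
    # CUDA expects a comma-separated list of indices.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Expose only the first two GPUs to this process.
set_visible_gpus([0, 1])
```

Passing an empty list sets the variable to an empty string, which should hide all GPUs from the process.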
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/87
Docs don't build with new sphinx version
2022-02-22T16:37:51Z
Amir MOHAMMADI
Job [#257923](https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/jobs/257923) failed for f420d1b322762c81b79f59fa103c4ad07713fd79:
```
bob/learn/tensorflow/losses/__init__.py:docstring of bob.learn.tensorflow.losses.center_loss.CenterLossLayer.call:11:Unexpected indentation.
```
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/83
Nightlies failing
2019-08-19T13:42:11Z
Tiago de Freitas Pereira
As far as I can see, we have a doctest issue:
https://gitlab.idiap.ch/bob/bob.nightlies/-/jobs/170751
This is probably related to the `sphinx` major version bump, which does not surprise me.
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/82
Issue with style transfer
2019-06-25T17:53:13Z
Tiago de Freitas Pereira
There's an issue with the style transfer implementation.
ping @amohammadi
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/81
Dense net
2019-08-16T06:14:57Z
Tiago de Freitas Pereira
Hey @amohammadi,
You said that you have patched the dense net in some branch.
Would you mind opening an MR for it?
Thanks
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/78
Using tf.contrib.layers.optimize_loss in model_fns (estimators)
2019-05-27T11:19:01Z
Amir MOHAMMADI
Guys, there is a neat function in TensorFlow v1 that takes care of a lot of boilerplate in estimators:
https://www.tensorflow.org/versions/r1.12/api_docs/python/tf/contrib/layers/optimize_loss
If you guys don't mind, I will add this to our estimators. It might break backward compatibility, in that training may not be resumable from older checkpoints.
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/77
random rotate of images is not really random
2020-11-05T15:18:17Z
Amir MOHAMMADI
In `bob.learn.tensorflow/bob/learn/tensorflow/dataset/__init__.py` there is:
```
if random_rotate:
    image = tf.contrib.image.rotate(
        image,
        angles=numpy.random.randint(-5, 5),
        interpolation="BILINEAR")
```
This random number (from numpy) is evaluated only once, and all images are then rotated by that same angle.
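To make the problem concrete without needing TensorFlow, here is a small stand-alone sketch. Everything in it is illustrative: the two `build_pipeline_*` functions are hypothetical, and the stdlib `random` module stands in for numpy. In the real code, the per-image draw would presumably have to be a random op inside the graph (for example `tf.random.uniform`) rather than a Python-side number.

```python
import random


def build_pipeline_fixed_angle():
    # Mirrors the reported behaviour: the angle is drawn once, while
    # the pipeline is being built, and then reused for every image.
    angle = random.randint(-5, 5)
    return lambda image: (image, angle)


def build_pipeline_per_image_angle():
    # Draws a fresh angle each time an image passes through.
    return lambda image: (image, random.randint(-5, 5))


random.seed(0)
fixed = build_pipeline_fixed_angle()
fixed_angles = {fixed(i)[1] for i in range(100)}

varying = build_pipeline_per_image_angle()
varying_angles = {varying(i)[1] for i in range(100)}
```

After running this, `fixed_angles` holds a single value, while `varying_angles` holds several different ones.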
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/75
Tensorflow 2 compatibility
2020-11-05T15:17:35Z
Amir MOHAMMADI
TensorFlow is making Keras and eager execution the center of its new API in version 2:
https://medium.com/tensorflow/standardizing-on-keras-guidance-on-high-level-apis-in-tensorflow-2-0-bad2b04c819a
While estimators are going to be supported, they do not support eager execution (they always run in graph mode).
Per [this guide](https://www.tensorflow.org/guide/eager), it's best to write code that runs in both eager mode and graph mode. I think we can extend our estimator classes to support execution in eager mode, i.e., we can have one eager execution training script that runs just like `estimator.train` but in eager mode. This allows for easier debugging of our programs and lets us easily switch the same model training/evaluation/prediction to graph mode.
Any feedback is welcome.
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/74
The VGG16 that we have here appends a one-hot-encoded layer
2019-05-27T11:21:57Z
Tiago de Freitas Pereira
Today we wrap `vgg16` and `vgg19` directly from slim: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/nets/vgg.py
Although this is very convenient, and we definitely **must** reuse code as much as possible, this implementation has an issue.
Here, https://github.com/tensorflow/tensorflow/blob/e0585bc351b19da39610cc20f6d7622b439dca4d/tensorflow/contrib/slim/python/slim/nets/vgg.py#L187, the guys from `slim` append a one-hot-encoded layer in the architecture function.
This is not very useful if we want to use our estimators.
Furthermore, in my opinion, architecture functions shouldn't carry explicit classification layers.
For instance, with this architecture as is, we can't directly use the Siamese or Triplet arrangements, since they work directly with embeddings.
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/72
Package release
2019-08-16T06:15:43Z
Tiago de Freitas Pereira
Hi guys,
Just to let you know.
I'll tag this package.
Cheers
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/71
bob tf predict_bio has a bug in checkpoint loading step
2019-01-18T09:55:59Z
Amir MOHAMMADI
`tf.estimator.Estimator.predict` and `.evaluate` take a checkpoint parameter. This value must be a TensorFlow checkpoint prefix (e.g. `/model_dir/model.ckpt-23952000`), but I wanted to point to a folder instead and pick up the latest checkpoint from there automatically, so that this script can be used in parallel with `bob tf eval`. However, it looks like there is a bug in https://gitlab.idiap.ch/bob/bob.learn.tensorflow/blob/9c068090975ab5cb13d738048017ff3b648c1bb7/bob/learn/tensorflow/script/predict_bio.py#L226 where `estimator.model_dir` is used as input to `tf.train.get_checkpoint_state` instead of `checkpoint`. This means the `--checkpoint` option has had no effect so far :(
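The intended behaviour could be sketched roughly as below. This is not the code from `predict_bio.py`: `resolve_checkpoint` is a hypothetical helper, and the lookup function is passed in so the sketch runs without TensorFlow (in practice something like `tf.train.latest_checkpoint` could be supplied). The point is only that the user-provided `checkpoint` value, not `estimator.model_dir`, decides what gets loaded.

```python
import os


def resolve_checkpoint(checkpoint, find_latest):
    """Return a checkpoint prefix given either a prefix or a directory.

    `find_latest` is any callable that maps a directory to its newest
    checkpoint prefix (for example `tf.train.latest_checkpoint`).
    """
    if checkpoint and os.path.isdir(checkpoint):
        # A folder was given: pick up the latest checkpoint inside it.
        return find_latest(checkpoint)
    # Otherwise assume it already is a checkpoint prefix (or None).
    return checkpoint
```

The result would then be handed to `estimator.predict` or `estimator.evaluate` as the checkpoint path.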
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/70
Tensorflow
2018-11-23T14:02:50Z
Tiago de Freitas Pereira
Guys, I'm launching several jobs (hundreds) on our GPU cluster.
On **some** hosts I get the following error once `estimator.train` is triggered.
Have you guys faced a similar issue?
I'm using tensorflow-gpu 1.8.
ping @andre.anjos, @amohammadi
Thanks
```
totalMemory: 11.17GiB freeMemory: 11.11GiB
2018-11-23 14:18:50.403387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-11-23 14:18:50.403643: E tensorflow/core/common_runtime/direct_session.cc:154] Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
File "/remote/idiap.svm/user.active/tpereira/gitlab/bob/bob.bio.htface/bin/bob", line 33, in <module>
sys.exit(bob.extension.scripts.main_cli())
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/remote/idiap.svm/user.active/tpereira/gitlab/bob/bob.bio.htface/bob/bio/htface/script/domain_specic_units.py", line 86, in htface_train_dsu
steps=200000)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 363, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 843, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 859, in _train_model_default
saving_listeners)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1056, in _train_with_estimator_spec
log_step_count_steps=self._config.log_step_count_steps) as mon_sess:
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 405, in MonitoredTrainingSession
stop_grace_period_secs=stop_grace_period_secs)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 816, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 539, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1002, in __init__
_WrappedSession.__init__(self, self._create_session())
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1007, in _create_session
return self._sess_creator.create_session()
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 696, in create_session
self.tf_sess = self._session_creator.create_session()
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 467, in create_session
init_fn=self._scaffold.init_fn)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 279, in prepare_session
config=config)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 180, in _restore_checkpoint
sess = session.Session(self._target, graph=self._graph, config=config)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1560, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/idiap/user/tpereira/conda/envs/bob.bio.htface/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 633, in __init__
self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
```
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/68
Follow-up from "Enable mac builds"
2018-10-17T12:35:11Z
Amir MOHAMMADI
The following discussion from !70 should be addressed:
- [ ] @amohammadi started a [discussion](https://gitlab.idiap.ch/bob/bob.learn.tensorflow/merge_requests/70#note_35426):
> This is not enough to enable mac builds here. You also need to change the conda recipe.
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/67
`bob tf eval` command does not save best n models anymore
2019-03-14T14:33:24Z
Amir MOHAMMADI
There is something wrong. I am investigating this.
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/66
Improve logging hook to report CPU and GPU memory and utilisation
2019-05-27T11:23:25Z
André Anjos
Following issues with `ssh`/`fork` behaviour at Idiap GPU hosts, I implemented a change in my logging hook (largely based on this package's) to incorporate automatic in-process measurements of CPU/GPU memory, GPU model and utilisation. I strongly recommend you do the same in this package (via copy/paste or similar):
https://gitlab.idiap.ch/bob/bob.ip.hed/blob/master/bob/ip/hed/hooks.py
I tested it, and it produces something like this:
```text
bob.ip.hed.hooks@2018-10-10 07:38:38,999 -- INFO: training 50, loss = 0.67 (0.186 ops/sec, cpu = [98.3, 43.2] %, cpumem = 2.6/29.5 GB, gpu = 100 % (Tesla K80), gpumem = 10940 MiB/11439 MiB)
```
I hope this saves you from having to ssh into the host to dig those up.
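For anyone who wants a rough idea of the GPU part without opening the linked file, here is a minimal sketch. It is not the code from `hooks.py`: the function names are made up for illustration, and it assumes `nvidia-smi` is on the `PATH` and reports plain numbers for every queried field (some fields can come back as non-numeric on certain cards, which this sketch does not handle).

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY = "name,utilization.gpu,memory.used,memory.total"


def parse_gpu_line(line):
    """Parse one CSV line produced with --format=csv,noheader,nounits."""
    name, util, used, total = [field.strip() for field in line.split(",")]
    return {
        "name": name,
        "util_percent": int(util),
        "mem_used_mib": int(used),
        "mem_total_mib": int(total),
    }


def query_gpus():
    """Return one dict per GPU, or an empty list if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + QUERY, "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


def format_gpu(info):
    """Render the GPU part of a log line in the style shown above."""
    return (
        "gpu = {util_percent} % ({name}), "
        "gpumem = {mem_used_mib} MiB/{mem_total_mib} MiB"
    ).format(**info)
```

A logging hook could call `query_gpus()` every N steps and append `format_gpu(...)` for each device to the message it already logs.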
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/65
Enable mac builds
2018-10-07T11:34:14Z
André Anjos
tensorflow>=1.9 came with macOS support. It would be a good idea to enable the builds for this package so this variant is tested in the nightlies.
bob-devel has been updated accordingly: https://gitlab.idiap.ch/bob/bob.conda/merge_requests/378
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/64
Nightlies are failing because of this one
2018-10-07T10:46:07Z
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.nightlies/-/jobs/149612
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/63
How does tf.train.ExponentialMovingAverage work with tf.estimators?
2018-11-02T07:59:14Z
Amir MOHAMMADI
Judging from the documentation, it looks like [tf.train.ExponentialMovingAverage](https://www.tensorflow.org/versions/r1.0/api_docs/python/tf/train/ExponentialMovingAverage) does nothing but keep track of the average of a variable. However, this average value should be used at predict/eval time, as originally intended. The way to do this is explained in the documentation, in https://github.com/tensorflow/tensorflow/issues/3460, and in http://ruishu.io/2017/11/22/ema/ .
There is also https://www.tensorflow.org/api_docs/python/tf/contrib/opt/MovingAverageOptimizer
From the looks of it, we do not use any of these solutions, yet changing `apply_moving_averages` to `True` or `False` in the [Logits](https://gitlab.idiap.ch/bob/bob.learn.tensorflow/blob/3ec02be27f61c669f914f003cef30e81619aa072/bob/learn/tensorflow/estimators/Logits.py#L271) class changes our results.
So now I am wondering what the heck is going on!
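To spell out what "keeping track of the average" means, here is a plain-Python sketch of the update rule given in the TensorFlow documentation, `shadow = decay * shadow + (1 - decay) * variable`. It does not use TensorFlow and says nothing about what our `Logits` estimator actually does; the weight values and the decay of 0.5 are made up for illustration.

```python
def ema_update(shadow, value, decay):
    """One moving-average step: keep `decay` of the old average."""
    return decay * shadow + (1.0 - decay) * value


# A single weight as it evolves over three training steps.
weights_during_training = [1.0, 2.0, 3.0]

shadow = weights_during_training[0]  # start from the initial value
for weight in weights_during_training[1:]:
    shadow = ema_update(shadow, weight, decay=0.5)

raw_weight = weights_during_training[-1]  # 3.0: what training ends with
averaged_weight = shadow                  # 2.25: the tracked average
```

The two numbers differ, so predictions only benefit from the average if the shadow value is the one loaded at predict/eval time; merely maintaining it alongside the raw weight should not, by itself, change the model's outputs.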
Amir MOHAMMADI
2018-10-15
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/62
Syntax error SimpleCNN
2018-08-28T11:28:46Z
Tiago de Freitas Pereira
https://gitlab.idiap.ch/bob/bob.nightlies/-/jobs/147287
Hey @amohammadi, have you seen this one?
Amir MOHAMMADI
https://gitlab.idiap.ch/bob/bob.learn.tensorflow/-/issues/61
The user guide is not tested
2018-08-27T14:47:47Z
Amir MOHAMMADI
I noticed the architecture function is not tested and is broken.
Is there a reason that we skip the tests in the guide?
Amir MOHAMMADI