Are people using TensorFlow lagging behind people using PyTorch? Should we switch frameworks every time something new comes up?
Well, I don't know the answer to these provocative questions :-P
So, let's talk about this MR.
I'm trying to understand how PyTorch works and would like to give it a shot. Amir put a lot of effort into making our TF Keras code scale across multiple workers and GPUs, and it seems there are several scalability issues there that are hard to profile and debug. Furthermore, it seems possible to truly scale PyTorch code without much effort under the same multi-GPU, multi-worker constraints.
One thing PyTorch does not support natively is reading the protocol buffers stored in TFRecord files; more specifically, the ones we use to train our TF Keras models in bob.learn.tensorflow. TFRecords are convenient for our file system, which has quotas on the number of files.
In this MR, I've created a dataset that can iterate over several TFRecord files and yield samples.
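For context, the core of the iteration is just walking the TFRecord on-disk framing, which can be done in pure Python without TensorFlow. This is a minimal sketch, not the actual code in the MR: the function name is hypothetical, and the CRC checksums are skipped rather than validated for brevity.

```python
import struct


def read_tfrecord_payloads(path):
    """Yield the raw protobuf payload of each record in a TFRecord file.

    TFRecord framing per record:
      uint64 length | uint32 masked CRC of length | payload | uint32 masked CRC of payload
    CRC validation is skipped in this sketch.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:  # end of file
                break
            (length,) = struct.unpack("<Q", header)
            f.read(4)                 # skip the length CRC
            payload = f.read(length)  # serialized tf.train.Example bytes
            f.read(4)                 # skip the payload CRC
            yield payload
```

In practice, a generator like this would be wrapped in a `torch.utils.data.IterableDataset`, with the list of TFRecord files sharded across workers via `torch.utils.data.get_worker_info()`, and each payload decoded from its `tf.train.Example` protobuf into tensors.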
Is this useful?
Who is willing to review it?
Thanks and sorry for this verbose MR.