superfluous 'batch_size' parameter?
Created by: siebenkopf
I recently tried to use the bob.learn.mlp.RProp
class to train a neural network. I came across the batch_size
parameter that has to be set in the constructor (as in all other Trainer constructors). The documentation does not make clear how to select a proper value, so I used 1 (for stochastic training).
Anyway, afterwards I wanted to train the network on several input samples, which I had collected in a list. However, when I called the train
method, I got the error message:
RuntimeError: array dimensions do not match 1 != 1031
where 1031 is the number of training examples. So I had a look at the code, and I found that the batch_size
parameter has to match the number of training samples.
Now, my question is: why do we need to specify something this obvious? Can't the code simply assume that batch_size
equals the number of inputs?
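To make the mismatch concrete, here is a minimal sketch of the behaviour described above, using a hypothetical MockTrainer stand-in (not the actual bob code): buffers are sized from batch_size at construction, and train later rejects data whose first dimension differs. The current workaround is to pass batch_size=len(data) explicitly.

```python
import numpy as np

class MockTrainer:
    """Hypothetical stand-in for the Trainer behaviour described above."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        # buffers are sized once, at construction, from batch_size
        self.error_buffer = np.zeros(batch_size)

    def train(self, data):
        # mimics the dimension check that produces the RuntimeError above
        if len(data) != self.batch_size:
            raise RuntimeError(
                "array dimensions do not match %d != %d"
                % (self.batch_size, len(data)))
        # ... actual training would happen here ...
        return True

data = np.random.rand(1031, 5)           # 1031 training samples
trainer = MockTrainer(batch_size=len(data))  # workaround: match the data size
trainer.train(data)
```

With batch_size=1 (as suggested by "stochastic training"), the same call fails with exactly the dimension mismatch quoted above.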
The only place where the code actually relies on batch_size
is during the initialization of the trainer, which resizes some buffers according to the number of training data (see: https://github.com/bioidiap/bob.learn.mlp/blob/master/bob/learn/mlp/cxx/trainer.cpp#L221). Can't we simply resize these buffers once we know the number of samples? That would let us remove this superfluous parameter from the Trainer constructors...