load_scores extremely memory hungry

The new implementation of score loading is memory hungry because it stores the whole score file in memory. For large score files with long client IDs and labels, this can easily be too much for a normal desktop machine. For splitting the score file into positives and negatives, most of that information (for example, the labels) is completely irrelevant.

I remember having this problem with an older version of bob.measure, which is why I implemented the score reading as a generator function (i.e., yielding the file line by line) instead of keeping all information from the score file in memory at the same time.

I will provide a better alternative to the 'load_scores' function, written as a generator function that does not store the whole score file in memory.
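
To illustrate the idea, here is a rough sketch of what such a generator could look like. It is not the actual bob.measure code: the function names `iter_four_column` and `split_scores` are hypothetical, and I am assuming a whitespace-separated four-column layout (claimed ID, real ID, label, score) where a line is a positive when the claimed and real IDs match.

```python
from array import array


def iter_four_column(lines):
    """Yield (claimed_id, real_id, label, score) for one line at a time.

    `lines` is any iterable of strings, e.g. an open text file.
    Assumed layout (hypothetical): claimed_id real_id label score
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(
                "line %d: expected 4 columns, got %d" % (line_number, len(fields))
            )
        claimed_id, real_id, label, score = fields
        yield claimed_id, real_id, label, float(score)


def split_scores(lines):
    """Return (negatives, positives) as compact arrays of doubles.

    Only the score values are kept; IDs and labels are dropped as soon
    as each line has been classified.
    """
    negatives = array("d")
    positives = array("d")
    for claimed_id, real_id, _label, score in iter_four_column(lines):
        if claimed_id == real_id:
            positives.append(score)
        else:
            negatives.append(score)
    return negatives, positives
```

Used like `with open(filename) as f: negatives, positives = split_scores(f)`, this only ever holds one parsed line plus the two score arrays in memory, so the memory use no longer depends on how long the client IDs and labels are.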