
WIP: added LDA and MLP

Closed Guillaume HEUSCH requested to merge pad-classifiers into master
5 unresolved threads

Hey guys (@amohammadi @onikisins @pkorshunov @ageorge @dgeissbuhler @andre.anjos @sbhatta)

I just added two more classifiers in bob.pad.base.algorithm:

  • MLP, which relies on bob.learn.mlp
  • LDA, which is a derived class of bob.bio.base.algorithm.LDA

I made them as simple as possible... There is still some stuff missing (e.g. docstrings), but I think it may serve as a nice basis to discuss how algorithms in this package should ideally be implemented.
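For reference, here is a minimal sketch of how these could be instantiated in a configuration file (the import path and constructor arguments are assumptions based on this branch and may still change):

    from bob.pad.base.algorithm import MLP, PadLDA

    # MLP-based PAD classifier, backed by bob.learn.mlp
    algorithm = MLP(hidden_units=(10, 10), max_iter=1000)

    # or, the LDA wrapper around bob.bio.base.algorithm.LDA
    # algorithm = PadLDA()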

Let me know what you think!

Activity

  • # build the labels and stack the real and attack features into training matrices
    label_real = numpy.ones((len(training_features[0]), 1), dtype='float64')
    label_attack = numpy.zeros((len(training_features[1]), 1), dtype='float64')

    real = numpy.array(training_features[0])
    attack = numpy.array(training_features[1])
    X = numpy.vstack([real, attack])
    Y = numpy.vstack([label_real, label_attack])

    # The machine
    input_dim = real.shape[1]
    shape = []
    shape.append(input_dim)
    for i in range(len(self.hidden_units)):
      shape.append(self.hidden_units[i])
    shape.append(1)
  •     requires_projector_training=True,
        **kwargs)

      self.hidden_units = hidden_units
      self.max_iter = max_iter
      self.mlp = None


    def train_projector(self, training_features, projector_file):
      """
      Trains the MLP

      **Parameters**

      training_features:
      """
  • self.mlp.output_activation = bob.learn.activation.Logistic()
    self.mlp.randomize()

    # The trainer
    trainer = bob.learn.mlp.BackProp(batch_size,
                                     bob.learn.mlp.CrossEntropyLoss(self.mlp.output_activation),
                                     self.mlp,
                                     train_biases=True)

    n_iter = 0
    previous_cost = 0
    current_cost = 1
    precision = 0.001
    while (n_iter < self.max_iter) or (abs(previous_cost - current_cost) < precision):
      previous_cost = current_cost
      trainer.train(self.mlp, X, Y)
      current_cost = trainer.cost(self.mlp, X, Y)
      n_iter += 1
      print("Iteration {} -> cost = {} (previous = {})".format(n_iter, trainer.cost(self.mlp, X, Y), previous_cost))
  • """

    def __init__(self, hidden_units=(10, 10), max_iter=1000, **kwargs):

      Algorithm.__init__(self,
                         performs_projection=True,
                         requires_projector_training=True,
                         **kwargs)

      self.hidden_units = hidden_units
      self.max_iter = max_iter
      self.mlp = None


    def train_projector(self, training_features, projector_file):
  • #!/usr/bin/env python
    # vim: set fileencoding=utf-8 :

    import numpy
    from bob.bio.base.algorithm import LDA

    class PadLDA(LDA):
      """
      This class is a wrapper for bob.bio.base.algorithm.LDA,
      to be used in a PAD context.

      **Parameters**

      """
  • @heusch, to fix the CI, please go through our guide at https://gitlab.idiap.ch/bob/bob.admin/tree/master/templates#15-conda-recipe

    This looks good, thank you. Although I think you need to add these to our API summary in the docs, and also add tests.

  • Thanks for the feedback @amohammadi!

    I'll address all this when I have some more time. And don't worry: I'm well aware of some of the stuff you mentioned, but as stated, it is Work In Progress ;)

  • @heusch, thank you for adding another classifier! Here are a couple of comments from me.

    • Does the MLP classifier support FrameContainers? That would be very useful in bob.pad.face.

    • This is a matter of personal preference, but I would consider splitting the train_projector method of MLP into functions. A nice quote, from my point of view: "Whenever you can clearly separate tasks within a computation, you should do so." It would help us better reuse useful bits of your code, and override them if necessary. The splitting I can see: 1. data preparation/management, 2. data normalization, 3. actual training, 4. saving the machine/normalization parameters (see the sketch after this comment).

    • Minor comment: I would suggest making precision = 0.001 an argument, rather than a hard-coded constant.

    • Again a matter of preference, but I would suggest using 4 spaces per indentation level, since most people in the group do so, which makes it easier for others to edit.

    Hopefully you will find some of this useful. Thank you!
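    To make the splitting suggestion concrete, here is a minimal sketch of how train_projector could be decomposed (the helper names _prepare_data, _normalize, _train and _save are hypothetical; the training and saving steps would keep using the bob.learn.mlp calls already in this MR):

      import numpy
      from bob.pad.base.algorithm import Algorithm  # assumed base class, as used in this MR

      class MLP(Algorithm):

        def train_projector(self, training_features, projector_file):
          # 1. data preparation/management
          X, Y = self._prepare_data(training_features)
          # 2. data normalization
          X, mean, std = self._normalize(X)
          # 3. actual training
          self._train(X, Y)
          # 4. save the machine and the normalization parameters
          self._save(projector_file, mean, std)

        def _prepare_data(self, training_features):
          # stack real and attack features, and build the corresponding labels
          real = numpy.array(training_features[0])
          attack = numpy.array(training_features[1])
          X = numpy.vstack([real, attack])
          Y = numpy.vstack([numpy.ones((len(real), 1)), numpy.zeros((len(attack), 1))])
          return X, Y

        def _normalize(self, X):
          # zero-mean / unit-variance normalization of the training set
          mean, std = X.mean(axis=0), X.std(axis=0)
          return (X - mean) / (std + 1e-10), mean, std

        def _train(self, X, Y):
          # the bob.learn.mlp machine construction and BackProp loop would go here
          raise NotImplementedError

        def _save(self, projector_file, mean, std):
          # write the trained machine and normalization parameters to projector_file
          raise NotImplementedError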

  • Thanks for the feedback @onikisins

    Here are the answers to the points you raised:

    • MLP and LDA do not support FrameContainers at the moment; I will add the same mechanism that @pkorshunov used in other algorithms, using helper functions.

    • I don't agree on the function-splitting part... Data preparation/normalization is data- and task-dependent, and therefore should not, in my opinion, be part of the classifier. The way I see it, the classifier should receive data in a format as generic as possible (i.e. a numpy.array), train a machine, save it, and that's it (see the sketch after this comment). That being said, I think that specific handling, if needed, should be implemented in a derived class. I'm not a design expert, but it sounds more logical to me. Maybe @amohammadi could provide some guidance on this.

    • Sure, precision will be an argument in the future.

    • I personally prefer 2 spaces ;)
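    To illustrate the design argued for here, a minimal sketch (all class and method names are hypothetical): the base classifier only ever sees plain numpy arrays, and any data-specific conversion lives in a derived class.

      import numpy

      class GenericPadClassifier(object):
        """Hypothetical base class: expects (real, attack) as plain numpy arrays."""

        def train_projector(self, training_features, projector_file):
          X = numpy.vstack([training_features[0], training_features[1]])
          Y = numpy.vstack([numpy.ones((len(training_features[0]), 1)),
                            numpy.zeros((len(training_features[1]), 1))])
          self._train_and_save(X, Y, projector_file)

        def _train_and_save(self, X, Y, projector_file):
          # train whatever machine this class wraps and write it to projector_file
          raise NotImplementedError


      class FrameContainerPadClassifier(GenericPadClassifier):
        """Hypothetical derived class: only handles the data-specific conversion."""

        def train_projector(self, training_features, projector_file):
          # convert FrameContainers (or any other format) to plain numpy arrays,
          # then defer to the generic implementation
          converted = self._convert(training_features)
          super(FrameContainerPadClassifier, self).train_projector(converted, projector_file)

        def _convert(self, training_features):
          raise NotImplementedError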

  • @heusch, I think it would be best to look at the other files in this package and use the same indentation. It would be inconsistent otherwise.

  • Guillaume HEUSCH added 5 commits

    • 287abb39 - [algorithm] fixed precision criterion to stop training
    • 8b6edd8e - [utils] fixed helper function, to avoid dividing by zero
    • f1f528e4 - [algorithm] added two output units to MLP binary classifier
    • c9b2ecc3 - [algorithm] fixed the import to convert and prepare feature, and the single score for one sequence
    • e85eb10b - [algorithm] added my simple implementation of One Class SVM

  • added 1 commit

    • 054414a5 - [algorithm] added a minimal SVM version

  • Guillaume HEUSCH added 5 commits

    • 6542568e - [ocsvm] added nu and gamma parameters
    • e5879661 - [svm] added some debug information
    • 5675f218 - [mysvm] added some debug stuff
    • 9ddf3309 - [utils] added some debug stuff
    • cff21817 - [algorithms] added my own version of one-class GMM

  • Guillaume HEUSCH mentioned in merge request !50 (merged)

  • Hi all,

    Since this branch was way behind master when I got back to it, I created a new one, and I plan to add algorithms and unit tests in a more principled way (i.e. making sure that everything is working at each stage).

    Actually, I worked on this branch in a hurry a while ago, and although there is still useful stuff, it needs some more work. Anyway, I'm closing this MR and will eventually delete this branch.

    The new branch is here: https://gitlab.idiap.ch/bob/bob.pad.base/tree/add-new-classifiers

    and the corresponding MR: !50 (merged)

    @amohammadi Don't worry, I took the remarks you made here into account when working on the new branch

    @onikisins Same remark applies to you, but I'm still questioning the use of FrameContainers at this stage of the toolchain ... I'll open an issue to discuss that.
