Commit 7bada4ca authored by Hannah MUCKENHIRN

Update README.md

parent e88c2210
The training set is split into two subsets: the training subset and the validation subset.
You first need to create six compressed files: three files for the training subset and three files for the validation subset. In both cases, the three files contain, respectively, the audio data, the speaker labels and the voice activity detection (VAD) labels. The VAD labels are not mandatory, but it was observed that they improve the performance of the system. The VAD labels were computed separately with the bob framework and are provided in the folder `files`.
The six compressed files (or four files if you do not want to use the VAD labels) are created with the script `generateFilesSpeakerIdentification.lua`, which is run as follows:
```bash
th generateFilesSpeakerIdentification.lua -train files/world/world_train -valid files/world/world_valid -trainVAD files/world/world_train_VAD -validVAD files/world/world_valid_VAD -folderDatabase <folder_database> -output <folder_compressed_data_train>
```
The only two things you need to specify are:
- argument `-folderDatabase`: the path to the folder containing the Voxforge data.
- argument `-output`: the path to the folder where the compressed files will be written.
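
Once the six compressed files are generated, you can optionally sanity-check them before training. The sketch below assumes the `.bin` files are serialized Torch objects readable with `torch.load` (the exact structure is defined by `generateFilesSpeakerIdentification.lua`), so it is an illustration rather than part of the provided scripts:

```lua
-- Hedged sanity check: assumes the generated .bin files are serialized Torch
-- objects; the exact structure is defined by generateFilesSpeakerIdentification.lua.
require 'torch'

local folder = '<folder_compressed_data_train>'          -- same placeholder as above
local data   = torch.load(folder .. '/train_wav.bin')    -- audio data
local labels = torch.load(folder .. '/train_label.bin')  -- speaker labels

print(data)    -- inspect the type and size of the audio data
print(labels)  -- inspect the type and size of the speaker labels
```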
### 2. Train the CNN
The training is done with the script `trainSpeakerIdentification.lua`:
```bash
th trainSpeakerIdentification.lua -trainData <folder_compressed_data_train>/train_wav.bin -trainLabel <folder_compressed_data_train>/train_label.bin -trainVAD <folder_compressed_data_train>/train_VAD.bin -validData <folder_compressed_data_train>/valid_wav.bin -validLabel <folder_compressed_data_train>/valid_label.bin -validVAD <folder_compressed_data_train>/valid_VAD.bin -save <folder_results_train> -norm seq
```
The results will be saved in the folder `<folder_results_train>`. This folder will contain a log file, a file containing the error on the training set and a file containing the error on the validation set. It will also contain the models trained at each epoch.
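
Once training has finished, you may want to locate the epoch with the lowest validation error programmatically rather than by reading the log by hand. The sketch below is hypothetical: it assumes the validation-error file written to `<folder_results_train>` is plain text with one error value per line (line *i* corresponding to epoch *i*), and the file name `valid_error.log` is invented for illustration; check the files actually produced by `trainSpeakerIdentification.lua` for the real name and format.

```lua
-- Hypothetical helper: assumes the validation-error file is plain text with one
-- error value per line (line i = epoch i). The file name below is invented;
-- use the file actually written to <folder_results_train>.
local bestEpoch, bestErr = 0, math.huge
local epoch = 0
for line in io.lines('<folder_results_train>/valid_error.log') do
  epoch = epoch + 1
  local err = tonumber(line)
  if err and err < bestErr then
    bestErr, bestEpoch = err, epoch
  end
end
print(string.format('lowest validation error %.4f at epoch %d', bestErr, bestEpoch))
```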
You can pass additional arguments when calling the function `trainSpeakerIdentification.lua`.
Both the development and evaluation sets are split into two subsets: the enrollment data and the probe data. One CNN is trained on the enrollment data of each speaker (where samples randomly chosen from the training set are used as the negative samples). Thus, the enrollment data is further split into training and validation data: the training data is used to train the CNN, while the validation data is used for early stopping.
To generate the files for the enrollment data of the development and evaluation sets:
```bash
th generateFilesSpeakerVerification_enroll.lua -speakersTrain files/dev/dev_model_train -speakersValid files/dev/dev_model_valid -negativeTrain files/small_world_for_verif/smallworld_train -negativeValid files/small_world_for_verif/smallworld_valid -speakersTrainVAD files/dev/dev_model_train_VAD -speakersValidVAD files/dev/dev_model_valid_VAD -negativeTrainVAD files/small_world_for_verif/smallworld_train_VAD -negativeValidVAD files/small_world_for_verif/smallworld_valid_VAD -folderData <folder_database> -output <folder_compressed_data_dev>
```
```bash
th generateFilesSpeakerVerification_enroll.lua -speakersTrain files/eval/eval_model_train -speakersValid files/eval/eval_model_valid -negativeTrain files/small_world_for_verif/smallworld_train -negativeValid files/small_world_for_verif/smallworld_valid -speakersTrainVAD files/eval/eval_model_train_VAD -speakersValidVAD files/eval/eval_model_valid_VAD -negativeTrainVAD files/small_world_for_verif/smallworld_train_VAD -negativeValidVAD files/small_world_for_verif/smallworld_valid_VAD -folderData <folder_database> -output <folder_compressed_data_eval>
```
Note that in both cases the only arguments that you need to modify are `-folderData` and `-output`.
To generate the files for the probe data of the development and evaluation sets:
```bash
th generateFilesSpeakerVerification_probe.lua -probe files/dev/dev_probe -probeVAD files/dev/dev_probe_VAD -folderData <folder_database> -output <folder_compressed_data_dev>
```
```bash
th generateFilesSpeakerVerification_probe.lua -probe files/eval/eval_probe -probeVAD files/eval/eval_probe_VAD -folderData <folder_database> -output <folder_compressed_data_eval>
```
Note that in both cases the only arguments that you need to modify are `-folderData` and `-output`.
Development set:
```bash
th trainEachSpeaker.lua -model <path_to_trained_models/model_x.bin> -modelID "dev" -folderData <folder_compressed_data_dev> -VAD -speakersList files/dev/speakers
```
Evaluation set:
```bash
th trainEachSpeaker.lua -model <path_to_trained_models/model_x.bin> -modelID "eval" -folderData <folder_compressed_data_eval> -VAD -speakersList files/eval/speakers
```
The folder `<path_to_trained_models>` corresponds to the folder containing the models obtained during the first step and is a subdirectory of `<folder_results_train>`. The model `<model_x.bin>` should correspond to the best model in the folder `<path_to_trained_models>` obtained during the speaker identification phase, where `x` corresponds to the epoch. To choose which one to use, check in the log file which epoch yields the lowest validation error.
If you do not want to use the VAD labels, remove the argument `-VAD`.
### 3. Evaluate each CNN using the probe data
Development set:
```bash
th forward_probe.lua -folderModels <path_to_trained_models/speakers/dev> -folderData <folder_compressed_data_dev> -VAD -speakersList files/dev/speakers
```
Evaluation set:
```bash
th forward_probe.lua -folderModels <path_to_trained_models/speakers/eval> -folderData <folder_compressed_data_eval> -VAD -speakersList files/eval/speakers
```
The resulting score file will be `<path_to_trained_models/speakers/dev/all_scores/scores>`. The format of the score file is the following:
- the first column corresponds to the claimed identity,
- the second column corresponds to the true identity,
- the third column corresponds to the path to the file evaluated,
The equal error rate and half total error rate were computed with the bob framework.
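
If you want to inspect the scores before computing error rates, the sketch below relies only on the first two columns described above (claimed identity and true identity) and assumes the columns are whitespace-separated; it is an illustration, not part of the provided scripts:

```lua
-- Hedged sketch: count genuine vs impostor trials in the score file by comparing
-- the claimed identity (column 1) with the true identity (column 2).
-- Assumes whitespace-separated columns; adapt the pattern if the separator differs.
local genuine, impostor = 0, 0
for line in io.lines('<path_to_trained_models/speakers/dev/all_scores/scores>') do
  local claimed, trueId = line:match('^(%S+)%s+(%S+)')
  if claimed ~= nil then
    if claimed == trueId then genuine = genuine + 1 else impostor = impostor + 1 end
  end
end
print(genuine .. ' genuine trials and ' .. impostor .. ' impostor trials')
```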
# Models
If you want to use the pre-trained model (the one obtained in the first step, "speaker identification on training set"), you can load the model `model/speaker_identification_model.bin` in Lua as follows:
```lua
-- Minimal sketch: load the serialized Torch model. Depending on how the model
-- was built, additional packages may need to be required first.
require 'torch'
require 'nn'

model = torch.load('model/speaker_identification_model.bin')
print(model)  -- prints the network architecture
```
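
Once loaded, the object can presumably be applied like any Torch `nn` module via `:forward`; the input size below is purely illustrative, and the shape the network actually expects depends on how it was configured during training:

```lua
-- Hedged usage sketch: assumes the loaded object is a standard Torch nn module.
-- The input size is purely illustrative; use the window length the network was
-- actually trained with (see trainSpeakerIdentification.lua).
local input  = torch.Tensor(1, 16000):zero()  -- hypothetical: 1 second of 16 kHz audio
local output = model:forward(input)
print(output:size())
```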