 
If you use this code and/or its results, please cite the paper.
To install Torch, follow the instructions given [here](http://torch.ch/docs/getting-started.html). The experiments require one additional package, `sndfile`, which you can install with the following command: `luarocks install sndfile`.
# Database
The data used is a subset (300 speakers) of the Voxforge database (English corpus), which is freely available and can be downloaded [here](http://www.voxforge.org/home/downloads).
# Running experiments
## I. First step: train a speaker identification system on the training dataset
### 1. Create .bin files
The training set is split into two subsets: a training subset and a validation subset (90% and 10% of the data, respectively). The convolutional neural network (CNN) is trained on the training subset, while a validation error is computed on the validation subset to check that the CNN is not overfitting.
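As a purely illustrative sketch (in Python, with made-up names — not the repository's Lua code), the 90/10 partition could look like:

```python
import random

def split_train_valid(samples, valid_fraction=0.1, seed=0):
    """Shuffle the samples, then hold out `valid_fraction` of them
    (10% by default) as the validation subset."""
    rng = random.Random(seed)
    shuffled = samples[:]          # do not mutate the caller's list
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

train, valid = split_train_valid(list(range(100)))
print(len(train), len(valid))  # 90 10
```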
The only two things you need to specify are:
- argument `-output`: the path to the folder in which you want to save the compressed files (the resulting data takes around 7 GB of space).
### 2. Train the CNN
The training is done with the script `trainSpeakerIdentification.lua`:
`th trainSpeakerIdentification.lua -trainData <folder_compressed_data_train>/train_wav.bin -trainLabel <folder_compressed_data_train>/train_label.bin -trainVAD <folder_compressed_data_train>/train_VAD.bin -validData <folder_compressed_data_train>/valid_wav.bin -validLabel <folder_compressed_data_train>/valid_label.bin -validVAD <folder_compressed_data_train>/valid_VAD.bin -save <folder_results_train> -norm seq`
The argument `-norm` can have 3 values: "seq", "win" or "dset".
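The exact definitions of the normalization modes are in `trainSpeakerIdentification.lua`; purely as an illustration (Python, hypothetical helper name), a per-sequence zero-mean/unit-variance normalization in the spirit of "seq" could be:

```python
def normalize_sequence(x):
    """Zero-mean, unit-variance normalization computed over a single
    sequence (illustrative only; see trainSpeakerIdentification.lua
    for the actual 'seq', 'win' and 'dset' definitions)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    std = var ** 0.5 if var > 0 else 1.0  # guard against constant input
    return [(v - mean) / std for v in x]
```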
You can pass additional arguments when calling the function `trainSpeakerIdentification.lua` to modify the value of the learning rate, the number of iterations and the hyperparameters of the CNN. You can access the list of arguments by typing `th trainSpeakerIdentification.lua --help`.
## II. Second step: train one CNN for each speaker in the development and evaluation sets
### 1. Generate .bin files
Both the development and evaluation sets are split into two subsets: the enrollment data and the probe data. One CNN is trained on the enrollment data of each speaker (with samples randomly chosen from the training set used as the negative samples). The enrollment data is further split into training and validation data: the training data is used to train the CNN, while the validation data is used for early stopping.
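As a hedged sketch of this setup (Python, invented names — the repository does this in its Lua data-generation scripts), building one speaker's labelled enrollment set might look like:

```python
import random

def build_enrollment_set(speaker_samples, training_pool, n_negatives, seed=0):
    """Label the target speaker's enrollment samples 1 and add
    `n_negatives` samples drawn at random from the training set as
    the negative class (label 0)."""
    rng = random.Random(seed)
    negatives = rng.sample(training_pool, n_negatives)
    labelled = [(s, 1) for s in speaker_samples] + [(s, 0) for s in negatives]
    rng.shuffle(labelled)
    return labelled
```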
To generate the files for the probe data of the development and evaluation sets:
Note that in both cases the only arguments that you need to modify are: `-folderData` and `<output>`.
### 2. Train one CNN per speaker
Development set:
`th trainEachSpeaker.lua -model </path_to_trained_models/model_x.bin> -modelID "dev" -folderData <folder_compressed_data_dev> -VAD -speakersList files/dev/speakers`
Evaluation set:
`th trainEachSpeaker.lua -model </path_to_trained_models/model_x.bin> -modelID "eval" -folderData <folder_compressed_data_eval> -VAD -speakersList files/eval/speakers`
The folder `/path_to_trained_models/` corresponds to the folder containing the models obtained during the first step and is a subdirectory of `<folder_results_train>`. The model `<model_x.bin>` should be the best model in the folder `</path_to_trained_models>` obtained during the speaker identification phase, where x is the epoch number. To choose which one to use, check in the logfile which epoch yields the lowest validation error.
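Picking the model is a matter of scanning the logfile for the smallest validation error. A minimal sketch (Python; the assumed line format `epoch <x> valid error <e>` is hypothetical — adapt the parsing to the actual logfile):

```python
def best_epoch(log_lines):
    """Return the epoch whose line reports the lowest validation
    error, assuming lines like 'epoch 3 valid error 0.12'."""
    best = None
    for line in log_lines:
        parts = line.split()
        epoch, err = int(parts[1]), float(parts[-1])
        if best is None or err < best[1]:
            best = (epoch, err)
    return best[0]
```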
If you do not want to use the VAD labels, remove the argument `-VAD`.
### 3. Evaluate each CNN using the probe data
Development set:
`th forward_probe.lua -folderModels </path_to_trained_models/speakers/dev> -folderData <folder_compressed_data_dev> -VAD -speakersList files/dev/speakers`
The equal error rate and half total error rate were computed with the bob library.
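For reference, the equal error rate is the operating point where the false-acceptance and false-rejection rates coincide. A minimal threshold-sweep sketch (a Python stand-in for illustration, not bob's implementation):

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep a decision threshold over all observed scores and return
    the (FAR + FRR) / 2 value at the threshold where the two rates
    are closest -- a simple approximation of the EER."""
    best_gap, best_eer = None, None
    for t in sorted(genuine_scores + impostor_scores):
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```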
If you want to use the pre-trained model (the one obtained in the first step, "speaker identification on training set"), you can just load the model in `model/speaker_identification_model.bin` as follows:
`require 'nn'`
`model = torch.load("model/speaker_identification_model.bin")`
`net = model[1]`
`params = model[2]`
# Other Remarks
function saveWav(input, output)
local i=0
for line in io.lines(input) do
filename = strsplit(line)[1]
table.insert(data_filenames,path.join(params.folderDatabase,filename));
i=i+1
end
params=model[2]
function trainCNN(trainData, trainLabel, trainVAD, validData, validLabel, validVAD,speakerID)
dirname,model,ext=string.match(params2.model, "(.-)([^\\/]-%.?([^%.\\/]*))$")
dirname = dirname .. "speakers/"..params2.modelID .. "/" .. speakerID
os.execute("mkdir -p " .. dirname);
cmd:log(dirname .. "/logfile",params);
params=cmd:parse(arg);
if params.save ~= "" then
dirname=""
for k,v in pairs(params) do
if v~=0 and v~="" and k~="save" and k~="trainData" and k~="trainLabel" and k~="validData" and k~="validLabel" and k~="trainVAD" and k~="validVAD" then