Commit 5bd99a6e by Hannah MUCKENHIRN

Modified documentation

parent a58fd91c
......@@ -22,20 +22,37 @@ The experiments require two additional packages: `signal` and `sndfile`
## 1. Create data files
In order to run the experiments, you will need three bin files per data subset: one containing the audio data, one containing the targets (0 and 1s) and one containing the audio sample label (corresponding to the name of each file).
The ASVspoof and AVspoof databases are each split into three data subsets: training, development and evaluation.
In order to run the experiments, you will need four bin files per data subset, each containing respectively:
- the audio data,
- the targets (0 if it is a presentation attack and 1 if it is a genuine access),
- the audio sample label (corresponding to the name of each file),
- the VAD labels.
To create the bin files, you will need two lists for each data subset: one containing the paths to the audio data of the real samples and one containing the paths to the audio data of the attacks. These lists are provided in the folder `file_lists` for the AVspoof and ASVspoof databases. Note that for all files contained in `file_lists/<database_name>/lists`, the path to your local folder containing the audio files should be prepended to each line.
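Adding the local audio folder to every line of a list can be scripted. The sketch below is only an illustration and is not part of the repository: the function name `prepend_root` is hypothetical, and it assumes that each line of a list holds one relative path and that the local folder goes in front of it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository).
# Puts a root folder in front of every non-empty line of a list file.
# Usage: prepend_root <root_folder> <input_list> <output_list>
prepend_root() {
    root=${1%/}    # drop one trailing slash so paths are joined with a single "/"
    # Lines that are empty or whitespace-only are copied unchanged.
    awk -v root="$root" 'NF { print root "/" $0; next } { print }' "$2" > "$3"
}
```

A call such as `prepend_root <local_audio_folder> <list>.txt <list_with_paths>.txt` writes the result to a new file and leaves the provided list untouched.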
The four bin files are created with the following commands, respectively:
- `th saveWav.lua file_lists/<database_name>/lists/audio/<file_name>_real.txt file_lists/<database_name>/lists/<file_name>_attack.txt <output_wav>.bin`
- `th saveLabel.lua file_lists/<database_name>/lists/audio/<file_name>_real.txt file_lists/<database_name>/lists/<file_name>_attack.txt <output_label>.bin`
- `th savePaths.lua file_lists/<database_name>/lists/audio/<file_name>_real.txt file_lists/<database_name>/lists/<file_name>_attack.txt <output_paths>.bin`
- `th saveWav.lua file_lists/<database_name>/lists/audio/<file_name>_real.txt file_lists/<database_name>/VAD/<file_name>_attack.txt <output_VAD>.bin`
For each database, you should have in total 12 bin files, i.e., 4 per data subset (training, development and evaluation subsets).
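Before training, it can help to check that all 12 bin files exist. The following is a minimal sketch under a stated assumption: the files are named `<subset>_<kind>.bin` (for example `train_wav.bin`), which is an illustrative convention chosen here and not one imposed by the scripts. The function name `missing_bins` is hypothetical as well.

```shell
#!/bin/sh
# Hypothetical check (not part of the repository).
# Prints every expected bin file that is missing from a directory.
# Assumes the illustrative naming scheme <subset>_<kind>.bin.
# Usage: missing_bins <directory>
missing_bins() {
    dir=$1
    for subset in train dev eval; do
        for kind in wav label paths VAD; do
            f="$dir/${subset}_${kind}.bin"
            # Report the file only if it does not exist.
            [ -f "$f" ] || echo "$f"
        done
    done
}
```

If the function prints nothing, all 4 files are present for each of the 3 subsets.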
To create these three bin files, you will need two lists for each data subset: one containing the paths to the audio data of the real samples and one containing the paths to the audio data of the attacks.
## 2. Train the CNN
The training is done with the script train.sh.
The training is done with the script train.lua.
When running this script, several parameters of the CNN should be specified:
-arch {cnnMLP, cnnSLP}
-nhu1
-kW1
-dW1
-clNhu
-nf1: number of convolutional filters
-kW1: kernel width
-dW1: kernel shift
-nhu: number of hidden units in the MLP
as well as the name of the bin files previously created:
-trainData
......@@ -43,14 +60,32 @@ as well as the name of the bin files previously created:
-trainVAD
-devData
-devLabel
-devVAD'
-devVAD
and the folder in which the outputs will be saved: -save
This folder will contain a log file, a file containing the error of the training set and a file containing the error of the development set. It will also contain the models trained at each epoch.
An example of the command to run is the following:
`th train.lua -arch cnnMLP -nf1 20 -kW1 300 -dW1 10 -nhu 100 -trainData <train_wav>.bin -trainLabel <train_label>.bin -trainVAD <train_VAD>.bin -devData <dev_wav>.bin -devLabel <dev_label>.bin -devVAD <dev_VAD>.bin -save <output_directory>`
## 3. Forward development and evaluation sets
The forward pass is done with the script forward.lua.
Several parameters need to be specified:
- model, which corresponds to a bin file containing a trained model (in the `<output_directory>` specified during training)
- modelID, which can be any name identifying the model
- data
- label
- path
- VAD
In order to compute the scores on the development and evaluation sets, you will first need to select which saved model to use. This should be done by selecting the one that achieves the lowest error rate on the development set.
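Picking the epoch with the lowest development error can be automated. The layout of the error files is not described here, so the sketch below rests on an assumption: the development error file holds one numeric value per line, with line n corresponding to epoch n. The function name `best_epoch` is hypothetical, and the parsing should be adapted to the actual file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository).
# Prints the 1-based line number that holds the smallest value.
# Assumes one numeric error per line, one line per epoch.
# Usage: best_epoch <dev_error_file>
best_epoch() {
    awk '
        # Skip blank lines; keep the first line that reaches the minimum.
        NF && (best == "" || $1 + 0 < min) { min = $1 + 0; best = NR }
        END { if (best != "") print best }
    ' "$1"
}
```

When several epochs share the lowest error, this sketch keeps the earliest one; that tie-breaking rule is a choice made here, not something the scripts prescribe.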
Computing the scores is done with the script forward.lua. This will create one file, named "scores-dev" or "scores-eval", depending on the -type option.
An example of the command to run is the following:
`th forward.lua -model <model_to_use>.bin -modelID <name_of_the_model> -type dev -data <dev_wav>.bin -label <dev_label>.bin -path <dev_path>.bin -VAD <dev_VAD>.bin`
`th forward.lua -model <model_to_use>.bin -modelID <name_of_the_model> -type eval -data <eval_wav>.bin -label <eval_label>.bin -path <eval_path>.bin -VAD <eval_VAD>.bin`