Commit 439a7371 authored by Amir MOHAMMADI

Merge branch 'review_kaldi' into 'master'

Add documentation

See merge request !1
parents e011b90d 82d71e40
Pipeline #8781 passed with stages
in 10 minutes and 38 seconds
include COPYING README.rst buildout.cfg develop.cfg version.txt
include LICENSE README.rst buildout.cfg develop.cfg version.txt
recursive-include doc conf.py *.rst
recursive-include bob *.wav *.txt *.npy *.ivector *.ie
recursive-include bob/kaldi/test/data *.wav *.txt *.npy *.ivector *.ie
@@ -2,27 +2,41 @@
.. Milos Cernak <milos.cernak@idiap.ch>
.. Tue Apr 4 15:28:26 CEST 2017
.. image:: http://img.shields.io/badge/docs-stable-yellow.svg
:target: http://pythonhosted.org/bob.kaldi/index.html
.. image:: http://img.shields.io/badge/docs-latest-orange.svg
:target: https://www.idiap.ch/software/bob/docs/latest/bob/bob.kaldi/master/index.html
.. image:: https://gitlab.idiap.ch/bob/bob.kaldi/badges/master/build.svg
:target: https://gitlab.idiap.ch/bob/bob.kaldi/commits/master
.. image:: https://img.shields.io/badge/gitlab-project-0000c0.svg
:target: https://gitlab.idiap.ch/bob/bob.kaldi
.. image:: http://img.shields.io/pypi/v/bob.kaldi.svg
:target: https://pypi.python.org/pypi/bob.kaldi
.. image:: http://img.shields.io/pypi/dm/bob.kaldi.svg
:target: https://pypi.python.org/pypi/bob.kaldi
===========================
Python Bindings for Kaldi
===========================
This package provides pythonic bindings for Kaldi_ functionality so it can be
seemlessly integrated with Python-based workflows.
seamlessly integrated with Python-based workflows. It is part of the
signal-processing and machine learning toolbox Bob_.
Installation
------------
To install this package -- alone or together with other `Packages of Bob
<https://github.com/idiap/bob/wiki/Packages>`_ -- please read the `Installation
Instructions <https://github.com/idiap/bob/wiki/Installation>`_. For Bob_ to
be able to work properly, some dependent packages are required to be installed.
Please make sure that you have read the `Dependencies
<https://github.com/idiap/bob/wiki/Dependencies>`_ for your operating system.
This package depends on both Bob_ and Kaldi_. To install Bob_, follow our
installation_ instructions. Kaldi_ is also bundled in our conda channels, which
means you can easily install Kaldi_ using conda too. After you have installed
Bob_, please follow these instructions to install Kaldi_ as well.
This package also requires that Kaldi_ is properly installed alongside the
Python interpreter you're using, under the directory ``<PREFIX>/lib/kaldi``,
along with all necessary scripts and compiled binaries.
# BOB_ENVIRONMENT is the name of your conda environment.
$ source activate BOB_ENVIRONMENT
$ conda install kaldi
$ pip install bob.kaldi
Documentation
@@ -31,9 +45,18 @@ Documentation
For further documentation on this package, please read the `Stable Version
<http://pythonhosted.org/bob.kaldi/index.html>`_ or the `Latest Version
<https://www.idiap.ch/software/bob/docs/latest/bioidiap/bob.kaldi/master/index.html>`_
of the documentation. For a list of tutorials on this or the other packages ob
of the documentation. For a list of tutorials on this or the other packages of
Bob_, or information on submitting issues, asking questions and starting
discussions, please visit its website.
Contact
-------
For questions or to report issues with this software package, contact our
development `mailing list`_.
.. _bob: https://www.idiap.ch/software/bob
.. _kaldi: http://kaldi-asr.org/
.. _mailing list: https://www.idiap.ch/software/bob/discuss
.. _installation: https://www.idiap.ch/software/bob/install
@@ -11,16 +11,14 @@ from .ivector import plda_train
from .ivector import plda_enroll
from .ivector import plda_score
from . import test
def get_config():
"""Returns a string containing the configuration information.
"""
"""Returns a string containing the configuration information.
"""
import bob.extension
return bob.extension.get_config(__name__)
import bob.extension
return bob.extension.get_config(__name__)
# gets sphinx autodoc done right - don't remove it
__all__ = [_ for _ in dir() if not _.startswith('_')]
__all__ = [_ for _ in dir() if not _.startswith('_')]
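The `__all__` line above collects every name defined so far in the module that does not start with an underscore. A minimal sketch of the same idiom on a throwaway module follows; `demo`, `public_helper` and `_private_helper` are hypothetical names used only for illustration and do not exist in bob.kaldi.

```python
import types

# Build a throwaway module so the idiom can be shown outside a package.
# All names below are hypothetical; none of them exist in bob.kaldi.
demo = types.ModuleType("demo")
source = '''
def public_helper():
    return "visible"

def _private_helper():
    return "hidden"

# Same idiom as in the package __init__: keep names without a leading underscore.
__all__ = [_ for _ in dir() if not _.startswith('_')]
'''
exec(source, demo.__dict__)

# Only the public function is exported; dunder names and _private_helper are skipped.
print(demo.__all__)
```

Because `dir()` is evaluated when the line runs, the statement has to stay at the end of the module, after all imports and definitions.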
@@ -8,36 +8,38 @@
import pkg_resources
import numpy as np
import bob.io.audio
import io
import bob.kaldi
def test_mfcc():
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(__name__, 'data/sample16k-mfcc.txt')
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(
__name__, 'data/sample16k-mfcc.txt')
data = bob.io.audio.reader(sample)
data = bob.io.audio.reader(sample)
ours = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
theirs = np.loadtxt(reference)
ours = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
theirs = np.loadtxt(reference)
assert ours.shape == theirs.shape
assert ours.shape == theirs.shape
assert np.allclose(ours, theirs, 1e-03, 1e-05)
assert np.allclose(ours, theirs, 1e-03, 1e-05)
def test_mfcc_from_path():
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(__name__, 'data/sample16k-mfcc.txt')
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(
__name__, 'data/sample16k-mfcc.txt')
ours = bob.kaldi.mfcc_from_path(sample)
theirs = np.loadtxt(reference)
ours = bob.kaldi.mfcc_from_path(sample)
theirs = np.loadtxt(reference)
assert ours.shape == theirs.shape
assert ours.shape == theirs.shape
assert np.allclose(ours, theirs, 1e-03, 1e-05)
assert np.allclose(ours, theirs, 1e-03, 1e-05)
# def test_compute_vad():
@@ -55,7 +57,8 @@ def test_mfcc_from_path():
# segsref.append([ start, end ])
# segsref = np.array(segsref, dtype='int32')
# feats = [mat for name,mat in io.read_mat_ark( pkg_resources.resource_filename(__name__,'data/sample16k.ark') )][0]
# feats = [mat for name,mat in io.read_mat_ark(
# pkg_resources.resource_filename(__name__,'data/sample16k.ark') )][0]
# segs = bob.kaldi.compute_vad(feats)
@@ -2,7 +2,7 @@
#
# Milos Cernak <milos.cernak@idiap.ch>
# March 1, 2017
#
#
'''Tests for Kaldi bindings'''
@@ -14,75 +14,77 @@ import os
import bob.kaldi
def test_ubm_train():
temp_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
temp_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_file, num_gauss=2,
num_gselect=2, num_iters=2)
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
assert os.path.exists(dubm)
assert os.path.exists(dubm)
def test_ubm_full_train():
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect = 2, num_iters = 2)
assert os.path.exists(ubm)
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss=2,
num_gselect=2, num_iters=2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect=2, num_iters=2)
assert os.path.exists(ubm)
def test_ubm_enroll():
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
# Perform MAP adaptation of the GMM
spk_model = bob.kaldi.ubm_enroll(array, dubm)
assert os.path.exists(spk_model)
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss=2,
num_gselect=2, num_iters=2)
# Perform MAP adaptation of the GMM
spk_model = bob.kaldi.ubm_enroll(array, dubm)
assert os.path.exists(spk_model)
def test_gmm_score():
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
# Perform MAP adaptation of the GMM
spk_model = bob.kaldi.ubm_enroll(array, dubm)
# GMM scoring
score = bob.kaldi.gmm_score(array, spk_model, dubm)
assert np.allclose(score, [ 0.28216 ], 1e-03, 1e-05)
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate,
normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss=2,
num_gselect=2, num_iters=2)
# Perform MAP adaptation of the GMM
spk_model = bob.kaldi.ubm_enroll(array, dubm)
# GMM scoring
score = bob.kaldi.gmm_score(array, spk_model, dubm)
assert np.allclose(score, [0.28216], 1e-03, 1e-05)
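The third and fourth positional arguments of `np.allclose` in these tests are the relative and absolute tolerances (`rtol=1e-03`, `atol=1e-05`). As a rough sketch, the element-wise criterion documented for NumPy can be written in plain Python. The helper name `is_close` and the two candidate scores are made up for illustration; only the reference value and the tolerances come from the test above.

```python
def is_close(ours, theirs, rtol, atol):
    """Criterion documented for numpy.isclose: |a - b| <= atol + rtol * |b|."""
    return abs(ours - theirs) <= atol + rtol * abs(theirs)


# Tolerances passed positionally in the tests above.
RTOL, ATOL = 1e-03, 1e-05

# Reference score asserted in test_gmm_score.
reference = 0.28216

# Hypothetical candidate scores, chosen to fall on either side of the tolerance.
inside = is_close(0.28230, reference, RTOL, ATOL)
outside = is_close(0.28300, reference, RTOL, ATOL)
print(inside, outside)
```

Note that the criterion is asymmetric: the relative term scales with the second argument, so the reference value should be passed second, as the tests do.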
# def test_gmm_score_fast():
@@ -2,7 +2,7 @@
#
# Milos Cernak <milos.cernak@idiap.ch>
# March 1, 2017
#
#
'''Tests for Kaldi bindings'''
@@ -14,98 +14,101 @@ import os
import bob.kaldi
def test_ivector_train():
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect = 2, num_iters = 2)
# Train small ivector extractor
ivector = bob.kaldi.ivector_train(array, temp_dubm_file, num_gselect
= 2, ivector_dim = 20, num_iters = 2)
assert os.path.exists(ivector)
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss=2,
num_gselect=2, num_iters=2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect=2, num_iters=2)
# Train small ivector extractor
ivector = bob.kaldi.ivector_train(
array, temp_dubm_file, num_gselect=2, ivector_dim=20, num_iters=2)
assert os.path.exists(ivector)
def test_ivector_extract():
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(__name__, 'data/sample16k.ivector')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss = 2,
num_gselect = 2, num_iters = 2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect = 2, num_iters = 2)
# Train small ivector extractor
ivector = bob.kaldi.ivector_train(array, temp_dubm_file, num_gselect
= 2, ivector_dim = 20, num_iters =
2)
# Extract ivector
ivector_array = bob.kaldi.ivector_extract(array, temp_dubm_file,
num_gselect = 2)
theirs = np.loadtxt(reference)
assert np.allclose(ivector_array, theirs)
temp_dubm_file = bob.io.base.test_utils.temporary_filename()
sample = pkg_resources.resource_filename(__name__, 'data/sample16k.wav')
reference = pkg_resources.resource_filename(
__name__, 'data/sample16k.ivector')
data = bob.io.audio.reader(sample)
# MFCC
array = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
# Train small diagonal GMM
dubm = bob.kaldi.ubm_train(array, temp_dubm_file, num_gauss=2,
num_gselect=2, num_iters=2)
# Train small full GMM
ubm = bob.kaldi.ubm_full_train(array, temp_dubm_file,
num_gselect=2, num_iters=2)
# Train small ivector extractor
ivector = bob.kaldi.ivector_train(
array, temp_dubm_file, num_gselect=2, ivector_dim=20, num_iters=2)
# Extract ivector
ivector_array = bob.kaldi.ivector_extract(array, temp_dubm_file,
num_gselect=2)
theirs = np.loadtxt(reference)
assert np.allclose(ivector_array, theirs)
def test_plda_train():
temp_file = bob.io.base.test_utils.temporary_filename()
features = pkg_resources.resource_filename(__name__, 'data/feats-mobio.npy')
temp_file = bob.io.base.test_utils.temporary_filename()
features = pkg_resources.resource_filename(
__name__, 'data/feats-mobio.npy')
feats = np.load(features)
# Train PLDA
plda = bob.kaldi.plda_train(feats, temp_file)
feats = np.load(features)
assert os.path.exists(temp_file + '.plda')
assert os.path.exists(temp_file + '.plda.mean')
# Train PLDA
plda = bob.kaldi.plda_train(feats, temp_file)
def test_plda_enroll():
assert os.path.exists(temp_file + '.plda')
assert os.path.exists(temp_file + '.plda.mean')
temp_file = bob.io.base.test_utils.temporary_filename()
features = pkg_resources.resource_filename(__name__, 'data/feats-mobio.npy')
feats = np.load(features)
# Enroll PLDA
plda = bob.kaldi.plda_enroll(feats, temp_file)
def test_plda_enroll():
assert os.path.exists(plda)
temp_file = bob.io.base.test_utils.temporary_filename()
features = pkg_resources.resource_filename(
__name__, 'data/feats-mobio.npy')
feats = np.load(features)
def test_plda_score():
# Enroll PLDA
plda = bob.kaldi.plda_enroll(feats, temp_file)
temp_file = bob.io.base.test_utils.temporary_filename()
test_file = pkg_resources.resource_filename(__name__, 'data/test-mobio.ivector')
features = pkg_resources.resource_filename(__name__, 'data/feats-mobio.npy')
assert os.path.exists(plda)
train_feats = np.load(features)
test_feats = np.loadtxt(test_file)
# Train PLDA
plda = bob.kaldi.plda_train(train_feats, temp_file)
# Enroll PLDA (average speaker)
enrolled = bob.kaldi.plda_enroll(train_feats[0], temp_file)
# Score PLDA
score = bob.kaldi.plda_score(test_feats, enrolled, temp_file)
def test_plda_score():
assert np.allclose(score, [ -23.9922 ], 1e-03, 1e-05)
temp_file = bob.io.base.test_utils.temporary_filename()
test_file = pkg_resources.resource_filename(
__name__, 'data/test-mobio.ivector')
features = pkg_resources.resource_filename(
__name__, 'data/feats-mobio.npy')
train_feats = np.load(features)
test_feats = np.loadtxt(test_file)
# Train PLDA
plda = bob.kaldi.plda_train(train_feats, temp_file)
# Enroll PLDA (average speaker)
enrolled = bob.kaldi.plda_enroll(train_feats[0], temp_file)
# Score PLDA
score = bob.kaldi.plda_score(test_feats, enrolled, temp_file)
assert np.allclose(score, [-23.9922], 1e-03, 1e-05)
.. py:currentmodule:: bob.kaldi
.. testsetup:: *
from __future__ import print_function
import pkg_resources
import bob.kaldi
import bob.io.audio
import tempfile
import os
=======================
Using Kaldi in Python
=======================
MFCC Extraction
---------------
Two functions extract MFCC features: :py:func:`bob.kaldi.mfcc` and
:py:func:`bob.kaldi.mfcc_from_path`. The former accepts the speech samples as
a :obj:`numpy.ndarray`, whereas the latter accepts the filename as a
:obj:`str`:
1. :py:func:`bob.kaldi.mfcc`
.. doctest::
>>> sample = pkg_resources.resource_filename('bob.kaldi', 'test/data/sample16k.wav')
>>> data = bob.io.audio.reader(sample)
>>> feat = bob.kaldi.mfcc(data.load()[0], data.rate, normalization=False)
>>> print (feat.shape)
(317, 39)
2. :py:func:`bob.kaldi.mfcc_from_path`
.. doctest::
>>> feat = bob.kaldi.mfcc_from_path(sample)
>>> print (feat.shape)
(317, 39)
UBM training and evaluation
---------------------------
Both diagonal and full covariance Universal Background Models (UBMs)
are supported, and speakers can be enrolled and scored:
.. doctest::
>>> # Train small diagonal GMM
>>> projector = tempfile.NamedTemporaryFile()
>>> dubm = bob.kaldi.ubm_train(feat, projector.name, num_gauss=2, num_gselect=2, num_iters=2)
>>> # Train small full GMM
>>> ubm = bob.kaldi.ubm_full_train(feat, projector.name, num_gselect=2, num_iters=2)
>>> # Enrollment - MAP adaptation of the UBM-GMM
>>> spk_model = bob.kaldi.ubm_enroll(feat, dubm)
>>> # GMM scoring
>>> score = bob.kaldi.gmm_score(feat, spk_model, dubm)
>>> print ('%.3f' % score)
0.282
>>> os.unlink(projector.name + '.dubm')
>>> os.unlink(projector.name + '.fubm')
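The two `os.unlink` calls are needed because the temporary file name is only used as a prefix: the models are written to sibling files such as `projector.name + '.dubm'`, which `tempfile` does not track. A minimal sketch of that cleanup pattern using only the standard library follows; the `open(...).write(...)` call is a stand-in for a training call and is not part of the bob.kaldi API.

```python
import os
import tempfile

# NamedTemporaryFile only removes the file it created itself. Files written
# next to it (here with the '.dubm' suffix used in the doctest above) have to
# be removed explicitly.
with tempfile.NamedTemporaryFile() as projector:
    derived = projector.name + '.dubm'
    try:
        # Stand-in for a training call that writes its model to `derived`.
        with open(derived, 'wb') as handle:
            handle.write(b'model placeholder')
        created = os.path.exists(derived)
    finally:
        # Remove the derived file even if the step above raised.
        if os.path.exists(derived):
            os.unlink(derived)

print(created, os.path.exists(derived), os.path.exists(projector.name))
```

Wrapping the cleanup in `try`/`finally` keeps the work directory clean when a training step fails partway through.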
The following guide describes how to run complete speaker recognition experiments:
1. To run the UBM-GMM with MAP adaptation speaker recognition experiment, run:
.. code-block:: sh
verify.py -d 'mobio-audio-male' -p 'energy-2gauss' -e 'mfcc-kaldi' -a 'gmm-kaldi' -s exp-gmm-kaldi --groups {dev,eval} -R '/your/work/directory/' -T '/your/temp/directory' -vv
2. To run the ivector+plda speaker recognition experiment, run:
.. code-block:: sh
verify.py -d 'mobio-audio-male' -p 'energy-2gauss' -e 'mfcc-kaldi' -a 'ivector-plda-kaldi' -s exp-ivector-plda-kaldi --groups {dev,eval} -R '/your/work/directory/' -T '/your/temp/directory' -vv
3. Results:
+---------------------------------------------------+--------+--------+
| Experiment description                            | EER    | HTER   |
+---------------------------------------------------+--------+--------+
| -e 'mfcc-kaldi', -a 'gmm-kaldi', 100GMM           | 18.53% | 14.52% |
+---------------------------------------------------+--------+--------+
| -e 'mfcc-kaldi', -a 'gmm-kaldi', 512GMM           | 17.51% | 12.44% |
+---------------------------------------------------+--------+--------+
| -e 'mfcc-kaldi', -a 'ivector-plda-kaldi', 64GMM   | 12.26% | 11.97% |
+---------------------------------------------------+--------+--------+
| -e 'mfcc-kaldi', -a 'ivector-plda-kaldi', 256GMM  | 11.35% | 11.46% |
+---------------------------------------------------+--------+--------+
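The two columns of the results table can be read with the following sketch. It assumes the common convention, not stated in this guide, that the EER threshold is chosen on the ``dev`` group and the HTER is then measured on the ``eval`` group at that threshold. The helper names and all scores are made up for illustration and do not come from any experiment.

```python
from __future__ import division


def far_frr(impostors, genuines, threshold):
    """False acceptance and false rejection rates at a given threshold."""
    far = sum(s >= threshold for s in impostors) / len(impostors)
    frr = sum(s < threshold for s in genuines) / len(genuines)
    return far, frr


def eer_threshold(impostors, genuines):
    """Candidate threshold where FAR and FRR are closest to each other."""
    def gap(threshold):
        far, frr = far_frr(impostors, genuines, threshold)
        return abs(far - frr)
    return min(sorted(set(impostors) | set(genuines)), key=gap)


# Made-up toy scores, for illustration only.
dev_impostors = [-2.0, -1.0, 0.0, 1.0]
dev_genuines = [0.5, 2.0, 3.0, 4.0]
eval_impostors = [-1.5, -0.5, 1.2, 2.5]
eval_genuines = [0.2, 1.5, 2.2, 3.5]

# Pick the threshold on dev, then apply it unchanged to eval.
threshold = eer_threshold(dev_impostors, dev_genuines)
dev_far, dev_frr = far_frr(dev_impostors, dev_genuines, threshold)
eval_far, eval_frr = far_frr(eval_impostors, eval_genuines, threshold)
hter = (eval_far + eval_frr) / 2

print(threshold, dev_far, dev_frr, hter)
```

Under this convention the HTER can be lower or higher than the EER, since the threshold is fixed before the evaluation scores are seen.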
.. include:: links.rst
@@ -7,14 +7,14 @@
.. _bob.kaldi:
======================
Bob/Kaldi Extensions
======================
=======================
Bob wrapper for Kaldi
=======================
.. todolist::
This module contains information on how to build and maintain |project|
Kaldi_ extensions written in pure Python or a mix of C/C++ and Python.
This package provides a pythonic API for Kaldi_ functionality so it can be
seamlessly integrated with Python-based workflows.
Documentation
-------------
@@ -22,6 +22,7 @@ Documentation
.. toctree::
:maxdepth: 2
guide
py_api
@@ -32,4 +33,5 @@ Indices and tables
* :ref:`modindex`
* :ref:`search`
.. include:: links.rst