Commit 6f1812f8 authored by Elie KHOURY

initial commit

*~
*.swp
*.pyc
.DS_Store
logs
replay
*.db
.installed.cfg
parts
develop-eggs
bin
eggs
downloads
lib
src
sphinx
.project
.pydevproject
.settings
*.egg-info
.mr.developer.cfg
Bob satellite package for speaker recognition protocol using Voxforge Database
==============================================================================
`Voxforge`_ offers a collection of transcribed speech for use with **Free** and **Open Source Speech Recognition Engines**.
In this package, we design a speaker recognition protocol that uses a subset of the **English audio files** (only 6561 files) belonging to **30 speakers**.
This subset is split into three equal parts: Training (10 speakers), Development (10 speakers) and Test (10 speakers) sets.
This package serves as an example of a speaker recognition database when testing `xbob.speaker_recognition`_.
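As an illustration only, a speaker-disjoint split of this shape can be sketched as follows. The speaker identifiers and the ``split_speakers`` helper are hypothetical; the actual assignment of speakers to sets is defined by the file lists shipped with the package, not by this code.

```python
def split_speakers(speaker_ids, n_parts=3):
    """Partition speaker IDs into n_parts equal, non-overlapping groups."""
    ids = sorted(set(speaker_ids))
    if len(ids) % n_parts != 0:
        raise ValueError("cannot split %d speakers into %d equal parts"
                         % (len(ids), n_parts))
    size = len(ids) // n_parts
    # Consecutive slices of the sorted list never share a speaker.
    return [ids[i * size:(i + 1) * size] for i in range(n_parts)]


# Hypothetical speaker IDs; the real ones come from the Voxforge archives.
speakers = ["speaker%02d" % i for i in range(1, 31)]
train, dev, test = split_speakers(speakers)
```

Keeping the three sets disjoint at the speaker level means that no speaker seen during training appears in the development or test trials.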
`xbob.speaker_recognition`_ was developed at Idiap during its participation in the `NIST SRE 2012 evaluation`_. If you use this package and/or its results, please cite the following
publications:
1. The original paper presented at the NIST SRE 2012 workshop::

     @inproceedings{Khoury_NISTSRE_2012,
       author = {Khoury, Elie and El Shafey, Laurent and Marcel, S{\'{e}}bastien},
       month = {dec},
       title = {The Idiap Speaker Recognition Evaluation System at NIST SRE 2012},
       booktitle = {NIST Speaker Recognition Conference},
       year = {2012},
       location = {Orlando, USA},
       organization = {NIST},
       pdf = {http://publications.idiap.ch/downloads/papers/2012/Khoury_NISTSRE_2012.pdf}
     }

2. Bob as the core framework used to run the experiments::

     @inproceedings{Anjos_ACMMM_2012,
       author = {A. Anjos and L. El Shafey and R. Wallace and M. G\"unther and C. McCool and S. Marcel},
       title = {Bob: a free signal processing and machine learning toolbox for researchers},
       year = {2012},
       month = oct,
       booktitle = {20th ACM Conference on Multimedia Systems (ACMMM), Nara, Japan},
       publisher = {ACM Press},
       url = {http://publications.idiap.ch/downloads/papers/2012/Anjos_Bob_ACMMM12.pdf},
     }

Installation
------------
Just download this package and uncompress it locally::

  $ wget http://pypi.python.org/packages/source/x/xbob.subvoxforge/xbob.db.subvoxforge-0.0.1.zip
  $ unzip xbob.db.subvoxforge-0.0.1.zip
  $ cd xbob.db.subvoxforge

Use buildout to bootstrap and have a working environment ready for
experiments::

  $ python bootstrap.py
  $ ./bin/buildout

This also requires that bob (>= 1.2.0) is installed.

Getting the data
~~~~~~~~~~~~~~~~
The data can be downloaded and extracted by running the following script, which takes as input the directory in which the data will be stored::

  $ ./download_and_untar.sh PATH/TO/WAV/DIRECTORY

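The download script skips any archive whose extracted directory already exists in the target location. A minimal Python sketch of that check is given below, to see which archives would still be fetched. The ``missing_archives`` helper and the archive names in the demo are hypothetical; the real archive names are listed in ``xbob/db/subvoxforge/list_of_tgz_files.lst``.

```python
import os
import tempfile


def missing_archives(archive_names, data_dir):
    """Return the archives whose extracted directory is absent from data_dir."""
    missing = []
    for name in archive_names:
        # Same convention as `basename $filename .tgz` in the shell script.
        base = name[:-len(".tgz")] if name.endswith(".tgz") else name
        if not os.path.isdir(os.path.join(data_dir, base)):
            missing.append(name)
    return missing


# Demo in a temporary directory with made-up archive names.
with tempfile.TemporaryDirectory() as data_dir:
    os.mkdir(os.path.join(data_dir, "speakerA-20100101-abc"))
    todo = missing_archives(
        ["speakerA-20100101-abc.tgz", "speakerB-20100102-def.tgz"], data_dir)
```

Because the check is per directory, an interrupted download can be resumed by running the script again with the same target directory.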
.. _Voxforge: http://www.voxforge.org/
.. _xbob.speaker_recognition: https://github.com/bioidiap/xbob.speaker_recognition
.. _NIST SRE 2012 evaluation: http://www.nist.gov/itl/iad/mig/sre12.cfm
##############################################################################
#
# Copyright (c) 2006 Zope Foundation and Contributors.
# All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE.
#
##############################################################################
"""Bootstrap a buildout-based project
Simply run this script in a directory containing a buildout.cfg.
The script accepts buildout command-line options, so you can
use the -c option to specify an alternate configuration file.
"""
import os
import shutil
import sys
import tempfile
from optparse import OptionParser
tmpeggs = tempfile.mkdtemp()
usage = '''\
[DESIRED PYTHON FOR BUILDOUT] bootstrap.py [options]
Bootstraps a buildout-based project.
Simply run this script in a directory containing a buildout.cfg, using the
Python that you want bin/buildout to use.
Note that by using --find-links to point to local resources, you can keep
this script from going over the network.
'''
parser = OptionParser(usage=usage)
parser.add_option("-v", "--version", help="use a specific zc.buildout version")
parser.add_option("-t", "--accept-buildout-test-releases",
                  dest='accept_buildout_test_releases',
                  action="store_true", default=False,
                  help=("Normally, if you do not specify a --version, the "
                        "bootstrap script and buildout gets the newest "
                        "*final* versions of zc.buildout and its recipes and "
                        "extensions for you. If you use this flag, "
                        "bootstrap and buildout will get the newest releases "
                        "even if they are alphas or betas."))
parser.add_option("-c", "--config-file",
                  help=("Specify the path to the buildout configuration "
                        "file to be used."))
parser.add_option("-f", "--find-links",
                  help=("Specify a URL to search for buildout releases"))
options, args = parser.parse_args()
######################################################################
# load/install setuptools
to_reload = False
try:
    import pkg_resources
    import setuptools
except ImportError:
    ez = {}
    try:
        from urllib.request import urlopen
    except ImportError:
        from urllib2 import urlopen
    # XXX use a more permanent ez_setup.py URL when available.
    exec(urlopen('https://bitbucket.org/pypa/setuptools/raw/0.7.2/ez_setup.py'
                 ).read(), ez)
    setup_args = dict(to_dir=tmpeggs, download_delay=0)
    ez['use_setuptools'](**setup_args)
    if to_reload:
        reload(pkg_resources)
    import pkg_resources
    # This does not (always?) update the default working set. We will
    # do it.
    for path in sys.path:
        if path not in pkg_resources.working_set.entries:
            pkg_resources.working_set.add_entry(path)
######################################################################
# Try to best guess the version of buildout given setuptools
if options.version is None:
    try:
        from distutils.version import LooseVersion
        package = pkg_resources.require('setuptools')[0]
        v = LooseVersion(package.version)
        if v < LooseVersion('0.7'):
            options.version = '2.1.1'
    except:
        pass
######################################################################
# Install buildout
ws = pkg_resources.working_set
cmd = [sys.executable, '-c',
       'from setuptools.command.easy_install import main; main()',
       '-mZqNxd', tmpeggs]
find_links = os.environ.get(
    'bootstrap-testing-find-links',
    options.find_links or
    ('http://downloads.buildout.org/'
     if options.accept_buildout_test_releases else None)
    )
if find_links:
    cmd.extend(['-f', find_links])
setuptools_path = ws.find(
    pkg_resources.Requirement.parse('setuptools')).location
requirement = 'zc.buildout'
version = options.version
if version is None and not options.accept_buildout_test_releases:
    # Figure out the most recent final version of zc.buildout.
    import setuptools.package_index
    _final_parts = '*final-', '*final'
    def _final_version(parsed_version):
        for part in parsed_version:
            if (part[:1] == '*') and (part not in _final_parts):
                return False
        return True
    index = setuptools.package_index.PackageIndex(
        search_path=[setuptools_path])
    if find_links:
        index.add_find_links((find_links,))
    req = pkg_resources.Requirement.parse(requirement)
    if index.obtain(req) is not None:
        best = []
        bestv = None
        for dist in index[req.project_name]:
            distv = dist.parsed_version
            if _final_version(distv):
                if bestv is None or distv > bestv:
                    best = [dist]
                    bestv = distv
                elif distv == bestv:
                    best.append(dist)
        if best:
            best.sort()
            version = best[-1].version
if version:
    requirement = '=='.join((requirement, version))
cmd.append(requirement)
import subprocess
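if subprocess.call(cmd, env=dict(os.environ, PYTHONPATH=setuptools_path)) != 0:
    # Interpolate the command into the message instead of passing it as a
    # second exception argument, so the error text is actually readable.
    raise Exception(
        "Failed to execute command:\n%s" % repr(cmd)[1:-1])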
######################################################################
# Import and run buildout
ws.add_entry(tmpeggs)
ws.require(requirement)
import zc.buildout.buildout
if not [a for a in args if '=' not in a]:
    args.append('bootstrap')
# if -c was provided, we push it back into args for buildout's main function
if options.config_file is not None:
    args[0:0] = ['-c', options.config_file]
zc.buildout.buildout.main(args)
shutil.rmtree(tmpeggs)
; vim: set fileencoding=utf-8 :
; Elie Khoury <Elie.Khoury@idiap.ch>
; Thu Aug 22 20:38:47 CEST 2013
[buildout]
parts = scripts
develop = .
eggs = bob>=1.2.0
       xbob.db.verification.filelist
       xbob.db.subvoxforge
newest = false
; Look at this directory to find bob
; prefixes = /idiap/group/torch5spro/releases/bob-1.2.0/install/linux-x86_64-release
[scripts]
recipe = xbob.buildout:scripts
# Elie Khoury <Elie.Khoury@idiap.ch>
# Date: Thu Aug 22 18:17:29 CEST 2013
#
# Copyright (C) 2012-2013 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# This script will download the audio files used in the protocol.
# It will first download the tgz files, and then decompress them.
if [ "$#" -ne 1 ]; then
  echo "Usage: $0 DATA_DIR"
  exit 1
fi
baselink="http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit"
directory="$1"
mkdir -p "$directory"
while read -r filename; do
  basefilename=$(basename "$filename" .tgz)
  echo "$basefilename"
  # Skip archives that have already been downloaded and extracted.
  if [ ! -d "$directory/$basefilename" ]; then
    wget "$baselink/$filename"
    tar -zxvf "$filename"
    mv "$basefilename" "$directory/."
    rm "$filename"
  fi
done < xbob/db/subvoxforge/list_of_tgz_files.lst # where the list of files is stored
from setuptools import setup, find_packages
setup(
    name='xbob.paper.tpami2013',
    version='0.0.1a1',
    description='Example on how to use the scalable implementation of PLDA and how to reproduce experiments of the article',
    url='http://pypi.python.org/pypi/xbob.paper.tpami2013',
    license='GPLv3',
    author='Laurent El Shafey',
    author_email='Laurent.El-Shafey@idiap.ch',
    long_description=open('README.rst').read(),
    packages=find_packages(),
    include_package_data=True,
    install_requires=[
        'setuptools',
        'xbob.db.verification.filelist',
    ],
    namespace_packages=[
        'xbob',
        'xbob.db',
    ],
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Education',
        'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
        'Natural Language :: English',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',
    ],
)
#see http://peak.telecommunity.com/DevCenter/setuptools#namespace-packages
__import__('pkg_resources').declare_namespace(__name__)
#see http://peak.telecommunity.com/DevCenter/setuptools#namespace-packages
__import__('pkg_resources').declare_namespace(__name__)
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# @author: Elie Khoury <Elie.Khoury@idiap.ch>
# @date: Thu Aug 22 17:43:04 CEST 2013
#
# Copyright (C) 2012-2013 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""
Details about the Voxforge database can be found here:
http://www.voxforge.org/
"""
from .query import Database
__all__ = dir()
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# @author: Elie Khoury <Elie.Khoury@idiap.ch>
# @date: Thu Aug 22 17:49:19 CEST 2013
#
# Copyright (C) 2012-2013 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import xbob.db.verification.filelist
class Database(xbob.db.verification.filelist.Database):
    """Wrapper class for the subVoxforge database for speaker recognition (http://www.voxforge.org/).
    This class defines a simple protocol for training, development and test by splitting the audio files of the database into three main parts.
    """
    def __init__(self):
        # call base class constructor
        xbob.db.verification.filelist.Database.__init__(self, 'lists')