Commit 389736c9 authored by Laurent EL SHAFEY

Initial port of the ATNT database as a satellite package for bob

*~
*.swp
*.pyc
bin
eggs
parts
.installed.cfg
.mr.developer.cfg
*.egg-info
src
develop-eggs
dist
sphinx
include README.rst
recursive-include docs *.py *.rst
recursive-include xbob *.txt
===================
ATNT/ORL Database
===================
The ATNT/ORL database is a database of faces.
The actual raw data for the database should be downloaded from the original
URL. This package only contains the `Bob <http://www.idiap.ch/software/bob/>`_
accessor methods to use the DB directly from python, with our certified
protocols.
You would normally not install this package unless you are maintaining it.
Instead, you would declare it as a dependency of the package that needs to
**use** it. There are a few ways to achieve this:
1. You can add this package as a requirement at the ``setup.py`` for your own
`satellite package
<https://github.com/idiap/bob/wiki/Virtual-Work-Environments-with-Buildout>`_
or to your Buildout ``.cfg`` file, if you prefer it that way. With this
method, this package gets automatically downloaded and installed on your
working environment, or
2. You can manually download and install this package using commands like
``easy_install`` or ``pip``.
The package is available in two different distribution formats:
a. You can download it from `PyPI <http://pypi.python.org/pypi>`_, or
b. You can download it in its source form from `its git repository
<https://github.com/bioidiap/xbob.db.atnt>`_. When you download the
version at the git repository, you will need to run a command to recreate
the backend SQLite file required for its operation. This means that the
database raw files must be installed somewhere in this case. With option
``a`` you can run in `dummy` mode and only download the raw data files for
the database once you are happy with your setup.
You can mix and match points 1/2 and a/b above based on your requirements. Here
are some examples:
Modify your setup.py and download from PyPI
===========================================
That is the easiest. Edit your ``setup.py`` in your satellite package and add
the following entry in the ``install_requires`` section (note: ``...`` means
`whatever extra stuff you may have in-between`, don't put that on your
script)::
install_requires=[
...
"xbob.db.atnt",
],
Proceed normally with your ``bootstrap/buildout`` steps and you should be all
set. That means you can now import the namespace ``xbob.db.atnt`` into your scripts.
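Once the package is installed, a script could query the database roughly as
sketched below. This is only an illustration based on the ``Database`` class
shipped in this commit: the helper ``load_atnt_database`` and the directory
``/path/to/atnt`` are hypothetical, and the ``.pgm`` extension is an assumption
about how the raw images are stored on your disk.

```python
import importlib


def load_atnt_database():
    """Return an xbob.db.atnt Database instance, or None if the package is missing.

    Hypothetical helper, not part of the package itself.
    """
    try:
        module = importlib.import_module("xbob.db.atnt")
    except ImportError:
        return None
    return module.Database()


db = load_atnt_database()
if db is None:
    print("xbob.db.atnt is not installed in this environment")
else:
    # Database.files() returns a dictionary mapping file ids to paths.
    # "/path/to/atnt" is a placeholder for wherever the raw data lives.
    paths = db.files(directory="/path/to/atnt", extension=".pgm",
                     groups="dev", purposes="enrol")
    print(len(paths), "enrolment files in the 'dev' group")
```

The import is wrapped in a ``try``/``except`` so the script degrades gracefully
when the dependency has not been pulled in by buildout yet.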
Modify your buildout.cfg and download from git
==============================================
You will need to add a dependency on `mr.developer
<http://pypi.python.org/pypi/mr.developer/>`_ to be able to install from our
git repositories. Your ``buildout.cfg`` file should contain the following
lines::
[buildout]
...
extensions = mr.developer
auto-checkout = *
eggs = bob
...
xbob.db.atnt
[sources]
xbob.db.atnt = git https://github.com/bioidiap/xbob.db.atnt.git
...
; vim: set fileencoding=utf-8 :
; Andre Anjos <andre.anjos@idiap.ch>
; Mon 16 Apr 08:29:18 2012 CEST
[buildout]
parts = external tests sphinx python
develop = .
versions = versions
eggs = bob
xbob.db.atnt
[versions]
;If you would like to pin-down the recipes package version so you are not
;bothered with eventual updates, do it here. Note that, by pinning the version
;of the package, you will be also excluded from bug fixes.
;xbob.buildout = 0.1
[external]
recipe = xbob.buildout:external
egg-directories = /idiap/group/torch5spro/nightlies/last/install/linux-x86_64-release/lib
[tests]
recipe = xbob.buildout:nose
eggs = ${buildout:eggs}
script = tests.py
[sphinx]
recipe = xbob.buildout:sphinx
eggs = ${buildout:eggs}
source = ${buildout:directory}/docs
build = ${buildout:directory}/sphinx
[python]
recipe = z3c.recipe.scripts
eggs = ${buildout:eggs}
interpreter = python
dependent-scripts = true
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# Andre Anjos <andre.anjos@idiap.ch>
# Mon 13 Aug 2012 12:38:15 CEST
#
# Copyright (C) 2011-2012 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import sys, os
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
# -- General configuration -----------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = [
'sphinx.ext.todo',
'sphinx.ext.coverage',
'sphinx.ext.pngmath',
'sphinx.ext.ifconfig',
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.doctest',
'sphinx.ext.intersphinx',
'bob.sphinxext.plot', # ours add source copying to install directory
]
# The viewcode extension appeared only on Sphinx >= 1.0.0
import sphinx
if sphinx.__version__ >= "1.0":
extensions.append('sphinx.ext.viewcode')
# Always includes todos
todo_include_todos = True
# If we are on OSX, the 'dvipng' path may be different
dvipng_osx = '/opt/local/libexec/texlive/binaries/dvipng'
if os.path.exists(dvipng_osx): pngmath_dvipng = dvipng_osx
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = u'AT&T/ORL Database (Bob API)'
import time
copyright = u'%s, Idiap Research Institute' % time.strftime('%Y')
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
from xbob.db.atnt.driver import Interface
version = Interface().version()
# The full version, including alpha/beta/rc tags.
release = version
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['**/links.rst']
# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# -- Options for HTML output ---------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
if sphinx.__version__ >= "1.0":
html_theme = 'nature'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = 'bob'
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
html_logo = ''
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
html_favicon = ''
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'bobdbdoc'
# -- Options for LaTeX output --------------------------------------------------
# The paper size ('letter' or 'a4').
latex_paper_size = 'a4'
# The font size ('10pt', '11pt' or '12pt').
latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
('index', 'bobdbman.tex', u'Bob',
u'Biometrics Group, Idiap Research Institute', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
latex_logo = ''
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Additional stuff for the LaTeX preamble.
#latex_preamble = ''
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# Included after all input documents
rst_epilog = ''
# -- Options for manual page output --------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'bob', u'AT&T/ORL Database (Bob API) Documentation', [u'Idiap Research Institute'], 1)
]
# We want to remove all private (i.e. _. or __.__) members
# that are not in the list of accepted functions
accepted_private_functions = ['__call__']
def member_function_test(app, what, name, obj, skip, options):
# test if we have a private function
if len(name) > 1 and name[0] == '_':
# test if this private function should be allowed
if name not in accepted_private_functions:
# omit private functions that are not in the list of accepted private functions
return True
else:
# test if the method is documented
if not hasattr(obj, '__doc__') or not obj.__doc__:
return True
# Skips selected members in auto-generated documentation. Unfortunately, old
# versions of Boost.Python will not generate a __self__ member for static
# methods and that screws-up Sphinx processing.
if sphinx.__version__ < "1.0":
# We have to remove objects that do not have a __self__ attribute set
import types
if isinstance(obj, types.BuiltinFunctionType) and \
not hasattr(obj, '__self__') and what == 'class':
app.warn("Skipping %s %s (no __self__)" % (what, name))
return True
return False
# Default processing flags for sphinx
autoclass_content = 'both'
autodoc_member_order = 'bysource'
autodoc_default_flags = ['members', 'undoc-members', 'special-members', 'inherited-members', 'show-inheritance']
def setup(app):
app.connect('autodoc-skip-member', member_function_test)
.. vim: set fileencoding=utf-8 :
.. Andre Anjos <andre.anjos@idiap.ch>
.. Mon 13 Aug 2012 12:36:40 CEST
===================
AT&T/ORL Database
===================
.. automodule:: xbob.db.atnt
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# Laurent El Shafey <laurent.el-shafey@idiap.ch>
from setuptools import setup, find_packages
# The only thing we do in this file is to call the setup() function with all
# parameters that define our package.
setup(
name='xbob.db.atnt',
version='1.0.0a1',
description='ATNT/ORL Database Access API for Bob',
url='http://github.com/bioidiap/xbob.db.atnt',
license='GPLv3',
author='Laurent El Shafey',
author_email='laurent.el-shafey@idiap.ch',
long_description=open('README.rst').read(),
# This line is required for any distutils based packaging.
packages=find_packages(),
include_package_data=True,
zip_safe=False,
install_requires=[
'setuptools',
'bob', # base signal proc./machine learning library
],
namespace_packages = [
'xbob',
'xbob.db',
],
entry_points={
# declare database to bob
'bob.db': [
'atnt = xbob.db.atnt.driver:Interface',
],
# declare tests to bob
'bob.test': [
'atnt = xbob.db.atnt.test:ATNTDatabaseTest',
],
},
classifiers = [
'Development Status :: 4 - Beta',
'Intended Audience :: Developers',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Natural Language :: English',
'Programming Language :: Python',
'Topic :: Scientific/Engineering :: Artificial Intelligence',
'Topic :: Database :: Front-Ends',
],
)
#see http://peak.telecommunity.com/DevCenter/setuptools#namespace-packages
__import__('pkg_resources').declare_namespace(__name__)
#see http://peak.telecommunity.com/DevCenter/setuptools#namespace-packages
__import__('pkg_resources').declare_namespace(__name__)
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# @author: Manuel Guenther <Manuel.Guenther@idiap.ch>
# @date: Fri Apr 20 12:04:44 CEST 2012
#
# Copyright (C) 2011-2012 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""
The AT&T "Database of Faces" is a small free facial image database to test face
recognition and verification algorithms on. It is also known by its former name
"The ORL Database of Faces". You can download the AT&T database from:
http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
"""
import os
import sys
import numpy
from bob.db import utils
__all__ = ['Database',]
class Database(object):
"""Wrapper class for the AT&T (aka ORL) database of faces (http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html).
This class defines a simple protocol for training, enrollment and probe by splitting the few images of the database in a reasonable manner."""
def __init__(self):
self.m_groups = ('world', 'dev')
self.m_purposes = ('enrol', 'probe')
self.m_client_ids = set(range(1, 41))
self.m_files = set(range(1, 11))
self.m_training_clients = set([1,2,5,6,10,11,12,14,16,17,20,21,24,26,27,29,33,34,36,39])
self.m_enrol_files = set([2,4,5,7,9])
def dbname(self):
"""Calculates my own name automatically."""
return os.path.basename(os.path.dirname(__file__))
def __check_validity__(self, l, obj, valid, default):
"""Checks validity of user input data against a set of valid values."""
if not l: return default
elif isinstance(l, str) or isinstance(l, int): return self.__check_validity__([l], obj, valid, default)
for k in l:
if k not in valid:
raise RuntimeError('Invalid %s "%s". Valid values are %s, or lists/tuples of those' % (obj, k, valid))
return l
def __make_path__(self, client_id, file_id, directory, extension):
"""Generates the file name for the given client id and file id of the AT&T database."""
stem = os.path.join("s" + str(client_id), str(file_id))
if not extension: extension = ''
if directory: return os.path.join(directory, stem + extension)
return stem + extension
def clients(self, groups = None):
"""Returns the sorted list of ids of the clients belonging to the given groups
Keyword Parameters:
groups
One of the groups 'world', 'dev' or a tuple with both of them (which is the default).
"""
VALID_GROUPS = self.m_groups
groups = self.__check_validity__(groups, "group", VALID_GROUPS, VALID_GROUPS)
ids = set()
if 'world' in groups:
ids |= self.m_training_clients
if 'dev' in groups:
ids |= self.m_client_ids - self.m_training_clients
return list(sorted(ids))
def models(self, groups = None):
"""Returns the sorted list of ids of the models belonging to the given groups
Keyword Parameters:
groups
One of the groups 'world', 'dev' or a tuple with both of them (which is the default).
"""
VALID_GROUPS = self.m_groups
groups = self.__check_validity__(groups, "group", VALID_GROUPS, VALID_GROUPS)
ids = set()
if 'world' in groups:
ids |= self.m_training_clients
if 'dev' in groups:
ids |= self.m_client_ids - self.m_training_clients
return list(sorted(ids))
def get_client_id_from_file_id(self, file_id):
"""Returns the client id from the given image id"""
return (file_id-1) // len(self.m_files) + 1
def files(self, directory=None, extension=None, client_ids=None, groups=None, purposes=None):
"""Returns a dictionary mapping file ids to filenames for the specific query by the user.
Keyword Parameters:
directory
A directory name that will be prepended to the final filepath returned
extension
A filename extension that will be appended to the final filepath returned
client_ids
The ids of the clients whose files need to be retrieved. Should be a list of integral numbers from [1,40]
groups
One of the groups 'world' or 'dev' or a list with both of them (which is the default).
purposes
One of the purposes 'enrol' or 'probe' or a list with both of them (which is the default).
This field is ignored when only the group 'world' is selected.
"""
# check if groups set are valid
VALID_GROUPS = self.m_groups
groups = self.__check_validity__(groups, "group", VALID_GROUPS, VALID_GROUPS)
# collect the ids to retrieve
ids = set(self.clients(groups))
# check the desired client ids for sanity
VALID_IDS = self.m_client_ids
client_ids = set(self.__check_validity__(client_ids, "client id", VALID_IDS, VALID_IDS))
# calculate the intersection between the ids and the desired client ids
ids = ids & client_ids
# check that the purposes are valid
VALID_PURPOSES = self.m_purposes
if 'dev' in groups:
purposes = self.__check_validity__(purposes, "purpose", VALID_PURPOSES, VALID_PURPOSES)
else:
purposes = VALID_PURPOSES
file_ids = set()
if 'enrol' in purposes:
file_ids |= self.m_enrol_files
if 'probe' in purposes:
file_ids |= self.m_files - self.m_enrol_files
# go through the dataset and collect all desired files
files = {}
for client_id in ids:
for file_id in file_ids:
files[(client_id-1) * len(self.m_files) + file_id] = self.__make_path__(client_id, file_id, directory, extension)
return files
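The protocol encoded by the ``Database`` class above can be summarised in a
small standalone sketch. It mirrors the constants and the id arithmetic of the
class (10 images per client, 20 training clients, 5 enrolment images), but the
function names ``file_key``, ``client_of`` and ``describe`` are illustrative
and not part of the package.

```python
import os

N_FILES = 10  # images per client, as in Database.m_files
TRAINING_CLIENTS = {1, 2, 5, 6, 10, 11, 12, 14, 16, 17,
                    20, 21, 24, 26, 27, 29, 33, 34, 36, 39}
ENROL_FILES = {2, 4, 5, 7, 9}


def file_key(client_id, file_id):
    """Unique key used by Database.files() for one image."""
    return (client_id - 1) * N_FILES + file_id


def client_of(key):
    """Inverse mapping, as in get_client_id_from_file_id()."""
    return (key - 1) // N_FILES + 1


def describe(key):
    """Return (client_id, file_id, group, purpose, stem) for a key."""
    client_id = client_of(key)
    file_id = key - (client_id - 1) * N_FILES
    group = "world" if client_id in TRAINING_CLIENTS else "dev"
    # Purposes only apply to the 'dev' group; 'world' uses all images.
    if group == "world":
        purpose = None
    else:
        purpose = "enrol" if file_id in ENROL_FILES else "probe"
    stem = os.path.join("s%d" % client_id, str(file_id))
    return client_id, file_id, group, purpose, stem


if __name__ == "__main__":
    print(describe(file_key(3, 4)))
```

Using floor division (``//``) keeps the client id an integer on both Python 2
and Python 3.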
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# @author: Manuel Guenther <Manuel.Guenther@idiap.ch>
# @date: Fri Apr 20 12:04:44 CEST 2012
#
# Copyright (C) 2011-2012 Idiap Research Institute, Martigny, Switzerland
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Commands the AT&T database can respond to.
"""
import os
import sys
from bob.db.driver import Interface as BaseInterface
def dumplist(args):
"""Dumps lists of files based on your criteria."""
from .__init__ import Database
db = Database()
r = db.files(directory=args.directory, extension=args.extension, groups=args.groups, purposes=args.purposes)
output = sys.stdout
if args.selftest:
from bob.db.utils import null
output = null()
for id, f in r.items():
output.write('%s\n' % (f,))
return 0
def checkfiles(args):
"""Checks the existence of the files based on your criteria."""
from .__init__ import Database
db = Database()
r = db.files(directory=args.directory, extension=args.extension)
# go through all files, check if they are available
good = {}
bad = {}
for id, f in r.items():
if os.path.exists(f): good[id] = f
else: bad[id] = f
# report
output = sys.stdout
if args.selftest:
from bob.db.utils import null
output = null()
if bad:
for id, f in bad.items():
output.write('Cannot find file "%s"\n' % (f,))
output.write('%d files (out of %d) were not found at "%s"\n' % \
(len(bad), len(r), args.directory))
return 0
class Interface(BaseInterface):
def name(self):
return 'atnt'
def version(self):
import pkg_resources # part of setuptools