Commit b03eae48 authored by Samuel GAIST

[doc] Add documentation about the ZMQ architecture

parent 774d012f
......@@ -32,8 +32,15 @@
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #
# #
###################################################################################
import time
import os
import pkg_resources
import sphinx_rtd_theme
# For inter-documentation mapping:
from bob.extension.utils import link_documentation, load_requirements
# -- General configuration -----------------------------------------------------
......@@ -54,7 +61,6 @@ extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinx.ext.mathjax",
#'matplotlib.sphinxext.plot_directive'
]
# Be picky about warnings
......@@ -104,7 +110,6 @@ master_doc = "index"
# General information about the project.
project = u"beat.core"
import time
copyright = u"%s, Idiap Research Institute" % time.strftime("%Y")
......@@ -164,7 +169,6 @@ owner = [u"Idiap Research Institute"]
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
import sphinx_rtd_theme
html_theme = "sphinx_rtd_theme"
......@@ -258,16 +262,13 @@ autoclass_content = "class"
autodoc_member_order = "bysource"
autodoc_default_flags = ["members", "undoc-members", "show-inheritance"]
if not "BOB_DOCUMENTATION_SERVER" in os.environ:
if "BOB_DOCUMENTATION_SERVER" not in os.environ:
# notice we need to overwrite this for BEAT projects - defaults from Bob are
# not OK
os.environ[
"BOB_DOCUMENTATION_SERVER"
] = "https://www.idiap.ch/software/beat/docs/beat/%(name)s/%(version)s/|https://www.idiap.ch/software/beat/docs/beat/%(name)s/master/"
# For inter-documentation mapping:
from bob.extension.utils import link_documentation, load_requirements
sphinx_requirements = "extra-intersphinx.txt"
if os.path.exists(sphinx_requirements):
intersphinx_mapping = link_documentation(
......
......@@ -46,6 +46,7 @@ This package provides the core components of BEAT ecosystem. These core componen
backend_api
develop
api
zmq_architecture
Indices and tables
......
......@@ -43,6 +43,8 @@
.. _python 2.7: http://www.python.org
.. _zero message queue: http://zeromq.org
.. _zmq: http://zeromq.org
.. _ZeroMQ book: http://shop.oreilly.com/product/0636920026136.do
.. _Majordomo Protocol: https://rfc.zeromq.org/spec:18/MDP/
.. _language bindings: http://zeromq.org/bindings:_start
.. _python bindings: http://zeromq.org/bindings:python
.. _markdown: http://daringfireball.net/projects/markdown/
......
.. vim: set fileencoding=utf-8 :
.. Copyright (c) 2019 Idiap Research Institute, http://www.idiap.ch/ ..
.. Contact: beat.support@idiap.ch ..
.. ..
.. This file is part of the beat.backend.python module of the BEAT platform. ..
.. ..
.. Redistribution and use in source and binary forms, with or without
.. modification, are permitted provided that the following conditions are met:
.. 1. Redistributions of source code must retain the above copyright notice, this
.. list of conditions and the following disclaimer.
.. 2. Redistributions in binary form must reproduce the above copyright notice,
.. this list of conditions and the following disclaimer in the documentation
.. and/or other materials provided with the distribution.
.. 3. Neither the name of the copyright holder nor the names of its contributors
.. may be used to endorse or promote products derived from this software without
.. specific prior written permission.
.. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
.. ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.. WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.. DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
.. FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.. DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.. SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.. CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.. OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
.. OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.. _zmq_architecture:
==================================
ZMQ Architecture for task handling
==================================
Introduction
------------
The ZMQ architecture implemented in beat.core is based on the `Majordomo
Protocol`_ as described in the `ZeroMQ book`_.
There are, however, some subtle differences:
- We have one client: the scheduler
- We have unique workers
- We currently don't have "system commands"
Because of these differences and the way they are implemented, the protocol
has been renamed the "BEAT Computation Protocol", or BCP for short.
The system is based on these three components:
- The client
- The broker
- The worker(s)
In BEAT, the client is the scheduler. It sends tasks to the broker, which is
responsible for forwarding each one to the worker requested by the scheduler.
Once a task has been completed, the worker sends a message back to the
scheduler through the broker.
The whole messaging system is asynchronous, except when starting an actual
task. The worker sends back a confirmation as soon as the runner has been
properly started.
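The flow described above can be illustrated with a minimal in-memory sketch.
It is not the beat.core implementation and does not use ZMQ sockets or the
BCP wire format; the ``Worker`` and ``Broker`` classes, their method names and
the tuple-shaped messages are all hypothetical, chosen only to show the
routing, the immediate start confirmation, and the later completion message.

```python
from collections import deque


class Worker:
    """Hypothetical worker: confirms the start at once, reports the result later."""

    def __init__(self, name):
        self.name = name
        self.running = None

    def start(self, task):
        # Start the (imaginary) runner and confirm immediately.
        self.running = task
        return ("ack", self.name, task)

    def finish(self):
        # Report completion of the task that was running.
        task, self.running = self.running, None
        return ("done", self.name, task)


class Broker:
    """Hypothetical broker: forwards tasks to the worker named by the client."""

    def __init__(self):
        self.workers = {}          # worker name -> Worker
        self.to_client = deque()   # completion messages waiting for the client

    def register(self, worker):
        self.workers[worker.name] = worker

    def execute(self, worker_name, task):
        # The scheduler chooses the worker; the broker only forwards the task
        # and hands the worker's confirmation back.
        return self.workers[worker_name].start(task)

    def complete(self, worker_name):
        # Relay the worker's completion message towards the client.
        self.to_client.append(self.workers[worker_name].finish())
```

In this sketch the client would call ``execute`` and get the confirmation
back directly, while completion messages pile up in ``to_client`` until the
client reads them, mirroring the split between the synchronous start and the
asynchronous rest of the exchange.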
Why this design?
----------------
The original design was a bit simpler:
- One scheduler
- Many workers
The scheduler was responsible for both task scheduling and worker
communication handling. One issue that arose from time to time was that, with
a very low volume of network activity, the connection between one or more
workers and the scheduler would get cut without anybody noticing. As a result,
new tasks would be sent but silently dropped, and experiments would stay in a
running state while not doing anything. If an experiment was then canceled, it
would stay in the canceling state, as the cancel command would also be
silently dropped.
Thus the rationale behind choosing this new design was to avoid these
connection losses and the resulting platform paralysis.
Now, the broker and the workers implement a bidirectional heartbeat. This has a
twofold benefit:
- The heartbeat itself should generate enough network activity to keep the
  connection from being cut.
- If a worker goes missing, the broker will detect it and act as configured.
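One common way to detect a missing peer with heartbeats is to give each worker
an expiry time that is pushed back whenever a heartbeat arrives. The sketch
below shows the broker side of that idea only. It is an assumption-laden
illustration, not the beat.core code: the ``WorkerRegistry`` class, the
interval and liveness values, and the injectable clock are all hypothetical.

```python
import time

HEARTBEAT_INTERVAL = 1.0  # seconds between heartbeats (assumed value)
HEARTBEAT_LIVENESS = 3    # missed heartbeats tolerated (assumed value)


class WorkerRegistry:
    """Hypothetical broker-side bookkeeping of worker liveness."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the logic can be tested
        self._expiry = {}     # worker name -> time after which it is lost

    def heartbeat(self, name):
        # Each heartbeat pushes the worker's expiry further into the future.
        self._expiry[name] = (
            self._clock() + HEARTBEAT_INTERVAL * HEARTBEAT_LIVENESS
        )

    def purge(self):
        # Drop every worker whose expiry has passed and return their names,
        # so the caller can react however it is configured to.
        now = self._clock()
        lost = sorted(n for n, t in self._expiry.items() if t <= now)
        for name in lost:
            del self._expiry[name]
        return lost

    def alive(self):
        return sorted(self._expiry)
```

A broker loop would call ``heartbeat`` for every heartbeat it receives and
``purge`` once per interval; the workers would keep a similar timer for the
broker, which is what makes the heartbeat bidirectional.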
BCP Schema
----------
The figure below shows how the components of the system are connected.
::

                                 --------------
                                 |            |
                                 |   client   |
                                 |            |
                                 --------------
                                       |
                                 --------------
                                 |            |
                                 |   broker   |
                                 |            |
                                 --------------
                                  |   |  |   |
                                  |   |  |   |
           /----------------------/   |  |   \----------------------\
           |                          |  |                          |
           |                  /-------/  \-------\                  |
           |                  |                  |                  |
    ---------------    ---------------    ---------------    ---------------
    |             |    |             |    |             |    |             |
    |   worker1   |    |   worker2   |    |   worker3   |    |   worker4   |
    |             |    |             |    |             |    |             |
    ---------------    ---------------    ---------------    ---------------

.. include:: links.rst