Commit 0edcbd49 authored by André Anjos

Merge branch 'docs' into 'master'

merge new documentation to master

See merge request !265
parents b0b627f2 c66e0439
Pipeline #25511 passed with stages in 15 minutes and 25 seconds
.. vim: set fileencoding=utf-8 :
.. _beat_web:
======================
BEAT Web Application
======================
This documentation includes information about the BEAT platform.
For users
=========
.. toctree::
:maxdepth: 1
:titlesonly:
user/index.rst
For developers
==============
.. toctree::
:maxdepth: 1
:titlesonly:
admin/index.rst
api/index.rst
......@@ -36,7 +36,7 @@ configuration page, the declaration of this experiment transmitted to the
scheduler, that now must run the experiment until it finishes, you press the
``stop`` button, or an error condition is produced.
As it is described in the :ref:`beat-system-toolchains` section, the scheduler first breaks
the toolchain into a sequence of executable blocks with dependencies. For
example: block ``B`` must be run after block ``A``. Each block is then
scheduled for execution depending on current resource availability. If no more
......
......@@ -30,37 +30,15 @@
Data formats specify the transmitted data between the blocks of a toolchain.
They describe the format of the data blocks that circulate between algorithms
and formalize the interaction between algorithms and data sets, so they can
communicate in an orderly manner. Inputs and outputs of the algorithms and
datasets **must** be formally declared. Two algorithms that communicate
directly must produce and consume the **same** type of data objects. For more
detailed information, see :ref:`beat-system-dataformats`.
A data format specifies a list of typed fields. An algorithm or data set
generating a block of data (via one of its outputs) must fill all the fields
declared in that data format. An algorithm consuming a block of data (via one
of its inputs) must not expect the presence of any other field than the ones
defined by the data format.
The |project| platform provides a number of pre-defined formats to facilitate
experiments. They are implemented in an extensible way. This allows users to
define their own formats, based on existing ones, while keeping some level of
compatibility with other existing algorithms.
.. note:: **Naming Convention**

   Data formats are named using three values joined by a ``/`` (slash)
   operator:

   * **username**: indicates the author of the dataformat
   * **name**: an identifier for the object
   * **version**: an integer (starting from one), indicating the version of
     the object

   Each tuple of these three components defines a *unique* data format name
   inside the platform. For example, ``system/float/1``.
The ``system`` user provides a number of pre-defined formats such as
integers, booleans, floats and arrays (see `here
<https://www.beat-eu.org/platform/dataformats/system/>`_). You may also
browse `publicly available data formats`_ to see all available data formats
from the ``system`` and other users.
......@@ -74,218 +52,6 @@ to that data format, like shown on the image below:
.. image:: img/system-defined-info.*
A data format is declared as a JSON_ object with several fields. For example,
the following declaration could represent the coordinates of a bounding box in
a video frame:
.. code-block:: javascript
{
"value": [
0,
{
"frame_id": "uint64",
"height": "int32",
"width": "int32",
"top-left-y": "int32",
"top-left-x": "int32"
}
]
}
The special field ``#description`` can be used to store a short description of
the declared data format. It is ignored in practice and only used for
informational purposes. Each field in a declaration has a well-defined type, as
explained next.
Simple type (primitive object)
==============================
The |project| platform supports the following *primitive* types.
* signed integers: ``int8``, ``int16``, ``int32``, ``int64``
* unsigned integers: ``uint8``, ``uint16``, ``uint32``, ``uint64``
* floating-point numbers: ``float32``, ``float64``
* complex numbers: ``complex64``, ``complex128``
* a boolean value: ``bool``
* a string: ``string``
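
To make the mapping to concrete Python objects more tangible, the short sketch
below shows values corresponding to a few of these primitive names. It is only
an illustration of the type correspondence described later in this guide
(strings are shown as plain Python ``str``, an assumption for illustration);
it does not use the platform API:

.. code-block:: python

   import numpy

   # illustrative correspondence between primitive names and Python/NumPy types
   examples = {
       "int32": numpy.int32(-12),
       "uint8": numpy.uint8(255),
       "float64": numpy.float64(3.14),
       "complex128": numpy.complex128(1 + 2j),
       "bool": numpy.bool_(True),
       "string": "hello",  # assumed to be a plain Python string
   }

   for name, value in examples.items():
       print(name, "->", type(value).__name__)
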
Aggregation
===========
A data format can be composed of complex objects formed by aggregating other
*declared* types. For example, we could define the positions of the eyes of a
face in an image like this:
.. code-block:: javascript
{
"left": "system/coordinates/1",
"right": "system/coordinates/1"
}
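
When an instance of such an aggregated format reaches a Python algorithm, each
aggregated field becomes a nested attribute (one level per sub-object), in
line with the attribute access described in the Object Representation section
below. The sketch below only *mimics* such an instance with plain namespace
objects and assumes, for illustration, that ``system/coordinates/1`` declares
integer ``x`` and ``y`` fields; check that data format for its real field
names:

.. code-block:: python

   import numpy
   from types import SimpleNamespace

   # a stand-in for an instance of the "eyes" format declared above
   eyes = SimpleNamespace(
       left=SimpleNamespace(x=numpy.int32(10), y=numpy.int32(22)),
       right=SimpleNamespace(x=numpy.int32(54), y=numpy.int32(23)),
   )

   # nested attribute access: one attribute level per aggregated field
   print(eyes.left.x, eyes.right.y)
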
Arrays
======
A field can be a multi-dimensional array of any other type. Here ``array1`` is
declared as a one dimensional array of 10 32-bit signed integers (``int32``)
and ``array2`` as a two-dimensional array with 10 rows and 5 columns of
booleans:
.. code-block:: javascript
{
"array1": [10, "int32"],
"array2": [10, 5, "bool"]
}
An array can have up to 32 dimensions. It can also contain objects (either
declared inline, or using another data format). It is also possible to declare
an array without specifying the number of elements in some of its dimensions,
by using a size of 0 (zero). For example, here is a two-dimensional grayscale
image of unspecified size:
.. code-block:: javascript
{
"value": [0, 0, "uint8"]
}
You may also fix the extent of some of the dimensions. For example, here is a
possible representation for a three-dimensional RGB image of unspecified size
(width and height):
.. code-block:: javascript
{
"value": [3, 0, 0, "uint8"],
}
In this representation, the image must have 3 color planes (no more, no less).
The width and the height are unspecified.
.. note:: **Unspecified Dimensions**
Because of the way the |project| platform stores data, not all combinations
of unspecified extents will work for arrays. As a rule of thumb, only the
last dimensions may remain unspecified. These are valid:
.. code-block:: javascript
{
"value1": [0, "float64"],
"value2": [3, 0, "float64"],
"value3": [3, 2, 0, "float64"],
"value4": [3, 0, 0, "float64"],
"value5": [0, 0, 0, "float64"]
}
Whereas these would be invalid declarations for arrays:
.. code-block:: javascript
{
"value": [0, 3, "float64"],
"value": [4, 0, 3, "float64"]
}
Object Representation
---------------------
As you'll read in our :ref:`Algorithms` section, data is available via our
backend API to the user algorithms. For example, in Python, the |project|
platform uses NumPy_ data types to pass data to and from algorithms. When an
algorithm reads data for which the format is defined like:
.. code-block:: javascript
{
"value": "float64"
}
The field ``value`` of an instance named ``object`` of this format is
accessible as ``object.value`` and will have the type ``numpy.float64``. If the
format were, instead:
.. code-block:: javascript
{
"value": [0, 0, "float64"]
}
It would be accessed in the same way (i.e., via ``object.value``), except that
the type would be ``numpy.ndarray`` and ``object.value.dtype`` would be
``numpy.float64``. Naturally, objects which are instances of a format like
this:
.. code-block:: javascript
{
"x": "int32",
"y": "int32"
}
Could be accessed as ``object.x`` for the ``x`` value and ``object.y`` for
the ``y`` value. The type of ``object.x`` and ``object.y`` would be
``numpy.int32``.
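
To make the distinction between scalar fields and array fields concrete, the
standalone sketch below reproduces the three situations above using plain
NumPy objects only (no platform API involved):

.. code-block:: python

   import numpy

   # format {"value": "float64"}: the field is a NumPy scalar
   scalar_value = numpy.float64(3.14)
   print(type(scalar_value))        # <class 'numpy.float64'>

   # format {"value": [0, 0, "float64"]}: the field is a 2D array
   array_value = numpy.zeros((4, 5), dtype=numpy.float64)
   print(type(array_value))         # <class 'numpy.ndarray'>
   print(array_value.dtype)         # float64

   # format {"x": "int32", "y": "int32"}: two scalar fields
   x, y = numpy.int32(10), numpy.int32(20)
   print(type(x), type(y))          # numpy.int32 in both cases
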
Conversely, if you *write* output data in an algorithm, the type of the output
objects is checked for compatibility with the types declared in the format.
For example, this would be a valid use of the format above, in Python:
.. code-block:: python

   import numpy


   class Algorithm:

       def process(self, inputs, outputs):

           # read data

           # prepares the object to be written
           myobj = {"x": numpy.int32(4), "y": numpy.int32(6)}

           # write it
           outputs["point"].write(myobj)  # OK!
If you try to write a ``float64`` object into a field that is declared as
``int32``, an exception will be raised. For example:
.. code-block:: python

   import numpy


   class Algorithm:

       def process(self, inputs, outputs):

           # read data

           # prepares the object to be written
           myobj = {"x": numpy.int32(4), "y": numpy.float64(3.14)}

           # write it
           outputs["point"].write(myobj)  # Error: cannot downcast!
The bottom line is: **all type casting in the platform must be explicit**. The
platform will not automatically downcast or upcast objects for you, so as to
avoid unexpected precision loss leading to errors.
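
In practice, this means converting values yourself before writing them. The
sketch below shows the explicit cast that would make the failing example above
acceptable, reusing the same hypothetical ``point`` output with format
``{"x": "int32", "y": "int32"}``:

.. code-block:: python

   import numpy


   class Algorithm:

       def process(self, inputs, outputs):

           raw_y = 3.14  # some float computed by the algorithm

           # cast explicitly to the declared type before writing: any
           # precision loss is now a visible, deliberate choice
           myobj = {"x": numpy.int32(4), "y": numpy.int32(raw_y)}

           outputs["point"].write(myobj)  # OK: types match the data format
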
Editing Operations
......
......@@ -37,23 +37,6 @@ such as different databases and algorithms. Each experiment has its own
:ref:`toolchains` which cannot be changed after the experiment is created.
Experiments can be shared and forked, to ensure maximum re-usability.
.. note:: **Naming Convention**
Experiments are named using five values joined by a ``/`` (slash)
operator:
* **username**: indicates the author of the experiment
* **toolchain username**: indicates the author of the toolchain used for
that experiment
* **toolchain name**: indicates the name of the toolchain used for that
experiment
* **toolchain version**: indicates the version (integer starting from
``1``) of the toolchain used for the experiment
* **name**: an identifier for the object
Each tuple of these five components defines a *unique* experiment name
inside the platform. To get an idea, you may browse `publicly available
experiments`_.
Displaying an existing experiment
......@@ -111,7 +94,7 @@ These icons represent the following options (from left to right):
* red cross: delete the experiment
* blue tag: rename the experiment
* gold medal: request attestation
* circular arrow: reset the experiment (if some of the blocks in the experiment have been run before, the platform will reuse the cached outputs of those blocks)
* ``fork``: fork a new, editable copy of this experiment
* page: add experiment to report
* blue lens: search for similar experiments
......@@ -193,22 +176,11 @@ toolchain:
results. Options for this block are similar to those for normal blocks.
.. note:: **Algorithms, Datasets and Blocks**

   While configuring the experiment, your objective is to fill in all
   containers defined by the toolchain with valid datasets and algorithms or
   analyzers. As mentioned in :ref:`beat-system-experiments-blocks`, BEAT
   checks that connected datasets, algorithms and analyzers produce or
   consume data in the right format. It only presents options which are
   *compatible* with adjacent blocks.

   For example, if you chose dataset ``A`` for the ``train`` block of your
   experiment and it outputs objects in the format ``user/format/1``, then the
   algorithm running on the block following ``train`` **must** consume
   ``user/format/1`` on its input. Therefore, the choices of algorithms that
   can run after ``train`` become limited the moment you choose dataset
   ``A``. The configuration system will *dynamically* update to take those
   constraints into consideration every time you make a selection, increasing
   the global constraints for the experiment.
Tip: If you reach a situation where no algorithms are available for a given
block, reset the experiment and try again, making sure the algorithms you'd
......
......@@ -38,8 +38,7 @@ provides an attestation mechanism for your reports (scientific papers,
technical documents or certifications).
This guide contains detailed and illustrated information on how to interact
with the |project| platform using its web interface. It is the primary resource
for information on how to use and run evaluations using the platform. Before
you continue with this guide, you should familiarize yourself with the
different components of BEAT (see `Getting Started with BEAT`_).
In order to take full advantage of the guide, we recommend you register_ into
the platform and follow the tutorials in the order defined in this guide.
......
......@@ -31,23 +31,11 @@ Libraries
functions. Instead of re-implementing every function from scratch, you can
reuse functions already implemented by other users and published in the form of
|project| libraries. Similarly, you can create and publish your own libraries
of functions that you consider may be useful to other users. For more
information, see :ref:`beat-system-libraries`.
Usage of libraries is encouraged in the |project| platform. Besides saving you
time and effort, this also promotes reproducibility in research.
.. note:: **Naming Convention**
Libraries are named using three values joined by a ``/`` (slash) operator:
* **username**: indicates the author of the library
* **name**: indicates the name of the library
* **version**: indicates the version (integer starting from ``1``) of the
library
Each tuple of these three components defines a *unique* name inside the
platform. To get an idea, you may browse `publicly available libraries`_.
You can access the Libraries section from your home-page on |project| by
clicking the ``User Resources`` tab and selecting ``Libraries`` from the
drop-down list. You should see a page similar to that shown below:
......@@ -98,8 +86,7 @@ To create a library you will need to provide the following information:
Of course, functions implemented in a new library may also call functions from
other shared libraries in the |project| platform. You can indicate the
dependencies on other libraries via the ``External library usage`` section (to
open this section, click on the ``v`` symbol on the right).
To save your work, click on the green ``Save`` button (in the top-right region
of the page). After you have saved your library, you will be able to use
......
......@@ -93,8 +93,7 @@ Script
* Outcome: The library will be saved; Edit and Delete buttons will appear in
the top-right corner.
11. Say: "To share the library, click on the 'Share' button. A pop-up window
specifying sharing preferences will appear"
11. Say: "To share the library, click on the 'Share' button. A pop-up window specifying sharing preferences will appear"
* Action: Click on the sharing button for a private library (the one you
saved before)
......@@ -108,4 +107,4 @@ Script
* Action: Click on the 'Public' radio box, click on 'Share it'
* Outcome: A pop-up window says the library is now shared.
13. END OF THE CLIP
......@@ -51,3 +51,8 @@
.. _numpy: http://www.numpy.org/
.. _our gitlab repository: https://gitlab.idiap.ch/beat/
.. _gnu affero gpl v3 license: http://www.gnu.org/licenses/agpl-3.0.en.html
.. _Getting Started with BEAT: https://www.idiap.ch/software/beat/docs/beat/docs/master/beat/introduction.html
.. _Algorithms: https://www.idiap.ch/software/beat/docs/beat/docs/master/beat/algorithms.html
.. _Experiments: https://www.idiap.ch/software/beat/docs/beat/docs/master/beat/experiments.html#beat-system-experiments
.. _Toolchains: https://www.idiap.ch/software/beat/docs/beat/docs/master/beat/toolchains.html#beat-system-toolchains
.. _Dataformats: https://www.idiap.ch/software/beat/docs/beat/docs/master/beat/dataformats.html#beat-system-dataformats
......@@ -35,20 +35,20 @@ reproducible and experiments share parts with each other as much as possible.
For example in a simple experiment, the database, the algorithm, and the
environment used to run the experiment can be shared between experiments.
A fundamental part of the `Experiments`_ in the |project| platform is a
toolchain. You can see an example of a toolchain below:
.. image:: img/toolchain.*
`Toolchains`_ are sequences of blocks and links (like a block diagram)
which represent the data flow on an experiment. Once you have defined the
toolchain against which you'd like to run, the experiment can be further
configured by assigning different datasets, algorithms and analyzers to the
different toolchain blocks.
The data that circulates at each toolchain connection in a configured
experiment is formally defined using `Dataformats`_. The platform
experiment configuration system will prevent you from using incompatible
data formats between connecting blocks. For example, in the toolchain depicted
above, if one decides to use a dataset on the block ``train`` (top-left of the
......@@ -68,7 +68,7 @@ toolchain, datasets, and algorithms.
:ref:`faq` for more information.
- You can learn about how to develop new algorithms or change existing ones by
looking at our `Algorithms`_ section. A special kind of algorithm is the
result *analyzer*, which is used to generate the results of the experiments.
You can check the results of an experiment once it finishes and share the
......
......@@ -110,16 +110,14 @@ Script
to the report (2 experiments matching the analyzer and one failing)
* Outcome: Added 2 (out of 3 in total) experiment(s) to report
10. Say: "Let's get back to our report list we can see that our report has now 3 experiments. It is
now time to add some interesting tables and figures to our report."
10. Say: "Let's get back to our report list we can see that our report has now 3 experiments. It is now time to add some interesting tables and figures to our report."
* Action: click on "User Resources: Reports" point at the 3 experiments in the report, then click
on the report "myfirstreport"
* Outcome: the empty report will be displayed.
* Action: click on "User Resources: Reports" point at the 3 experiments in the report, then click
on the report "myfirstreport"
* Outcome: the empty report will be displayed.
11. Say: "Some general information is displayed such as the unique report id of this report for
review and publication purposes, the date of creation, the status of the report (currently Editable), and the common analyzer among the experiments of the report.
On the right side, 4 action buttons are also displayed in order to let you delete the report, save the report after doing some additional changes, lock your report for review and, finally, get the unique report id link".
......@@ -134,14 +132,11 @@ Script
if you wish to export the data in a csv format, you have this possibility with the 'Export Table'
button".
* Action: Click on 'Add a report item', select 'Table/Results' and select a few data items you would like to see in your table, then point out with the mouse the different action buttons
* Outcome: a table should appear.
13. Say: "Now let's add figure to this report. Let's click on 'Add a report item' and let's
select Figure and 'scores_distribution' we can see that a figure has appeared below the
previously created table.
13. Say: "Now let's add figure to this report. Let's click on 'Add a report item' and let's select Figure and 'scores_distribution' we can see that a figure has appeared below the previously created table.
Some action buttons let us delete or export the figure in PNG, JPEG or PDF format.
If several plotters or plotter parameters are available, you will be able to
modify them so that the plot suits your needs.
......
......@@ -58,7 +58,7 @@ This is a panel with two buttons. The green button which says ``Show``, makes a
pop-up window appear showing your current API token. You may use this token
(64-byte character string) in outside programs that communicate with the
platform programmatically. For example, our command-line interface requires a
token to be able to pull/push contributions for the user (see :ref:`beat-cmdline-configuration`).
If your token is compromised, you may change it by clicking on the ``Modify``
button. A pop-up window will appear confirming the modification. You may cancel
......
......@@ -28,94 +28,7 @@
============
Toolchains are the backbone of experiments within the |project| platform. They
determine the data flow for experiments in the |project| platform.
You can see an example toolchain for a toy-`eigenface`_ system on the image
below.
.. image:: img/eigenfaces.*
From this block diagram, the platform can identify all it requires to conduct
an experiment with this workflow:
* There are three types of blocks:
1. **Dataset blocks** (light yellow, left side): are the input blocks of a
toolchain. They only have outputs.
2. **Regular blocks** (gray): represent processing steps of the toolchain.
3. **Analysis blocks** (light blue, right side): are the output blocks of a
toolchain. They only have inputs.
* Each block defines *placeholders* for datasets and algorithms to be
inserted when the user wants to execute an experiment based on such a
toolchain (see the :ref:`experiments` section).
* Each block is linked to the next one via a **connection**. The sequence of
blocks in a toolchain and their connectivity defines a natural data flow.
Data is output by data sets on the left and flow to the right until a
result is produced.
* Each dataset block (light yellow, left side) defines a unique
*synchronization channel*, which is encoded in the platform via a color.
For example, the synchronization channel ``train`` is blue. The
synchronization channel ``templates`` is green and, finally, the
synchronization channel ``probes`` is red.
* Each regular or analysis block on the toolchain respects exactly one of
these synchronization channels. This is indicated by the colored circle on
the top-right of each block. For example, the block called ``scoring`` is
said to be *synchronized with* the ``probes`` channel.
When a block is synchronized with a channel, it means the platform will
iterate over that channel's contents when calling the user algorithm on that
block. For example, the block ``linear_machine_training``, on the top
left of the image, following the data set block ``train``, is synchronized
with that dataset block. Therefore, it will be executed as many times as
the dataset block outputs objects through its ``image`` output. I.e., the
``linear_machine_training`` block *loops* or *iterates* over the ``train``
data.
Notice, the toolchain does not define what an ``image`` will be. That is
defined by the concrete dataset implementation chosen by the user when an
experiment is constructed. The block ``linear_machine_training`` also does not
define which type of images it can input. That is defined by the algorithm
chosen by the user when an experiment is constructed. For example, if the user
chooses a data set that outputs objects with the data format
``system/array_2d_uint8/1`` objects then, an algorithm that can input those
types of objects must be chosen for the block following that dataset. Don't
worry! The |project| platform experiment configuration will check that for you!
The order of execution can also be deduced from this diagram. We sketched
that for you in this overlay:
.. image:: img/eigenfaces-ordered.*
The backend processing farm will first "push" the data out of the datasets. It
will then run the code on the block ``linear_machine_training`` (numbered 2).
The blocks ``template_builder`` and ``probe_builder`` are then ready to run.
The platform may choose to run them at the *same time* if enough computing
resources are available. The ``scoring`` block runs fourth. The last block
to be executed is the ``analysis`` block. In the figure above, you can also see
marked the channel data over which each block *loops*. When you read
about :ref:`Algorithms`, you'll understand, concretely, how synchronization is
handled in algorithm code.
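
To give a flavour of what this synchronization means for algorithm code: the
platform calls the algorithm once per object of the channel its block is
synchronized with, so there is no explicit loop over the data inside the
algorithm itself. The sketch below is only illustrative; the ``image`` input
name and the accumulation logic are assumptions, and the exact backend API is
described in the :ref:`Algorithms` section:

.. code-block:: python

   class Algorithm:

       def __init__(self):
           # state kept across calls: process() is invoked once per object
           # of the synchronization channel (e.g. once per "train" image)
           self.images = []

       def process(self, inputs, outputs):
           # "image" is a hypothetical input synchronized with "train";
           # the real accessor names are documented with the backend API
           self.images.append(inputs["image"].data)

           # a real training block would also write its trained model to an
           # output once all the data has been seen
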
.. note:: **Naming Convention**
Toolchains are named using three values joined by a ``/`` (slash) operator:
* **username**: indicates the author of the toolchain
* **name**: indicates the name of the toolchain
* **version**: indicates the version (integer starting from ``1``) of the
toolchain
Each tuple of these three components defines a *unique* toolchain name
inside the platform. To get an idea, you may browse `publicly available
toolchains`_.
For more information about toolchains, see `Toolchains`_.
The *Toolchains* tab
......