Commit 4097e8e4
authored 6 years ago by Zohreh MOSTAANI
[web][doc] removing extra info from and modifying experiments, toolchains and dataformats
parent d1a4f35d
1 merge request: !265 merge new documentation to master
Showing 3 changed files with 5 additions and 288 deletions:

doc/user/dataformats/guide.rst: 3 additions, 236 deletions
doc/user/experiments/guide.rst: 1 addition, 1 deletion
doc/user/toolchains/guide.rst: 1 addition, 51 deletions
doc/user/dataformats/guide.rst (+3 −236)
...
...
@@ -30,33 +30,12 @@
Data formats specify the transmitted data between the blocks of a toolchain.
They describe the format of the data blocks that circulate between algorithms
and formalize the interaction between algorithms and data sets, so they can
communicate in an orderly manner. Inputs and outputs of the algorithms and
datasets **must** be formally declared. Two algorithms that communicate
directly must produce and consume the **same** type of data objects.
communicate in an orderly manner. For more detailed information see :ref:`beat-system-dataformats`.
A data format specifies a list of typed fields. An algorithm or data set
generating a block of data (via one of its outputs) must fill all the fields
declared in that data format. An algorithm consuming a block of data (via one
of its inputs) must not expect the presence of any other field than the ones
defined by the data format.
The |project| platform provides a number of pre-defined formats to facilitate
experiments. They are implemented in an extensible way. This allows users to
define their own formats, based on existing ones, while keeping some level of
compatibility with other existing algorithms.
.. note:: **Naming Convention**
.. note::
Data formats are named using three values joined by a ``/`` (slash)
operator:
* **username**: indicates the author of the dataformat
* **name**: an identifier for the object
* **version**: an integer (starting from one), indicating the version of
the object
Each tuple of these three components defines a *unique* data format name
inside the platform. For example, ``system/float/1``.
operator. The first value is the **username**.
The ``system`` user provides a number of `pre-defined formats such as
integers, booleans, floats and arrays
...
...
@@ -74,218 +53,6 @@ to that data format, like shown on the image below:
.. image:: img/system-defined-info.*
A data format is declared as a JSON_ object with several fields. For example,
the following declaration could represent the coordinates of a bounding box in
a video frame:
.. code-block:: javascript
{
  "value": [
    0,
    {
      "frame_id": "uint64",
      "height": "int32",
      "width": "int32",
      "top-left-y": "int32",
      "top-left-x": "int32"
    }
  ]
}
The special field ``#description`` can be used to store a short description of
the declared data format. It is ignored in practice and only used for
informational purposes. Each field in a declaration has a well-defined type, as
explained next.
Simple type (primitive object)
==============================
The |project| platform supports the following *primitive* types:
* signed integers: ``int8``, ``int16``, ``int32``, ``int64``
* unsigned integers: ``uint8``, ``uint16``, ``uint32``, ``uint64``
* floating-point numbers: ``float32``, ``float64``
* complex numbers: ``complex64``, ``complex128``
* a boolean value: ``bool``
* a string: ``string``
Aggregation
===========
A data format can be composed of complex objects formed by aggregating other
*declared* types. For example, we could define the positions of the eyes of a
face in an image like this:
.. code-block:: javascript
{
  "left": "system/coordinates/1",
  "right": "system/coordinates/1"
}
Arrays
======
A field can be a multi-dimensional array of any other type. Here ``array1`` is
declared as a one-dimensional array of 10 32-bit signed integers (``int32``)
and ``array2`` as a two-dimensional array with 10 rows and 5 columns of
booleans:
.. code-block:: javascript
{
  "array1": [10, "int32"],
  "array2": [10, 5, "bool"]
}
An array can have up to 32 dimensions. It can also contain objects (either
declared inline, or using another data format). It is also possible to declare
an array without specifying the number of elements in some of its dimensions,
by using a size of 0 (zero). For example, here is a two-dimensional grayscale
image of unspecified size:
.. code-block:: javascript
{
  "value": [0, 0, "uint8"]
}
You may also fix the extent of some dimensions. For example, here is a
possible representation for a three-dimensional RGB image of unspecified width
and height:
.. code-block:: javascript
{
  "value": [3, 0, 0, "uint8"]
}
In this representation, the image must have 3 color planes (no more, no less).
The width and the height are unspecified.
.. note:: **Unspecified Dimensions**
Because of the way the |project| platform stores data, not all combinations
of unspecified extents will work for arrays. As a rule of thumb, only the
last dimensions may remain unspecified. These are valid:
.. code-block:: javascript
{
  "value1": [0, "float64"],
  "value2": [3, 0, "float64"],
  "value3": [3, 2, 0, "float64"],
  "value4": [3, 0, 0, "float64"],
  "value5": [0, 0, 0, "float64"]
}
Whereas these would be invalid declarations for arrays:
.. code-block:: javascript
{
  "value": [0, 3, "float64"],
  "value": [4, 0, 3, "float64"]
}
Object Representation
---------------------
As you'll read in our :ref:`Algorithms` section, data is made available to
user algorithms via our backend API. In Python, for instance, the |project|
platform uses NumPy_ data types to pass data to and from algorithms. For
example, when an algorithm reads data whose format is defined like:
.. code-block:: javascript
{
  "value": "float64"
}
The field ``value`` of an instance named ``object`` of this format is
accessible as ``object.value`` and will have the type ``numpy.float64``. If the
format were, instead:
.. code-block:: javascript
{
  "value": [0, 0, "float64"]
}
It would be accessed in the same way (i.e., via ``object.value``), except that
the type would be ``numpy.ndarray`` and ``object.value.dtype`` would be
``numpy.float64``. Naturally, objects which are instances of a format like
this:
.. code-block:: javascript
{
  "x": "int32",
  "y": "int32"
}
Could be accessed as ``object.x`` for the ``x`` value and ``object.y`` for
the ``y`` value. The type of ``object.x`` and ``object.y`` would be
``numpy.int32``.
Conversely, if you *write* output data in an algorithm, the types of the output
objects are checked for compatibility with respect to the values declared in the
format. For example, this would be a valid use of the format above, in Python:
.. code-block:: python
import numpy

class Algorithm:

    def process(self, inputs, outputs):
        # read data
        # prepares object to be written
        myobj = {"x": numpy.int32(4), "y": numpy.int32(6)}
        # write it
        outputs["point"].write(myobj)  # OK!
If you try to write a ``float64`` object into a field that is supposed to be of
type ``int32``, an exception will be raised. For example:
.. code-block:: python
import numpy

class Algorithm:

    def process(self, inputs, outputs):
        # read data
        # prepares object to be written
        myobj = {"x": numpy.int32(4), "y": numpy.float64(3.14)}
        # write it
        outputs["point"].write(myobj)  # Error: cannot downcast!
The bottom line is: **all type casting in the platform must be explicit**. The
platform will not automatically downcast or upcast objects for you, so as to
avoid unexpected precision loss leading to errors.
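Continuing the example above, a minimal sketch of such an explicit cast might
look like the snippet below (the ``score`` variable and its value are purely
illustrative):

.. code-block:: python

import numpy

class Algorithm:

    def process(self, inputs, outputs):
        # a value that happens to be a 64-bit float (illustrative only)
        score = numpy.float64(3.14)

        # cast explicitly to the ``int32`` declared for "y" in the data
        # format: the (lossy) truncation is now a deliberate choice
        myobj = {"x": numpy.int32(4), "y": numpy.int32(score)}

        outputs["point"].write(myobj)  # OK: types match the declaration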
Editing Operations
...
...
doc/user/experiments/guide.rst (+1 −1)
...
...
@@ -178,7 +178,7 @@ toolchain:
.. note::
As it was mentioned in :ref:`beat-system-experiments-blocks`, BEAT checks that connected datasets, algorithms and
As mentioned in :ref:`beat-system-experiments-blocks`, BEAT checks that connected datasets, algorithms and
analyzers produce or consume data in the right format. It only presents
options which are *compatible* with adjacent blocks.
...
...
doc/user/toolchains/guide.rst (+1 −51)
...
...
@@ -28,7 +28,7 @@
============
Toolchains are the backbone of experiments within the |project| platform. They
determine the data flow for experiments in the |project| platform.
determine the data flow for experiments in the |project| platform.
For more information about toolchains, see :ref:`beat-system-toolchains`.
You can see an example toolchain for a toy-`eigenface`_ system on the image
below.
...
...
@@ -67,56 +67,6 @@ an experiment with this workflow:
the top-right of each block. For example, the block called ``scoring`` is
said to be *synchronized with* the ``probes`` channel.
When a block is synchronized with a channel, it means the platform will
iterate over that channel's contents when calling the user algorithm on that
block. For example, the block ``linear_machine_training``, on the top
left of the image, following the data set block ``train``, is synchronized
with that dataset block. Therefore, it will be executed as many times as
the dataset block outputs objects through its ``image`` output. That is, the
``linear_machine_training`` block *loops* or *iterates* over the ``train``
data.
Notice that the toolchain does not define what an ``image`` is. That is
defined by the concrete dataset implementation chosen by the user when an
experiment is constructed. The block ``linear_machine_training`` also does not
define which type of images it can input. That is defined by the algorithm
chosen by the user when an experiment is constructed. For example, if the user
chooses a data set that outputs objects of the data format
``system/array_2d_uint8/1``, then an algorithm that can consume that type of
object must be chosen for the block following that dataset. Don't
worry! The |project| platform experiment configuration will check that for you!
The order of execution can also be inferred from this diagram. We sketched
it for you in this overlay:
.. image:: img/eigenfaces-ordered.*
The backend processing farm will first "push" the data out of the datasets. It
will then run the code on the block ``linear_machine_training`` (numbered 2).
The blocks ``template_builder`` and ``probe_builder`` are then ready to run.
The platform may choose to run them at the *same time* if enough computing
resources are available. The ``scoring`` block runs fourth. The last block
to be executed is the ``analysis`` block. In the figure above, you can also see,
marked, the channel whose data each block *loops* on. When you read
about :ref:`Algorithms`, you'll understand, concretely, how synchronization is
handled in algorithm code.
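As a small preview, a minimal sketch of that per-iteration view from inside an
algorithm could look like the snippet below. It mirrors the
``Algorithm``/``process`` structure used in the data format examples; the
``inputs["image"].data`` accessor is an assumption made here for illustration,
and the actual reading API is covered in :ref:`Algorithms`.

.. code-block:: python

class Algorithm:

    def process(self, inputs, outputs):
        # process() is called once per object on the synchronized channel,
        # so this body effectively *loops* over the ``train`` data
        image = inputs["image"].data  # assumed accessor, for illustration

        # ... accumulate statistics / update the machine being trained ...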
.. note:: **Naming Convention**
Toolchains are named using three values joined by a ``/`` (slash) operator:
* **username**: indicates the author of the toolchain
* **name**: indicates the name of the toolchain
* **version**: indicates the version (integer starting from ``1``) of the
toolchain
Each tuple of these three components defines a *unique* toolchain name
inside the platform. To get a feel for it, you may browse `publicly available
toolchains`_.
The *Toolchains* tab
--------------------
...
...