Commit eb766d0a authored by André Anjos

Merge branch '13_add_loop_documentation' into 'master'

Add basics about soft loop

Closes #13

See merge request !12
parents 4d489288 dc5748ac
Pipeline #35695 passed with stages
in 4 minutes and 46 seconds
......@@ -29,7 +29,7 @@
Algorithms are user-defined pieces of software that run within the blocks of a
toolchain. An algorithm can read data on the input(s) of the block and write
processed data on its output(s). We refer to the inputs and outputs
collectively as *endpoints*. Algorithms are, hence, key components of
scientific experiments, since they formally describe how to transform raw
data into higher-level concepts such as classes.
......@@ -39,7 +39,7 @@ An algorithm lies at the core of each processing block and may be subject to
parametrization. Inputs and outputs of an algorithm have well-defined data
formats. The format of the data on each input and output of the block is
defined at a higher level in the BEAT framework. The implementation of the
algorithm is expected to respect the previously declared format of each endpoint.
:numref:`beat-core-overview-block` displays the relationship between a
processing block and its algorithm.
......@@ -84,17 +84,20 @@ dataset and injected into the toolchain.
Algorithm types
===============
The current version of the BEAT framework has two algorithm types, which differ
in the way they handle data samples. These algorithm types are the following:
- Sequential
- Autonomous
In previous versions of BEAT, only one type of
algorithm (referred to as the v1 algorithm) was implemented.
The sequential algorithm type is the direct successor of the v1 algorithm. For
migration information, see :ref:`beat-system-algorithms-api-migration`.
The platform now also provides the concept of a soft loop. The soft loop allows
the implementation of supervised processing within a macro block.
Sequential
----------
......@@ -112,6 +115,37 @@ appropriate amount of data on its outputs.
Furthermore, the way the algorithm handles the data is highly configurable and
covers a huge range of possible scenarios.
Loop
----
A loop is composed of three elements:
- A processor algorithm
- An evaluator algorithm
- A LoopChannel
The two algorithms work as a pair and use the LoopChannel to communicate. The
processor algorithm is responsible for applying some transformation or analysis
to a set of data and then sending the result to the evaluator for validation.
The role of the evaluator is to provide feedback to the processor, which will
either continue processing the same block of data or go on with the next one
until all data is exhausted. The output writing of the evaluator is
synchronized with the output writing of the processor.
In the sequential versions, reading is also synchronized so that the
evaluator can read data at the same pace as the processor.
The two algorithms are available in both sequential and autonomous forms.
However, there are only three valid combinations:
========== ==========
Processor  Evaluator
========== ==========
Autonomous Autonomous
Sequential Sequential
Sequential Autonomous
========== ==========
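
The table above can be encoded as a small lookup. The sketch below is only an
illustration: ``VALID_LOOP_PAIRS`` and ``is_valid_loop_pair()`` are hypothetical
names that are not part of the BEAT API, and the lowercase type strings are an
assumption made for this example.

.. code-block:: python

   # Hypothetical helper, not part of the BEAT API: it only encodes the
   # table of valid (processor, evaluator) combinations.
   VALID_LOOP_PAIRS = {
       ("autonomous", "autonomous"),
       ("sequential", "sequential"),
       ("sequential", "autonomous"),
   }


   def is_valid_loop_pair(processor_type, evaluator_type):
       """Return True if the (processor, evaluator) pair is listed as valid."""
       return (processor_type.lower(), evaluator_type.lower()) in VALID_LOOP_PAIRS


   print(is_valid_loop_pair("Sequential", "Autonomous"))  # True
   print(is_valid_loop_pair("Autonomous", "Sequential"))  # False

The only combination left out is an autonomous processor paired with a
sequential evaluator.
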
.. _beat-system-algorithms-definition:
......@@ -169,19 +203,19 @@ probabilistic component analysis (PCA):
Here is the description of each field in the example above:
* **schema_version:** specifies which schema version must be used to validate the file content.
* **api_version:** specifies the version of the API implemented by the algorithm.
* **type:** specifies the type of the algorithm. Depending on that, the execution model will change.
* **language:** specifies the language in which the algorithm is implemented.
* **splittable:** indicates whether the algorithm can be parallelized into chunks or not.
* **parameters:** lists the parameters of the algorithm, describing both default values and their types.
* **groups:** gives information about the inputs and outputs of the algorithm. They are provided as a list of dictionaries, each element of this list being associated with a database *channel*. The group that contains outputs is the *synchronization channel*. By default, a loop is automatically performed by the BEAT framework on the synchronization channel, and user code must not loop on this group. In contrast, it is the responsibility of the user to load data from the other groups. This is described in more detail in the following subsections.
* **description:** is optional and gives a short description of the algorithm.
......@@ -363,6 +397,71 @@ The platform will call this method only once as it is its responsibility to load
the appropriate amount of data and process it.
.. _beat-system-algorithms-examples-simple-processor:
Simple autonomous processor algorithm (no parametrization)
..........................................................
At the very minimum, a processor algorithm class must look like this:
.. code-block:: python

   import numpy as np


   class Algorithm:
       def process(self, data_loaders, outputs, loop_channel):
           # Read data from data_loaders, compute something and build the
           # hypothesis to validate (some_value is a placeholder for it)
           ...
           is_valid, feedback = loop_channel.validate({"value": np.float64(some_value)})
           # Check is_valid, continue accordingly and write the result of the
           # computation on outputs
           ...
           return True

The class must be called ``Algorithm`` and must have a method called
``process()`` that takes as parameters a list of data loaders (see section
:ref:`beat-system-algorithms-dataloaders-dataloaderlist`), a list of outputs
(see section :ref:`beat-system-algorithms-output-outputlist`) and a loop channel
(see section :ref:`beat-system-algorithms-loop-channel`). This method must
return ``True`` if everything went correctly, and ``False`` if an error
occurred.
The platform will call this method once per block of data available on the
`synchronized` inputs of the block.
.. _beat-system-algorithms-examples-simple-evaluator:
Simple autonomous evaluator algorithm (no parametrization)
..........................................................
At the very minimum, an evaluator algorithm class must look like this:
.. code-block:: python

   import numpy as np


   class Algorithm:
       def validate(self, hypothesis):
           # Compute whether the hypothesis makes sense and return a tuple
           # with a boolean value and some feedback
           return (result, {"value": np.float32(delta)})

       def write(self, outputs, processor_output_name, end_data_index):
           # Write something on the output; this is called in sync with the
           # write of the processor algorithm
           outputs["out"].write({"value": np.int32(self.output)}, end_data_index)

The class must be called ``Algorithm`` and must have a method called
``validate()`` that takes as parameter a dataformat containing the
hypothesis that needs validation. The method must return a tuple made of a
boolean value and a feedback value that will be used by the processor to determine
whether it should continue processing the current data or move on.
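
To make the ``validate()`` and ``write()`` contract more concrete, here is a
minimal sketch of an evaluator that accepts a hypothesis once it is close
enough to a target value. Several things are assumptions made for this
illustration only: the ``target`` and ``tolerance`` attributes, the use of
plain dictionaries and plain Python numbers in place of dataformat objects and
NumPy-typed values, and the ``RecordingOutput`` class, which is a hypothetical
stand-in for an output provided by the platform.

.. code-block:: python

   class Algorithm:
       """Toy evaluator: accepts a hypothesis that is close to a target."""

       def __init__(self):
           self.target = 10.0    # assumed value, for illustration only
           self.tolerance = 0.5  # assumed value, for illustration only
           self.output = 0       # number of hypotheses seen so far

       def validate(self, hypothesis):
           self.output += 1
           delta = self.target - hypothesis["value"]
           result = abs(delta) <= self.tolerance
           # The feedback tells the processor how far off the hypothesis was
           return (result, {"value": delta})

       def write(self, outputs, processor_output_name, end_data_index):
           outputs["out"].write({"value": self.output}, end_data_index)


   class RecordingOutput:
       """Hypothetical stand-in for a platform output; stores what is written."""

       def __init__(self):
           self.written = []

       def write(self, data, end_data_index):
           self.written.append((data, end_data_index))


   evaluator = Algorithm()
   print(evaluator.validate({"value": 7.0}))   # (False, {'value': 3.0})
   print(evaluator.validate({"value": 10.0}))  # (True, {'value': 0.0})

In this sketch the feedback carries the signed distance to the target, so a
processor could use it to decide in which direction to adjust its next
hypothesis.
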
.. _beat-system-algorithms-examples-parameterizable:
Parameterizable algorithm
......@@ -1504,6 +1603,17 @@ the data block on the output.
return True
.. _beat-system-algorithms-loop-channel:
Soft loop communication
-----------------------
The processor and evaluator algorithm components of the soft loop macro block
communicate with each other using a LoopChannel object. This object defines the
two dataformats that will be used to make the request and the answer that will
transit through the loop channel. This class is only meant to be used by the
algorithm implementer.
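
The request and answer round trip can be pictured with a toy, in-process
stand-in. The sketch below is not the platform's LoopChannel implementation:
``ToyLoopChannel``, ``ToyEvaluator`` and ``refine()`` are hypothetical names,
plain dictionaries replace the two dataformats, and the evaluator is called
directly instead of running as a separate algorithm.

.. code-block:: python

   class ToyLoopChannel:
       """Hypothetical in-process stand-in for the platform's LoopChannel."""

       def __init__(self, evaluator):
           self.evaluator = evaluator

       def validate(self, hypothesis):
           # Request: the hypothesis. Answer: (is_valid, feedback).
           return self.evaluator.validate(hypothesis)


   class ToyEvaluator:
       """Toy evaluator: a guess is valid once it is within 1.0 of the target."""

       def __init__(self, target):
           self.target = target

       def validate(self, hypothesis):
           delta = self.target - hypothesis["value"]
           return (abs(delta) <= 1.0, {"value": delta})


   def refine(loop_channel, guess, max_rounds=20):
       """Processor-side loop: resubmit until the evaluator accepts the guess."""
       for rounds in range(1, max_rounds + 1):
           is_valid, feedback = loop_channel.validate({"value": guess})
           if is_valid:
               return guess, rounds
           guess += feedback["value"] / 2  # move halfway towards the target
       return guess, max_rounds


   channel = ToyLoopChannel(ToyEvaluator(target=8.0))
   print(refine(channel, guess=0.0))  # (7.0, 4)

The processor keeps working on the same data until the answer says the
hypothesis is valid, which mirrors the feedback cycle described for the soft
loop.
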
.. _beat-system-algorithms-api-migration:
Migrating from API v1 to API v2
......