Integration with the new scheduler
Here is a task list with topics that require attention for the integration:
- The most important feature to be integrated is the ability to start/stop an experiment. This has already been implemented in the scheduler.
- When deleting an experiment that is still running, the experiment must first be cancelled by the scheduler.
- The update of the dataflow drawing while the toolchain executes should use the scheduler's new ability to report the start/end of each block and of the whole experiment as they arrive. The platform no longer needs to guess which blocks are being executed.
- There should be an administrative page on the BEAT web site that allows for:
  - Viewing the current scheduler state through the /state API call
  - Cancelling all running experiments
  - Cleaning the cache content
  - Changing the scheduling policy
  - Changing the debugging level
  - Changing the queue/environment/worker/slot configuration
- The web server should hold, and allow administrative changes to, the queue/environment/worker/slot properties.
- For each block of the experiment, the user should be allowed to select:
  - The processing queue to use, following user rights
  - The processing environment to use
  - The number of slots to occupy

  This information should be transmitted to the scheduler.
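For the first administrative item, here is a minimal sketch of how the page could poll the scheduler, assuming /state is served over HTTP and answers with JSON; the scheduler address and the response layout are assumptions, only the /state call itself comes from the list above.

```python
import json
import urllib.request

# Assumed scheduler address; only the /state call comes from the task list.
SCHEDULER = "http://localhost:5000"

def scheduler_state():
    """Fetches the current scheduler state for the administrative page."""
    with urllib.request.urlopen(SCHEDULER + "/state") as response:
        return json.load(response)
```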
Hints on behavior and implementation details:
Use Cases
- A simple experiment submission by a user with no particular rights
- An experiment submission by a user who has 1000 points of reputation
- An experiment submission by a user who has priority on a given queue
New Tables on the Django DB:
We must first define an "environment". The environment has a name (e.g. "Python"), a version (e.g. "2.7.3"), an OS (e.g. "Debian Wheezy 7.2 (x86_64)") and a rich description string which defines all properties of that environment, including installed packages.
Each "queue" in the system is defined with a name (e.g. "3 hours/4G on Python"), a memory limit (e.g. "4096Mb"), a time limit (e.g. "3 hours"), an environment (e.g. "Python") and the number of slots the queue can occupy, at most, on every machine available in the system.
Each user "library" consists of a set of files that are packaged together and stored in a given directory. Their organisation follows the same strategies as for algorithms. The sole exception is that each library is represented by a directory rather than a single file. A "library" is defined with a name (e.g. "lbp"), a version (e.g. "1.0"), an environment compatibility list and a list of other libraries this library depends on.
We must also define another table called "GroupQueueRights" in which we are going to track 4 aspects: queue - group - max slots - priority. For example: a user belonging to group "default" may be able to use 5 slots on queue "3 hours/4G on Python" with priority 0.
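A minimal sketch of how these four tables could look as Django models follows. Field names and types are assumptions for illustration; only the attributes named above are modelled.

```python
from django.db import models


class Environment(models.Model):
    name = models.CharField(max_length=100)       # e.g. "Python"
    version = models.CharField(max_length=20)     # e.g. "2.7.3"
    os = models.CharField(max_length=100)         # e.g. "Debian Wheezy 7.2 (x86_64)"
    description = models.TextField()              # rich description, incl. packages


class Queue(models.Model):
    name = models.CharField(max_length=100)       # e.g. "3 hours/4G on Python"
    memory_limit_mb = models.PositiveIntegerField()     # e.g. 4096
    time_limit_minutes = models.PositiveIntegerField()  # e.g. 180
    environment = models.ForeignKey(Environment, on_delete=models.CASCADE)
    max_slots_per_machine = models.PositiveIntegerField()


class Library(models.Model):
    name = models.CharField(max_length=100)       # e.g. "lbp"
    version = models.CharField(max_length=20)     # e.g. "1.0"
    compatible_environments = models.ManyToManyField(Environment)
    dependencies = models.ManyToManyField("self", symmetrical=False, blank=True)


class GroupQueueRights(models.Model):
    queue = models.ForeignKey(Queue, on_delete=models.CASCADE)
    group = models.ForeignKey("auth.Group", on_delete=models.CASCADE)
    max_slots = models.PositiveIntegerField()
    priority = models.IntegerField(default=0)
```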
A user may belong to several groups. In this case, the platform should only consider the maximum slots/priority for each Queue when submitting the job to the scheduler.
E.g., given these rows on the GroupQueueRights table (queue - group - max slots - priority):
- Row 1: Q1 - default - 2 - 0
- Row 2: Q1 - special - 1 - 1
- Row 3: Q1 - super - 3 - 1

- User "A" belongs to group "default": computed user queue rights are (Q1 - 2 - 0)
- User "B" belongs to groups "default" and "super": computed user queue rights are (Q1 - 3 - 1)
- User "C" belongs to groups "default" and "special": computed user queue rights are (Q1 - 2 - 1)
Worker Perspective
The worker installed on each machine knows where each local environment with a given name and version is installed, and how to execute user programs using that environment.
The program execution receives as parameters:
- the environment
- the parameters to call the environment executable with
N.B.: The parameter-list API shared by all environment executables defines our so-called Sandboxing API. It must be respected by all environment implementations.
The worker is simply told what to do; it performs no rights checking and knows nothing about user permissions.
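A minimal sketch of that execution path, assuming a local registry mapping (name, version) pairs to executable paths; the registry and the paths are assumptions, only the parameter-list convention comes from the text above.

```python
import subprocess

# Assumed local index: the worker knows where each environment lives.
ENVIRONMENTS = {
    ("Python", "2.7.3"): "/opt/environments/python-2.7.3/bin/execute",
}

def run_in_environment(name, version, parameters):
    """Runs a user program in the requested environment.

    `parameters` follows the Sandboxing API: the parameter-list
    convention every environment executable must respect. No rights
    checking happens here; that is done upstream.
    """
    executable = ENVIRONMENTS[(name, version)]
    return subprocess.call([executable] + list(parameters))
```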
User Perspective Operation
The user selects an overall execution queue for the experiment, and may also specify individual queues for individual blocks. The platform only allows the user to select queues in which the user has at least 1 slot for processing (other queues, even if they exist, are suppressed from the selection box).
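A minimal sketch of that filtering, reusing the merged rights structure from compute_user_queue_rights above (the shape is an assumption):

```python
def selectable_queues(user_queue_rights):
    # Only offer queues where the user holds at least one processing slot;
    # all other queues are suppressed from the selection box.
    return [queue for queue, (slots, _priority) in user_queue_rights.items()
            if slots >= 1]
```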
After the basic queue selection, every block on the toolchain executes in a single slot of the selected queue. If the algorithm supports multi-slot operation, the platform allows the user to select how many slots (capped at the user's rights) to use for a given block.
Users can submit as many jobs as they want; the jobs are treated according to queue rights and farm availability.
Web Platform
With all of this information in hand, the web platform submits the experiment to the scheduler for execution. Each request for execution contains 4 components (a sketch of such a request follows the list):
- Toolchain
- Configuration, containing the queues, libraries and slots the user wants to deploy
- Username or ID
- User queue rights - computed from the max of all the groups the user belongs to
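A hypothetical shape for such a request, expressed as a Python dictionary; all key names and values are illustrative, not a fixed wire format.

```python
run_request = {
    "toolchain": "user/my-toolchain/1",  # the toolchain to execute
    "configuration": {
        # Queues, libraries and slots the user wants to deploy, per block.
        "blocks": {
            "block-1": {"queue": "Q1", "environment": "Python (2.7.3)", "slots": 1},
        },
        "libraries": ["lbp/1.0"],
    },
    "user": "A",  # username or ID
    "rights": {   # computed from the max over all the user's groups
        "Q1": {"slots": 2, "priority": 0},
    },
}
```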
N.B.: The reputation can be implemented as a multiplying factor, with the result capped at the total number of slots for a particular queue. For example, a user with Reputation = 200 has 2x the processing power of the established user queue rights.
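Under the assumption that 100 reputation points correspond to a 1x factor (consistent with the 200 -> 2x example above), the cap could be sketched as:

```python
def effective_slots(base_slots, reputation, queue_total_slots):
    # Assumption: 100 reputation points == 1x, so factor = reputation // 100;
    # only the capping at the queue's total slots comes from the text.
    factor = max(1, reputation // 100)
    return min(base_slots * factor, queue_total_slots)
```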
Scheduler
The scheduler receives run-experiment requests and breaks each experiment down into jobs (representing the blocks) with dependencies.
At each queue loop it must decide what to execute based on:
- Current slot availability for the different queues/users (user-queue occupation state needs to be stored)
- Job queueing time (how old is the job?)
- User queue rights/priority
N.B.: If the Scheduler receives a second job for the same user, the queue rights for that user will be updated with the new values, in case they differ.
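A sketch of one pass of that decision loop, restricted to the plain FIFO strategy listed under the simplifications below; the job attributes and the occupation-state bookkeeping are assumptions.

```python
def schedule_one(pending_jobs, free_slots):
    """One pass of the scheduling loop: FIFO over runnable jobs.

    `pending_jobs` is ordered by submission age (oldest first) and holds
    job objects with illustrative attributes (user, queue, slots,
    dependencies_met()); `free_slots` maps (user, queue) to the user's
    remaining slot availability on that queue.
    """
    for job in pending_jobs:  # oldest submission first (FIFO)
        if not job.dependencies_met():
            continue  # upstream blocks are still running
        key = (job.user, job.queue)
        if free_slots.get(key, 0) >= job.slots:
            free_slots[key] -= job.slots  # user-queue occupation state
            return job
    return None  # nothing runnable this pass
```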
Simplifications for the first implementation
- There is only 1 environment installed, based on Python/"execute_single_algorithm.py".
- All queue priorities are set to 0 (i.e., the scheduler can ignore them)
- The number of slots and the memory limit go in pairs until we understand all this a bit better
- The scheduler implements only a FIFO strategy (first queued/runnable job is run), based on job submission age
- Libraries must be written in pure Python
- No reputation system is in place just yet