Run Control

Note

This document is a work in progress. Information on this page is meant as a collection of ideas and suggestions for a future run control system.

Run control is the mechanism that controls when and where raw detector data is written to tape. It is also in charge of giving the stored data a uniquely identifiable name (in a simple case such as run001.lmd, run002.lmd, …).
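
A minimal sketch of such a naming scheme, assuming a per-experiment file prefix and zero-padded run and file numbers (the function name and the pattern are illustrative, not a fixed interface):

    def lmd_file_name(run_number, file_number=None, prefix="run", width=3):
        """Build a unique LMD file name such as run001.lmd or run002_0017.lmd."""
        name = f"{prefix}{run_number:0{width}d}"
        if file_number is not None:
            name += f"_{file_number:04d}"
        return name + ".lmd"

    # lmd_file_name(1)      -> 'run001.lmd'
    # lmd_file_name(2, 17)  -> 'run002_0017.lmd'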

Considerations

Several actors are involved:

  • Data providers
    • MBS nodes
    • MBS Timeorder nodes
    • drasi nodes
    • UCESB
    • others
  • Storage providers
    • no storage
    • RFIO server
    • LTSM server
    • simple disk storage (ucesb?)
  • Metadata providers:
    • DAQ (detector rates)
    • EPICS (HV setting)
    • Accelerator (beam information)
  • Action Logging:
    • Run database
    • ELOG
    • To file
  • Provider logging:
    • To file
  • Pipelines:
    • Connecting Data providers to storage providers
  • Users:
    • Using pipelines

Main idea

Users use preconfigured data pipelines to start and stop streaming data from data providers to storage. Each user action is logged to the database and the ELOG. Additional storage of metadata can be attached to each pipeline.

Run control should monitor the state of all actors. An actor may be flagged as ‘critical’, ‘normal’ or ‘nice-to-have’ for operation. If an actor is ‘critical’ for operation, then run control should have a mechanism to ensure that it is in the active/running state as much as possible. This may include restarting the DAQ, rebooting PCs, sending e-mail, power cycling the cave, etc… If an actor’s importance is ‘normal’, run control should make modest attempts to ensure an active state.
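
A rough sketch of such a supervision policy, assuming a simple Actor object that exposes its state and importance level (all class, attribute and method names here are illustrative):

    import time
    from enum import Enum

    class Importance(Enum):
        CRITICAL = "critical"
        NORMAL = "normal"
        NICE_TO_HAVE = "nice-to-have"

    class Actor:
        """Minimal stand-in for a monitored actor (DAQ node, storage server, ...)."""
        def __init__(self, name, importance):
            self.name = name
            self.importance = importance
            self.state = "inactive"

        def restart(self):
            """Attempt recovery, e.g. restart the DAQ or reboot the host."""

        def escalate(self):
            """Notify operators, e.g. by e-mail, if recovery keeps failing."""

    def supervise(actors, poll_interval=10.0):
        """Periodically check all actors and react according to their importance."""
        while True:
            for actor in actors:
                if actor.state in ("active", "running"):
                    continue
                if actor.importance is Importance.CRITICAL:
                    actor.restart()      # try hard to get it back
                    actor.escalate()     # and make noise while doing so
                elif actor.importance is Importance.NORMAL:
                    actor.restart()      # modest attempt, no escalation
                # nice-to-have actors are only reported via their state
            time.sleep(poll_interval)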

Run control is operated as a standalone service, running as a background task on a server. Primary user interaction is done via EPICS PVs exported by run control. As such, either command-line, script-based or GUI-based interaction is possible.

Run control by itself should have as few dependencies as possible, easing integration and deployment.

Run control is not a time critical or high performance application, but rather should be maintainable and extensible over a long time period. This makes Python a strong candidate for the implementation language. Using Python also cuts development cost, because many of the needed interfaces already exist (e.g. to channel access, databases, ELOG).
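
As an example of script-based interaction, channel access from Python could go through the existing pyepics package; the PV names below only follow the <P>: prefix scheme used in the sections below and are purely illustrative:

    import epics  # pyepics: one of the existing channel access interfaces for Python

    PREFIX = "RC:"  # hypothetical run control prefix <P>

    # Read the state of the first data provider.
    print("data provider 0:", epics.caget(PREFIX + "data:0:state"))

    # Ask run control to restart that provider.
    epics.caput(PREFIX + "data:0:restart", 1)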

Data providers

Are a source of data in LMD format, providing raw experiment data in the stream, transport or drasi protocols.

From a run control perspective, data providers are in one of two states:
  • able to provide data (active / running / open)
  • unable to provide data (inactive / stopped / closed)

Data stream pauses due to back pressure are of lesser concern to run control; they are more likely to occur, and more interesting to handle, at a lower level.

Internal configuration of data providers is not part of run control. This is handled by configuration of each provider (e.g. DAQ node setup).

Input for run control for each provider:
  • name
  • hostname
  • state
  • data rate produced
  • event rate produced
  • output buffer fill level
  • list of ports
Internal state:
  • provider importance: critical / normal / nice-to-have
Exported PVs with a prefix (<P>:data:<id>:):
  • name
  • hostname
  • state
  • data_rate (MB/s)
  • event_rate (Hz)
  • output_buffer_fill (percent)
  • start
  • stop
  • restart
  • n_ports
  • port:<id>:…
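
One way to keep the exported PVs consistent with this list is to derive them from a single record per provider; a minimal sketch (the dataclass and helper function are illustrative, only the PV names follow the scheme above):

    from dataclasses import dataclass

    @dataclass
    class DataProvider:
        id: int
        name: str
        hostname: str
        state: str                  # active / inactive
        data_rate: float            # MB/s
        event_rate: float           # Hz
        output_buffer_fill: float   # percent
        n_ports: int

    def data_provider_pvs(p, prefix="RC:"):
        """Map one provider record to its <P>:data:<id>:* PV names."""
        base = f"{prefix}data:{p.id}:"
        return {
            base + "name": p.name,
            base + "hostname": p.hostname,
            base + "state": p.state,
            base + "data_rate": p.data_rate,
            base + "event_rate": p.event_rate,
            base + "output_buffer_fill": p.output_buffer_fill,
            base + "n_ports": p.n_ports,
        }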

Data provider ports

Are the data output taps on the data providers. Each port has an attached port number and data protocol.

Input for each port:
  • port number
  • data protocol (stream, trans, drasi, other?)
  • port state (free, connected)
  • last time that run control has seen data on this port
  • attached to pipeline
Exported PVs with a prefix (<P>:data:<data-id>:port:<id>:):
  • name
  • number
  • protocol
  • state
  • time_of_last_data
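
The time_of_last_data value can be used to flag ports whose data stream has gone quiet even though they are nominally connected; a small sketch (the timeout is an arbitrary example value):

    import time

    def port_is_stale(time_of_last_data, timeout=60.0):
        """True if no data has been seen on the port for longer than timeout seconds."""
        return (time.time() - time_of_last_data) > timeout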

Storage providers

Are a sink for data in LMD format, accepting a stream of raw experiment data.

Storage providers are in one of four states:
  • stopped
  • active, unable to write data
  • active, able to write data (standby)
  • writing data
Input / feedback from storage providers:
  • name
  • hostname
  • state
  • rate of data written
  • storage path prefix (if any)
  • space left on storage
  • currently opened file path
  • error condition
Internal state:
  • attached to provider
Exported PVs with a prefix (<P>:storage:<id>:):
  • name
  • hostname
  • state
  • data rate (MB/s)
  • path_prefix
  • space left (MB)
  • open_file
  • error
  • stop
  • start
  • restart
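
The storage states could be modelled explicitly so that run control only routes data to providers that are ready to write; a sketch with illustrative names:

    from enum import Enum

    class StorageState(Enum):
        STOPPED = "stopped"
        ACTIVE_NOT_WRITABLE = "active, unable to write data"
        STANDBY = "active, able to write data"
        WRITING = "writing data"

    def can_accept_data(state):
        """A pipeline may only start writing towards a provider in standby or already writing."""
        return state in (StorageState.STANDBY, StorageState.WRITING)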

Metadata providers

Are providers of additional information.

Some obvious examples:
  • accelerator parameters: Ion (primary, secondary), energy, beam line,
    average beam intensity, S1 slits
  • daq scaler rates: trigger, deadtime, event time distribution, detector
    scalers, ROLU, LOS
  • detector rates: LOS/ROLU
  • slow control settings: e.g. magnetic field, target information,
    ROLU window size/offset, detector positions
  • experiment information: facility, experiment number/name, run number,
    first file number, experiment phase (setup, beam, etc)
  • user information: current shift members
Input:
  • name
  • metadata
    • short format (machine readable)
    • long format (human readable)
    • binary format ?
  • state
Exported PVs with a prefix (<P>:metadata:<id>:):
  • name
  • state

Metadata providers are treated as ‘non-critical’, but a metadata provider that is in an error state should be made obvious to the user.
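
A metadata provider could expose both the short and the long format through a small common interface; a sketch with hypothetical accelerator fields:

    class AcceleratorMetadata:
        """Illustrative provider with short (machine readable) and long (human readable) output."""

        def __init__(self, ion, energy_mev_u, beam_line):
            self.ion = ion
            self.energy_mev_u = energy_mev_u
            self.beam_line = beam_line

        def short(self):
            return {"ion": self.ion,
                    "energy_MeV_u": self.energy_mev_u,
                    "beam_line": self.beam_line}

        def long(self):
            return (f"Primary ion {self.ion} at {self.energy_mev_u} MeV/u "
                    f"delivered via beam line {self.beam_line}.")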

Logging providers

Run control automatically logs all user actions and actor state changes.

Example:
  • A user U starts a pipeline P.
  • Run control logs:
    • that the pipeline P was started by user U at a time T.
    • that the pipeline P enters running state and that data is being written to the currently open file F in the storage provider S starting from time T.
    • metadata information attached to pipeline P at time T

ELOG logger

A new ELOG entry is created with user / run number / pipeline name information. Metadata in long format is written to the ELOG entry text.
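
One possible implementation shells out to the standard elog command-line client; the host and logbook names are placeholders, and the exact client options should be checked against the local ELOG installation:

    import subprocess

    def post_elog_entry(user, run_number, pipeline, long_metadata,
                        host="elog.example.org", logbook="DAQ"):
        """Create a new ELOG entry for a pipeline action (sketch, not a fixed interface)."""
        subprocess.run(
            ["elog", "-h", host, "-l", logbook,
             "-a", f"Author={user}",
             "-a", f"Subject=Run {run_number} ({pipeline})",
             long_metadata],
            check=True,
        )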

Database logger

A database entry is written with user action information or pipeline information.
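
A minimal sketch using the sqlite3 module from the Python standard library; the table layout is only an illustration of what such a log entry could contain:

    import sqlite3
    import time

    def log_action(db_path, user, action, pipeline, message):
        """Append one user action or pipeline state change to the run database."""
        with sqlite3.connect(db_path) as con:
            con.execute("""CREATE TABLE IF NOT EXISTS run_log
                           (timestamp REAL, user TEXT, action TEXT,
                            pipeline TEXT, message TEXT)""")
            con.execute("INSERT INTO run_log VALUES (?, ?, ?, ?, ?)",
                        (time.time(), user, action, pipeline, message))

    # log_action("runcontrol.db", "r3b", "run", "r3b_main", "start of physics run")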

File logger

Similar to the database logger.

Pipelines

Are a connection between a data provider port and a storage provider, set up for a user. The data provider port must be free; the pipeline then blocks this port for other pipelines.

Pipelines are created once and can be used on demand. Several used pipelines can be running in parallel, if they don’t occupy overlapping ports. Stopped pipelines may have overlapping ports.

Running or stopping a pipeline requires giving a message as to why this is done (contextual information).
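
A sketch of how port exclusivity and the mandatory reason message could be enforced (class and attribute names are illustrative):

    class Pipeline:
        """Connects one data provider port to one storage provider."""

        def __init__(self, name, port, storage):
            self.name = name
            self.port = port            # port object with a state of 'free' or 'connected'
            self.storage = storage
            self.run_state = "stopped"

        def run(self, user, reason):
            if not reason:
                raise ValueError("a reason message is required to run a pipeline")
            if self.port.state != "free":
                raise RuntimeError("port is already occupied by another pipeline")
            self.port.state = "connected"
            self.run_state = "running"
            # here: log that 'user' started the pipeline and why (database, ELOG, file)

        def stop(self, user, reason):
            if not reason:
                raise ValueError("a reason message is required to stop a pipeline")
            self.run_state = "stopped"
            self.port.state = "free"
            # here: log that 'user' stopped the pipeline and why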

Input:
  • pipeline id
  • name
  • data provider port
  • storage provider
  • storage path (pattern with run number etc)
  • state (free / used)
  • run state (stopped / running)
  • current user
  • requested by user
  • start time
Exported PVs with a prefix (<P>:pipelines:<id>:):
  • name
  • storage_provider
  • storage_path
  • state
  • run_state
  • user
  • requested_by
  • start_time
  • n_metadata_providers
  • metadata:<n>:id
  • metadata:<n>:name
  • metadata:<n>:state

Users

Are the actors that configure, use and run pipelines. Each user can request several pipelines for exclusive use. The user can then run or stop all of these pipelines synchronously on demand. A used pipeline is blocked for all other users. Another user can request a used pipeline, which serves as a hint to its current user. Metadata is stored on a per-user basis. One or more metadata providers may be added to a user. The information is then stored in the logging facilities (database, ELOG, etc…).

A user can be seen as a group of people conducting a specific experiment at a certain facility.

Input:
  • Activate / deactivate a pipeline
  • Start / stop active pipelines
  • Change pipeline storage path
  • Manual (re)set of run number / file number
  • Request usage of an inactive pipeline
  • Experiment name, number
  • Facility name
  • Shift members
Actions:
  • use pipeline
  • free pipeline
  • run pipeline
  • stop pipeline
  • set storage path
  • set run number
Exported PVs with a prefix (<P>:users:<id>:):
  • user_name
  • experiment_name
  • facility_name
  • n_pipelines
  • pipeline:<n>:id
  • pipeline:<n>:name
  • pipeline:<n>:state
  • pipeline:<n>:start
  • pipeline:<n>:stop
  • pipeline:<n>:use
  • pipeline:<n>:free
  • pipeline:<n>:run_number
  • pipeline:<n>:storage_path
Example 1:
A user ‘r3b’ uses the pipeline ‘r3b_main’. The user attaches the metadata providers ‘accelerator’, ‘r3b_main_trigger_scalers’ and ‘r3b_los_rolu_rates’. The user starts running the pipeline.
Example 2:
A user ‘neuland’ uses the pipeline ‘r3b_neuland’.
Example 3:
A user ‘xb’ uses the pipelines ‘xb_left’ and ‘xb_right’.
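
As a rough illustration, example 1 could be driven by a short channel access script (again via pyepics; the prefix, the numeric indices for user and pipeline and the storage path are hypothetical and only follow the naming scheme above; attaching the metadata providers is left out, as the corresponding PVs are not specified here):

    import epics

    USER = "RC:users:0:"        # assuming user 'r3b' is exported as users:0

    # Reserve the pipeline 'r3b_main' (assumed to be pipeline:0) for this user ...
    epics.caput(USER + "pipeline:0:use", 1)

    # ... set where the data should be written (illustrative pattern) ...
    epics.caput(USER + "pipeline:0:storage_path", "/stage/r3b/run%03d.lmd")

    # ... and start it.
    epics.caput(USER + "pipeline:0:start", 1)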