New GSI Analysis System GO4
Object Oriented On-line Off-line System GO4
GO4 User Requirements Document
H. Essel, N. Kurz, M. Richter; DV&EE
In 1998, GSI started a project for the evaluation and implementation of a new software system for low and medium energy physics experiments. It will cover data acquisition, data analysis, and set-up control in on-line and off-line mode. The new system shall include features of currently used systems, like the GSI in-house software GOOSY [Ref. 2] and the CERN software packages PAW and CERNLIB. The data acquisition system MBS (Multi Branch System) developed at GSI [Ref. 3] will be integrated, but input data may also come from any other acquisition system.
The new object-oriented system (proposed name GO4: GSI Object Oriented On-line Off-line system) shall be available on various UNIX systems and on Windows NT. It might be based on currently available software like the CERN packages ROOT or LHC++, but it would need several extensions to handle the specific requirements of low and medium energy physics experiments.
The first project phase has been used to acquire user requirements and to evaluate existing software systems.
The following list of requirements has been reviewed by most of the potential users of a new system. The users commented on all points, and this document already takes these comments into account. The users also voted on the priority they would like to see in the implementation of the requirements.
Table of contents
2 Table of contents
3 Lists
3.1 List of requirements
4 Introduction
4.1 Notes on this document
4.2 Abbreviations and definitions
5 General classification of user environments for GO4
5.1 Analysis Classes
5.2 Analysis Operation Modes
5.3 Target system users
6 User Requirements
6.1 General Requirements
6.2 User Interface
6.3 Data Management
6.4 Analysis
6.5 Display
6.6 Data Acquisition
6.7 Constraints
7 Various specifications
7.1 List of existing software
7.1.1 GOOSY
7.1.2 IDL
7.1.3 LEA
7.1.4 LHC++
7.1.5 MBS
7.1.6 National Instruments
7.1.7 ROOT
8 References
3 Lists
3.1 List of requirements
Requirement 1: P 8 GO4 shall be simple to use and easy to learn
Requirement 2: P 7 User programming shall only be necessary in one place (the analysis program) and in command lists
Requirement 3: P 7 All functions shall be provided by commands
Requirement 4: P 8 It shall be a portable data acquisition and analysis system including front-end, workstation or PC, and storage devices for experiments outside GSI
Requirement 5: P 7 The new system shall run on UNIX and Windows NT to increase acceptance
Requirement 6: P 7 The system shall include as little licensed software as possible
Requirement 7: P 8 A new system shall provide a graphical user interface (GUI)
Requirement 8: P 8 A new system shall provide context sensitive help and/or assistance
Requirement 9: P 5 A new system shall be programmable through a graphical user interface
Requirement 10: P 7 The graphical user interface shall be "compatible" with the GUI yet to be built for the MBS
Requirement 11: P 8 A new system shall provide a script interface with a fully functional script language
Requirement 12: P 7 All user actions, whether graphical, command line, or script based, shall be logged
Requirement 13: P 7 Results of scripts and actions shall be accessible by subsequent scripts and actions
Requirement 14: P 9 The data analysis shall be controllable during execution
Requirement 15: P 9 The data analysis shall be able to run interactively or in batch
Requirement 16: P 5 The GUI shall run on more platforms than the analysis system itself. The user shall be able to add new elements to an existing GUI or correct existing ones.
Requirement 17: P 8 The data analysis shall be independent of the input source
Requirement 18: P 8 The analysis shall be able to get event input from DAQ servers
Requirement 19: P 8 A new system shall be able to exchange relevant data with other systems
Requirement 20: P 8 A new system shall provide data management
Requirement 21: P 8 Data elements shall be multidimensional arrays of aggregates
Requirement 22: P 8 A new system shall automatically save accumulated data in case of termination
Requirement 23: P 8 A new system shall provide easy to use programming interfaces to the event data
Requirement 24: P 7 It shall also be possible to call most parts of the analysis package from user programs
Requirement 25: P 7 A new system shall provide easy to use programming interfaces to the graphics
Requirement 26: P 6 A new system shall provide tools to map event data to detector geometry
Requirement 27: P 7 Efficient accumulation of many large matrices (e.g. 4k * 4k long words = 64 MB each) shall be provided
Requirement 28: P 5 The system shall optionally provide the symmetrical creation of multidimensional spectra
Requirement 29: P 7 A new system shall provide complex projections in multidimensional data spaces
Requirement 30: P 8 The propagation of data errors shall be handled correctly
Requirement 31: P 8 Methods other than fit functions needed for the analysis of statistical data shall also be provided
Requirement 32: P 5 Connections to a Gamma line data base shall be provided
Requirement 33: P 5 Interactive graphical analysis of gamma-ray coincidence data shall be provided
Requirement 34: P 6 Analysis of time correlations between events (Alpha ray mother-daughter relations) on a time scale of ms to min shall be provided
Requirement 35: P 6 A kind of trending storage of data, i.e. keeping events in a ring buffer for a given time window, shall be provided
Requirement 36: P 8 The analysis shall have online access to DAQ control functions
Requirement 37: P 7 Some parts of the analysis shall optionally run online in the front-ends
Requirement 38: P 7 The analysis speed shall be equivalent to the data acquisition speed
Requirement 39: P 8 The display shall operate independently of the analysis execution
Requirement 40: P 9 Scatter plots of all data shall be possible
Requirement 41: P 8 A simple set-up of scatter plots shall be provided
Requirement 42: P 8 Color in scatter plots shall be used to distinguish different kinds of points, e.g. conditions or sources
Requirement 43: P 6 Scatter plots with several points per event shall be provided
Requirement 44: P 5 Any number of scatter plots shall run in the background
Requirement 45: P 8 A "real" re-binning of spectra shall be provided, i.e. the transformation from n channels to m channels
Requirement 46: P 8 The display shall be definable and savable by the user in a simple way
Requirement 47: P 6 Mixed pictures with scatter plots and spectra shall be provided
Requirement 48: P 6 The display shall generate ready to publish paper prints
Requirement 49: P 7 Graphic dumps shall contain all details of a representation including data, scales, comments, errors, windows, etc.
Requirement 50: P 8 Data links to commercial data analysis and presentation software shall be simple
Requirement 51: P 7 The time dependent acquisition of single and multiple data shall be possible
Requirement 52: P 5 Gamma ray input channels shall be sampled by fast flash-ADCs for further detailed analysis (e.g. pile-up) or for storing these data in ring buffers (equidistant in time)
Requirement 53: P 8 Setting and controlling the experimental hardware set-up and accelerator parameters by the data acquisition and/or the real-time data analysis shall be possible
Requirement 54: P 7 A real-time analysis in the data acquisition processor connected with set-up control shall be possible
Requirement 55: P 7 Constant fraction discriminators, amplifiers, high-voltage supplies, etc. shall be set and controlled by the data acquisition
Requirement 56: P 8 Set-up parameters shall be logged by the data acquisition
Requirement 57: P 7 Existing software shall be used as a development base
Requirement 58: P 8 Software common to the community shall be used
Requirement 59: P 7 External software shall be kept unchanged unless changes are accepted and implemented by the authors
Requirement 60: P 7 The required features shall be added to the existing software
This document describes the components and behavior of a data acquisition and analysis software system used in experiments at GSI.
5.1 Analysis Classes
In the following, the environment for analysis systems is split into three categories:
A: Experiments with complex detector set-ups (geometry, tracks, reconstruction, etc.) and high data volume. GSI examples are HADES and FOPI. The CERN experiment ALICE would fit here, too.
B: Experiments with complex histograms and statistical analysis. GSI examples are KAOS, Euroball, FRS, and ESR.
C: Experiments with low computing demands, not too many channels per event, and statistical analysis. Examples are small test or laboratory experiments.
Besides the analysis classes there are three analysis operation modes:
Analysis gets data from a DAQ (A, B, C)
Analysis gets data from storage (A, B, C)
The following user categories are intended to become GO4 users (or requesters), although in principle there shall be no limitation, not even for very large, high energy physics experiments:
The priority of each requirement has been requested from the potential users of GO4 at GSI. The following scale was used:
0 = no comment, 1 = lowest priority, 10 = highest priority
The result of all priority assignments is listed in this document. Each value has been calculated as the rounded mean value of all assignments without counting the "0" assignments for a requirement. The following graphics show the overall result of the poll.
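As an illustration only (the routine below is ours, not part of any GO4 interface), the evaluation rule can be stated as a small function that skips the "0 = no comment" votes and rounds the mean:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Rounded mean of priority votes on the 0-10 scale described above.
// Votes of 0 ("no comment") are skipped; 0 is returned if nobody voted.
int priorityMean(const std::vector<int>& votes) {
    int sum = 0, count = 0;
    for (int v : votes) {
        if (v > 0) { sum += v; ++count; }
    }
    if (count == 0) return 0;
    return static_cast<int>(std::lround(static_cast<double>(sum) / count));
}
```

For example, the votes 0, 8, 7, 0, 9 yield the priority 8.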
6.1 General Requirements
These requirements define general features the GO4 shall have.
GO4 shall take into account that its acceptance will depend on a compromise between an easy-to-use interface and a fully functional system. Many experiments at GSI and in other laboratories have short lifetimes, inexperienced and untrained users, and external users arriving only days before the start of an experiment. There will also be two levels of usage:
Since most users have only little programming experience, the amount of specific code necessary to adapt the system to the individual needs of each experiment shall be limited. This also limits the possible errors introduced by user code.
The whole system shall be accessible through a graphical user interface (GUI), programming interfaces (API), and commands. The command language shall be as close to the programming interface as possible, but simple enough to be used by inexperienced and untrained users. For experiments with fixed, limited functionality a simple (GUI) setup tool shall be provided.
It is necessary to have a hardware and software system which can easily be carried to external experiments at remote institutes or which can easily be adopted by other institutes. Acquisition and analysis system shall be separated, i.e. they shall run independently of each other such that each part is exchangeable.
The platforms should be at least:
The order of importance is not yet fixed. The implementation should rely on standards, e.g. POSIX, Java, wherever possible.
Since many users of the new system will be from university institutes or from other countries, it would help to reduce possible license fees to an absolute minimum.
6.2 User Interface
These requirements define features of the interface to the user, i.e. graphical, procedural, and command interfaces.
A graphical user interface to operate the system improves the usability. It is necessary for all simple experiments run by individuals not well trained in programming languages, specifically not in object oriented methods.
This is normally provided together with a GUI. A tutorial and many real-world examples are required.
This feature would be like the graphical language of LabView or Iris Explorer.
Often the analysis is operated on-line together with the DAQ. Therefore the GUIs shall have the same look and feel.
A comprehensive script language is necessary for batch jobs, but also to execute predefined scripts interactively. All data elements and functions, e.g. graphics, shall be available.
For acquisition and analysis run logging, debugging, and recalling purposes the logging of all user actions is essential.
The output of scripts, like fit results, shall be storable in a way that it can be accessed by subsequent scripts, like UNIX pipes, when the result is not a referenceable object.
More than one execution thread shall be running at a time, e.g. command interface, analysis loop, display, etc.
This is automatically achieved by a sufficient script interface.
This can be achieved, e.g. by using Java together with Web interfaces. Possible incompatibilities of Java implementations (e.g. Microsoft J++) shall be avoided. The user shall at least be able to tailor the GUI to his needs.
6.3 Data Management
A general purpose data interface shall be provided to read events of arbitrary structure. The analysis software shall be independent of the data input, i.e. on-line or off-line, whether from tape, disk, DAQ, storage, or another analysis.
The DAQ systems provide various event servers, delivering the event data to various event clients. The new system shall implement individual clients for these servers.
A new system shall be able to process data, i.e. histograms and event or other data, as produced by DAQ, slow control, simulations, and other commonly used systems. This is the GEF format for histograms, the MBS format for event data, N-tuples for compressed event data, and various ASCII formats. The system shall also provide interfaces to implement the support of other event data formats.
Organization of data elements, I/O and storage shall be supported by appropriate tools. All parameters of an experiment or analysis run shall be stored in standard data bases. These include set-up parameters of DAQ, calibration parameters, filters, run specifications, etc. The parameters shall be accessible from the analysis software, from scripts, and from the GUI.
Histograms and other (user) data elements shall be referenced by names. If many data elements of the same kind exist, it shall be possible to process multidimensional arrays of such elements (name arrays).
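A minimal sketch of what name-based access and name arrays could look like; the class and method names here are invented for illustration and are not the GO4 data management API:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical registry: data elements are referenced by name, and a
// "name array" is modelled as a family of indexed names "base[0]"..."base[n-1]".
struct Histogram {
    std::string name;
    std::vector<long> bins;
};

class Registry {
    std::map<std::string, Histogram> objects_;
public:
    Histogram& book(const std::string& name, std::size_t nbins) {
        return objects_.emplace(name, Histogram{name, std::vector<long>(nbins)})
            .first->second;
    }
    Histogram* find(const std::string& name) {
        auto it = objects_.find(name);
        return it == objects_.end() ? nullptr : &it->second;
    }
    // Book the whole name array in one call.
    void bookArray(const std::string& base, std::size_t n, std::size_t nbins) {
        for (std::size_t i = 0; i < n; ++i)
            book(base + "[" + std::to_string(i) + "]", nbins);
    }
};
```

Processing all elements of such an array then reduces to a loop over the generated names.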
It saves a lot of time (CPU and human) if accumulated or calculated data are not lost in case of abnormal program termination. This is true especially for histograms. Well implemented exception handling will not only keep the data consistent but will also ease restart procedures.
6.4 Analysis
This means the "classical" user event routine does the analysis work, but the data handling is done by the analysis package, "hidden" from the user. This is called the "implicit event loop".
It shall also be possible for a user's program to call routines like "get buffer", "get event", "skip event", etc. In this case the user may control the data flow directly. This is called the "explicit event loop".
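The two styles can be sketched as follows (class and routine names are invented for illustration; they do not represent the actual GO4 interface):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// An event is reduced here to a bare vector of channel values.
using Event = std::vector<int>;

class EventSource {
    std::vector<Event> events_;
    std::size_t next_ = 0;
public:
    explicit EventSource(std::vector<Event> ev) : events_(std::move(ev)) {}

    // Explicit event loop: the user's program drives the data flow
    // with primitives like "get event" and "skip event".
    bool getEvent(Event& out) {
        if (next_ >= events_.size()) return false;
        out = events_[next_++];
        return true;
    }
    void skipEvent() { if (next_ < events_.size()) ++next_; }

    // Implicit event loop: the framework iterates over all events and
    // calls the user's analysis routine for each of them.
    void run(const std::function<void(const Event&)>& userAnalysis) {
        Event ev;
        while (getEvent(ev)) userAnalysis(ev);
    }
};
```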
There shall be an API to access graphical objects like polygons or scatter plot points.
The data representation in the event data might not be suited for further processing. A representation fitting detectors or physical items shall be supported.
The main current problem is the memory limitation for the accumulation of many (50-200) large 2-dimensional spectra. It might be necessary to have fast disk accumulation routines.
In the case of 2-dimensional Gamma-Gamma coincidence spectra, the accumulation of just half of the spectrum is often sufficient, since it is symmetric about the diagonal. This would save memory and disk storage.
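The saving can be sketched with triangular indexing: only the lower half of the matrix (including the diagonal) is stored, and both fill orders map to the same cell. The class below is a hypothetical illustration, not a GO4 interface:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Symmetric coincidence matrix stored as its lower triangle:
// n*(n+1)/2 cells instead of n*n, i.e. slightly more than half the memory.
class SymMatrix {
    std::vector<long> data_;
    static std::size_t tri(std::size_t i, std::size_t j) {
        return i * (i + 1) / 2 + j;   // valid for i >= j
    }
public:
    explicit SymMatrix(std::size_t n) : data_(n * (n + 1) / 2) {}
    void fill(std::size_t a, std::size_t b) {
        if (a < b) std::swap(a, b);   // (a,b) and (b,a) land in the same cell
        ++data_[tri(a, b)];
    }
    long at(std::size_t a, std::size_t b) const {
        if (a < b) std::swap(a, b);
        return data_[tri(a, b)];
    }
};
```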
A mechanism like N-tuples is necessary, but also visualization of complex data and various kinds of filters (conditions). The mechanisms shall be fast and interactively configurable.
Time dependent event data in particular need detailed error and dead-time handling. Statistical errors shall also be propagated correctly.
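As one standard example of such propagation, assuming two independent Poisson-distributed counts a and b in a bin, the error of the ratio r = a/b follows from adding the relative errors in quadrature (the struct and function names are ours):

```cpp
#include <cassert>
#include <cmath>

// sigma_a = sqrt(a), sigma_b = sqrt(b)  =>
// sigma_r / r = sqrt((sigma_a/a)^2 + (sigma_b/b)^2) = sqrt(1/a + 1/b)
struct ValueWithError {
    double value;
    double error;
};

ValueWithError ratioWithError(double a, double b) {
    if (a <= 0.0 || b <= 0.0) return {0.0, 0.0};  // empty bins: no defined ratio
    double r = a / b;
    return {r, r * std::sqrt(1.0 / a + 1.0 / b)};
}
```

For 100 counts over 400 counts this gives a ratio of 0.25 with an absolute error of about 0.028.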
Tools for the handling of Gamma spectra (multi peak spectra) shall be provided, e.g. peak finding algorithms, Gamma peak fit algorithms (not just Minuit), user defined fit functions, and background corrections.
The analysis of complex Gamma ray spectra shall be supported through data from a Gamma line data base.
Complex, two dimensional Gamma ray spectra need specific graphics tools to ease the analysis.
The analysis of time dependent data, e.g. nuclear decay or cooling effects, needs special treatment of data and results. To find long-term correlations, the data shall be "labeled" with time stamps and sorted in a time dependent way.
This is needed for finding long-term correlations, specifically for the analysis of various decays.
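The trending storage of requirement 35 can be sketched as a time-windowed buffer; the names and the millisecond time base are illustrative assumptions, not a GO4 interface:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Time-stamped events are appended (input is assumed time sorted), and
// everything older than the given window, relative to the newest event,
// is dropped on arrival.
struct StampedEvent {
    long long timeMs;
    int value;
};

class TrendBuffer {
    std::deque<StampedEvent> buf_;
    long long windowMs_;
public:
    explicit TrendBuffer(long long windowMs) : windowMs_(windowMs) {}
    void add(const StampedEvent& ev) {
        buf_.push_back(ev);
        // Terminates at the latest at the event just pushed (age 0).
        while (ev.timeMs - buf_.front().timeMs > windowMs_)
            buf_.pop_front();
    }
    std::size_t size() const { return buf_.size(); }
};
```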
Sometimes event data shall be accumulated and analyzed on-line to steer some DAQ hardware set-ups, like stepping motors.
When the online analysis is needed to control the DAQ, direct access to the control hardware is necessary. In this case the analysis shall run in the DAQ front-end; only a subset of the functionality is needed. The visualization shall then be done on a remote node.
In many cases the complete on-line analysis of data is essential for controlling an experiment. In such cases the analysis speed may define the dead-time of the whole data acquisition. This requirement depends very much on processor speed, code efficiency, code optimization, and other software and hardware environment parameters.
6.5 Display
Therefore the system shall be multi-threaded or a high performance data exchange between the histogram and the display tasks shall be available.
Therefore again the system shall be multi-threaded. A high performance data exchange between the analysis input and the display shall be available.
The definition of scatter plots, i.e. the parameters and conditions defining a scatter plot, shall be simply declared by commands.
Scatter points could belong to different sources or different conditions (e.g. true/false). This shall be marked by different colors.
It shall be possible to run several scatter plots simultaneously in the background of a currently visible data display. To fulfill this requirement independently of the X-11 backing store facility, one could think of bit spectra accumulated by the analysis loop.
An algorithm for the re-binning of data shall be available.
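A count-conserving sketch of such an algorithm, handling arbitrary n to m (not only integer ratios), could look like this; the function name is ours:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// "Real" re-binning from n to m channels: each source channel is spread
// over the target channels it overlaps, proportionally to the overlap,
// so the total number of counts is conserved.
std::vector<double> rebin(const std::vector<double>& src, std::size_t m) {
    std::size_t n = src.size();
    std::vector<double> dst(m, 0.0);
    double width = static_cast<double>(n) / m;  // source channels per target channel
    for (std::size_t i = 0; i < n; ++i) {
        double lo = i / width;                  // source channel edges in target units
        double hi = (i + 1) / width;
        for (std::size_t j = static_cast<std::size_t>(lo); j < m && j < hi; ++j) {
            double overlap = std::min(hi, static_cast<double>(j + 1))
                           - std::max(lo, static_cast<double>(j));
            dst[j] += src[i] * overlap / (hi - lo);
        }
    }
    return dst;
}
```

For example, re-binning the four channels 1, 2, 3, 4 into two channels yields 3 and 7, preserving the total of 10 counts.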
A collection of display objects shall be describable and - combined as a new object - storable and recallable.
Pictures combining different spectra and scatter plots shall be definable. Such a picture can then be recalled as a whole instead of as a series of single plots.
The quality of data graphics shall be as high as possible. This would allow the creation of publication ready pictures. Therefore the manipulation of graphic objects shall be as extensive but also as simple as possible. Alternatively, requirement 50 could be used.
Thereby a data representation can be redone and even modified easily at any time.
Data exchange with software packages like Origin, Mathematica, IDL, or SATAN/GD (from GSI) shall be possible in a simple way. Such an easy interface would allow further detailed data analysis with commercially available or institute internal software packages.
6.6 Data Acquisition
The base of the data acquisition at GSI will be the Multi-Layer Multi-Branch Data Acquisition System, MBS [Ref. 3]. Nevertheless it shall be possible to use any other data acquisition system in connection with GO4. The following requirements are extensions to the already existing system and to the parts currently under development.
The time dependence of accumulated data (counters, spectra, ...) shall be preserved by the data acquisition system. This feature is important for experiments like beam cooling at the ESR, laser experiments, and the production of rare isotopes or elements. The time intervals are normally not equal, and the acquired data have a time dependent normalization.
A detailed pulse analysis of certain input channels allows the detection and correction of abnormal situations like pulse pile-up. Thereby the efficiency and the resolution of detector systems can be improved.
For many experiments (e.g. on the ESR) there shall be a close connection and feedback between the hardware set-up control, the accelerator control, the data acquisition system, and even the data analysis system.
The analysis of data already in the front-end processor allows the evaluation of early trigger conditions or the reduction of input channels. Feedback mechanisms also need such a fast analysis.
In some cases, the time or temperature drift of the front-end electronics or power supplies can only be corrected after a more detailed analysis of the acquired data.
The logging of the whole experiment set-up shall be done in a unified manner.
6.7 Constraints
There is no need and no time to develop a new system from scratch.
The preferred programming languages shall be C++ or Java.
Good relations to the authors of external software used in GO4 are essential.
Required features shall only be added to existing software packages.
Currently there are no further request categories.
7.1 List of existing software
LEA LEan Analysis software; http://www-gsi-vms.gsi.de/anal/lea.html
Libraries for HEP Computing - LHC++, software package for the LHC experiments at CERN, Geneva, Switzerland; http://wwwinfo.cern.ch/asd/lhc++
Multi-Layer Multi-Branch Data Acquisition System MBS; MBS User Manual and MBS Reference Manual; http://www-gsi-vms.gsi.de/daq/home.html
LabView and BridgeView are software products of National Instruments Corp., Austin, Texas USA; http://www.natinst.com
The ROOT System, NA49 experiment, CERN, Geneva, Switzerland; http://root.cern.ch
[Ref. 1] ESA Software Engineering Standards, European Space Agency PSS-05-0, Issue 2 February
[Ref. 3] Multi-Layer Multi-Branch Data Acquisition System MBS; MBS User Manual and MBS Reference Manual; http://www-gsi-vms.gsi.de/daq/home.html
GSI Helmholtzzentrum für Schwerionenforschung, GSI