Overview

This section gives an overview of the components involved in two activities. The first is acquiring data, which is performed by each component that provides data to the final data product and is coordinated by OCM. The second is creating the data product to be archived and notifying the Online Archive System, which is performed by DPM. Other components in OCS are therefore generally not covered.

Introduction

Stylistic Conventions

Note

Although diagrams sometimes follow UML styling, they are not created to be formally correct, but to convey information efficiently.

The following visual convention is used for components (or systems), control and data flow:

digraph G {
    # Config
    node [shape=component,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    subgraph Comps {
        Component;
        Data [shape=tab];
    }

    subgraph Flows {
        # Components
        A;
        B;
        C;

        A -> B [label=" Control Flow (solid)"];
        B -> C [label=" Data Flow (dashed)",style=dashed, arrowhead=onormal];
    }
}

Fig. 5 Stylistic conventions

The direction of control indicates who the initiator is. The most common case is request/reply, in which case the diagram shows the requestor A initiating control by commanding B.

Data flow is typically reserved for out-of-band information that is not carried as part of the control flow. In practice, data flow in this manual usually means FITS files that are created by one component and later transferred for consumption by another component. Strictly speaking, the actual transfer then depends on how components are deployed. The data flow can be thought of as both the logical and the physical information transfer, depending on deployment and context.

Note

There are exceptions to using request/reply, e.g. the interaction with OLAS, but in this overview they can be considered conceptually equivalent.

Conceptual Model

A conceptual data model can be useful as a basis for understanding the more detailed documentation. The following diagram shows data flow, data entities and multiplicities for how a single Data Product is created for a single Data Acquisition:

digraph G {
    # Config
    background = transparent;
    node [shape=component,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "helvetica", fontsize=10, style=dashed, arrowhead=onormal];

    # Components/Data

    {
        rank=same;
        OCM [label="OCM[1]"];
        DataSource [label="Data Source [1 .. \*]"];
    }

    DataSourceKeywords [shape=tab,label="JSON Keyword [0 .. \*]"];
    OcmKeywords [shape=tab,label="JSON Keyword [0 .. \*]"];

    FITS [shape=tab, label="FITS [0 .. \*]"];
    DPM [label="DPM [1]"];
    DataProduct [shape=tab,label="Data Product [1]"];
    Receiver [label="Receiver [0 .. \*]"];

    # Flow
    OCM -> OcmKeywords -> DPM;
    DataSource -> DataSourceKeywords -> OCM;
    DataSource -> FITS;# [label="0 .. \* files"];
    FITS -> DPM;
    DPM -> DataProduct;
    DataProduct -> Receiver;
}

Fig. 6 Conceptual model of a Data Acquisition which results in a Data Product delivered to configured receivers. For the sake of readability the figure does not include the details surrounding control flow.

A Data Source is any component that is configured to be used for the specific Data Acquisition. Colloquially it may be referred to as a subsystem, like the telescope or detector subsystem. More accurately, the Data Sources are one or more processes from those subsystems that implement a supported interface for acquiring data.

  • There is one instance of OCM, which coordinates the Data Acquisition and acts as the interaction point for clients.

  • For each Data Acquisition there are one or more Data Sources providing data. A Data Source can be any process that implements the supported MAL interfaces from [RD3,RD4].

    In brief, the Data Sources implementing the Metadata Acquisition Interface [RD3], which are started/stopped automatically based on the state change of primary Data Sources:

    • Are informed by OCM when to start/stop. This happens before any primary Data Sources are started and after all primary Data Sources are stopped.

    • Can provide FITS keywords via FITS file or via MAL command response.

    • Can provide one or more FITS files containing keywords and/or extensions.

    • Examples: FCS Device Manager and RTCTK Metadata Collector.

    The Data Sources implementing DCS Recording Interface [RD4]:

    • Are informed by OCM when to start (either directly or synchronized to e.g. an absolute time).

    • Can be stopped by OCM at the request of a user with StopDaq(). They are not stopped automatically by OCM.

    • Can decide to stop by themselves (if e.g. configured with a fixed integration time), in which case they inform OCM via a publish/subscribe topic.

    • Can provide one or more FITS files containing keywords and/or extensions.

    • Examples: NGC2 and CCF Technical Cameras.

  • Each Data Source may provide zero or more FITS files or keywords encoded in JSON format. FITS files are stored on a file system whereas the JSON keywords are provided to OCM via the MAL interface.

  • There is one instance of DPM handling a Data Acquisition that receives all JSON keywords from OCM and transfers all FITS files from the origin filesystem to where DPM is deployed.

  • DPM then produces a single Data Product from the input sources in a process referred to as merging.

  • DPM can deliver that Data Product to any number of receivers using post-processing recipes. In practice this will almost always be one recipe with one receiver and that is to deliver the Data Product to OLAS.

OCM is a special case: it is always implicitly included as a data source. OCM will always deliver standard and user-provided FITS keywords for the final Data Product.

For a more detailed overview see section Data Product Creation.
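The multiplicities in the conceptual model can be written down as a small data structure. The following is a minimal sketch only, not the OCM or DPM implementation: the class names `DataSource` and `DataAcquisition`, the method `dpm_inputs` and all field names and example values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataSource:
    """One process providing data to a Data Acquisition (hypothetical model)."""
    name: str
    fits_files: List[str] = field(default_factory=list)             # FITS [0 .. *]
    json_keywords: Dict[str, object] = field(default_factory=dict)  # JSON Keyword [0 .. *]


@dataclass
class DataAcquisition:
    """Inputs collected for a single Data Acquisition (hypothetical model)."""
    daq_id: str
    sources: List[DataSource]                                       # Data Source [1 .. *]
    ocm_keywords: Dict[str, object] = field(default_factory=dict)   # OCM is an implicit source
    receivers: List[str] = field(default_factory=list)              # Receiver [0 .. *]

    def __post_init__(self) -> None:
        if not self.sources:
            raise ValueError("a Data Acquisition needs at least one Data Source")

    def dpm_inputs(self) -> Tuple[List[Dict[str, object]], List[str]]:
        """Return the keyword sets and FITS files that DPM would merge."""
        # All JSON keywords reach DPM through OCM; OCM's own keywords come first here.
        keyword_sets = [self.ocm_keywords]
        keyword_sets += [s.json_keywords for s in self.sources if s.json_keywords]
        fits_files = [f for s in self.sources for f in s.fits_files]
        return keyword_sets, fits_files


# Example values are illustrative only.
daq = DataAcquisition(
    daq_id="daq-1",
    sources=[
        DataSource("detector", fits_files=["/data/det.fits"]),
        DataSource("metadata", json_keywords={"ESO TEL ALT": 45.0}),
    ],
    ocm_keywords={"OBJECT": "example"},
    receivers=["OLAS"],
)
keyword_sets, fits_files = daq.dpm_inputs()
```

Because every `DataAcquisition` instance carries its own sources and receivers, the sketch also reflects the point that configuration is a per-Data Acquisition property.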

Important

All configuration that is related to a Data Acquisition is a per-Data Acquisition property. Data Acquisitions are designed to be independent to allow concurrency without surprising side effects. This also means there is no static data source configuration in OCM. The data sources and other parameters are provided when creating the Data Acquisition.

Note

Neither OCM nor DPM is responsible for deleting Data Products that might no longer be useful after post-processing. This activity falls within the scope of an operational procedure that frees disk space after confirming the files can be removed.

Control and Data Flow

The following sections provide a simplified overview of the control and data flow from Data Acquisition to Data Product delivered to the Online Archive System (OLAS) [1].

digraph G {
    # Config
    node [shape=component,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # Components
    Sequencer [label="Sequencer/Client"];
    {
        rank=same;
        OCM [label="OCM*"];
        DPM [label="DPM*"];
    }
    source [label="Data Source(s)", fontname="helvetica italic"];
    OLAS [label="OLAS"];

    #Sequencer -> OCM;
    OCM -> DPM [minlen=2, style=dashed, arrowhead=onormal];
    OCM -> DPM [minlen=2];
    #OCM -> source;

    source -> DPM [style=dashed, arrowhead=onormal];
    DPM -> OLAS;
    DPM -> OLAS [style=dashed, arrowhead=onormal];

    # Use large weight to create vertical alignment
    edge [weight=1000];
    Sequencer -> OCM;
    OCM -> source;
}

Fig. 7 Process overview. Components marked with * are covered by this manual.

Normally the Sequencer is the client that interacts with OCM, but any other client works the same way. The client requests new Data Acquisitions from OCM, specifying the sources to acquire data from and other parameters.

OCM coordinates the Data Acquisition by commanding a number of data sources such as science detectors, function controllers and the telescope. In the diagram these are abstracted as the Data Source(s) component. When the Data Acquisition completes, OCM commands DPM to create the Data Product from the acquired data.

There are no constraints on the number or locality of data sources involved in a Data Acquisition. If a component implements the supported interfaces correctly and is reachable over the network, OCM can control it to acquire data. Refer to [RD2] for options when it comes to Data Acquisitions that span multiple ICSs.

Once the Data Product is complete it is delivered to the archive system OLAS, or handled by whichever post-processing recipe is configured.

[1]

DPM supports post-processing recipes that are configurable per Data Acquisition, but the standard, and also default, is to interface with OLAS to archive the Data Product. Refer to section Data Acquisition Process for additional details.

Control Flow

This section provides an overview of the resulting control flow for individual Data Acquisitions. OCM supports any number of concurrent, but independent, Data Acquisitions. For additional details on the Data Acquisition process and how to control it, see section Data Acquisition.

digraph G {
    # Config
    node [shape=component,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # Components
    Sequencer [label="Sequencer/Client*"];
    {
        rank=same;
        OCM;
        DPM;
    }
    source [label="Data Source(s)*", fontname="helvetica italic"];
    OLAS [label="OLAS*"];

    Sequencer -> OCM [label=" 1."];
    OCM -> DPM [label=" 3.", minlen=2];
    OCM -> source [label=" 2."];

    DPM -> OLAS [label=" 4."];
}

Fig. 8 Control flow overview.

Description of the control flow:

  1. The client initiates a new Data Acquisition, specifying which data sources to acquire data from using the command StartDaq(). The client continues to be able to control the Data Acquisition using the OCM Data Acquisition MAL interface OcmDaqControl.

  2. OCM coordinates the Data Acquisition by commanding data sources to start, stop or abort, as requested by the client.

  3. When data has been acquired OCM commands DPM to produce a Data Product from a specification on how to merge the data together.

  4. When DPM has created the Data Product it is delivered to OLAS. This is done using a special purpose interface and not a normal request/reply MAL control interface.
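The ordering implied by the four steps, combined with the rule that metadata sources are started before and stopped after the primary sources, can be illustrated with a short sketch. It is a toy model under stated assumptions: the function `daq_command_sequence` and the step strings are hypothetical, only a user-initiated stop with StopDaq() is modelled, and no MAL calls are made.

```python
from typing import List


def daq_command_sequence(metadata_sources: List[str],
                         primary_sources: List[str]) -> List[str]:
    """Return the order of commands for one Data Acquisition stopped by the user."""
    steps = ["client -> OCM: StartDaq()"]                      # step 1
    # Step 2: metadata sources are started before any primary source ...
    steps += [f"OCM -> {s}: start" for s in metadata_sources]
    steps += [f"OCM -> {s}: start" for s in primary_sources]
    steps.append("client -> OCM: StopDaq()")
    # ... and stopped only after all primary sources have been stopped.
    steps += [f"OCM -> {s}: stop" for s in primary_sources]
    steps += [f"OCM -> {s}: stop" for s in metadata_sources]
    steps.append("OCM -> DPM: create Data Product")            # step 3
    steps.append("DPM -> OLAS: deliver Data Product")          # step 4
    return steps


# Source names are illustrative only.
sequence = daq_command_sequence(["metadata-collector"], ["detector"])
for step in sequence:
    print(step)
```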

Data Flow

This section provides additional details on the data flow. To give the full picture of how the Data Products are formed, the following diagram also shows how Data Product FITS keywords can be provided as part of the control flow from the client.

digraph G {
    # Config
    node [sep="+2", shape=component,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "helvetica italic", fontsize=10, margin="0.23,0.09"];

    # Components
    Sequencer [label="Sequencer/Client*"];
    {
        rank=same;
        OCM;
        DPM;
    }
    source [label="Data Source(s)*", fontname="helvetica italic"];
    OLAS [label="OLAS*"];

    Sequencer -> OCM [label=" 1. keywords"];
    OCM -> source [style=invisible,arrowhead=none];
    # Not Data Flow:
    # OCM -> DPM [label=" 3."];
    OCM -> DPM [label=" 2.", style=dashed, arrowhead=onormal];
    source -> DPM [label=" 2.", style=dashed, arrowhead=onormal];
    # Not Data flow:
    # DPM -> OLAS [label=" 3."];

    DPM -> OLAS [label=" 3.", style=dashed, arrowhead=onormal];
}

Fig. 9 Data flow overview.

Description of the data flow:

  1. FITS keywords can be provided both when a new Data Acquisition is initiated, with the StartDaq() command, and after it has started, with the UpdateKeywords() command.

  2. The individual FITS files created during Acquiring are transferred by DPM to the host where it is deployed, to be merged into the final Data Product. This also includes OCM, which provides primary HDU keywords to be merged.

    The individual FITS files follow ESO guidelines and specifications and may contain, apart from mandatory FITS keywords, also ESO hierarchical keywords and/or FITS extensions.

    Note

    Files are transferred explicitly using scp or rsync (TBD) if source files are not reachable on a DPM local mount. Files are transferred implicitly if they are located on a distributed file system that is reachable from the DPM host (i.e. reachable on a locally mounted filesystem).

  3. Once the Data Product is created by DPM it is delivered to OLAS.

    If DPM is deployed on the same file system where files are delivered to OLAS, then no additional transfer is made. If the destination file system is either different or remote, another Data Product file transfer is made.
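The rule in the note under step 2 (explicit copy versus implicit access through a local mount) can be sketched as a simple path check. This is an illustration under assumptions, not how DPM decides in practice: the function name `transfer_mode` is hypothetical, and the sketch assumes that a distributed file system exposes a source file at the same absolute path on the DPM host.

```python
from pathlib import PurePosixPath
from typing import Iterable


def transfer_mode(source_path: str, dpm_local_mounts: Iterable[str]) -> str:
    """Return "implicit" if the file is reachable on a DPM local mount, else "explicit"."""
    path = PurePosixPath(source_path)
    for mount in dpm_local_mounts:
        # The file is below the mount point if the mount is one of its parent directories.
        if PurePosixPath(mount) in path.parents:
            return "implicit"   # read directly from the locally mounted file system
    return "explicit"           # would need a copy, e.g. with scp or rsync


# Paths are illustrative only.
print(transfer_mode("/shared/daq/det.fits", ["/shared"]))
print(transfer_mode("/local/daq/det.fits", ["/shared"]))
```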

Data Product Creation

Note

See Version 2.0.0 for current limitations and heuristics for determining if an in-place target should be used.

This section provides an overview of how the final Data Product is created from individual files. The process is fairly simple and mechanical to reduce configuration complexity. In addition, the foreseen data volumes make complicated processing prohibitively expensive. The rule of thumb is that acquired data should be created in the desired format rather than modified afterwards.

Given a list of sources (FITS files or JSON keywords) and a target [2]:

  1. Sources are provided in priority order from high to low.

    The order is significant in the following ways:

    • It determines the relative order of Value Keywords (i.e. value keywords from first source always precede value keywords from subsequent sources).

    • It determines the HDU extension order.

  2. The target primary header is (re-)created.

    Keywords are merged into the target primary HDU, using default or user-provided keyword rules. If the target also contains keywords, those will be included first.

    • The keywords are taken from the primary HDU if the source is a FITS file.

    • If no keyword rules are provided all non-structure keywords are used.

    • If multiple sources provide keywords with the same name, the keyword from the highest priority source is kept.

  3. FITS extensions from sources are copied to target.

    The extensions are copied in priority order from sources as they are. No modifications are done to the extensions.

Important

Merging multiple single-HDU FITS files with data is not supported. Data sources should instead be configured to produce FITS files with one or more extensions that can be appended to the same target file. I.e. there is no support for converting a primary HDU to an HDU extension.

  4. Special keywords are added or updated.

    The following keywords are added or updated, with the FITS checksum keywords (CHECKSUM and DATASUM) computed for each HDU:

    • ORIGFILE

    • ARCFILE

    • CHECKSUM

    • DATASUM

The user can specify the source order and other aspects, such as keyword rules, using Data Product configuration parameters provided with StartDaq() (TBC).

For details see section Data Product Specification.
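The merge steps described above can be sketched in plain Python to make the priority rules concrete. This is a simplified illustration and not the DPM merge implementation: the function `merge_sources` and the dictionary layout are hypothetical, keyword rules and the special keywords are not modelled, and letting target keywords take precedence over same-named source keywords is an assumption made here.

```python
from typing import Dict, List, Tuple

Keyword = Tuple[str, object]


def merge_sources(target_keywords: List[Keyword],
                  sources: List[Dict[str, list]]) -> Dict[str, list]:
    """Merge sources, given in priority order from high to low, into a target.

    Each source is a dict with optional "keywords" (list of (name, value))
    and "extensions" (list of opaque HDU extensions).
    """
    merged: List[Keyword] = list(target_keywords)   # target keywords come first
    seen = {name for name, _ in merged}
    extensions: list = []
    for source in sources:
        for name, value in source.get("keywords", []):
            if name in seen:
                continue        # a higher priority source already provided it
            seen.add(name)
            merged.append((name, value))
        # Extensions are copied unmodified, in source priority order.
        extensions.extend(source.get("extensions", []))
    return {"primary_keywords": merged, "extensions": extensions}


# Keyword names, values and extension names are illustrative only.
result = merge_sources(
    target_keywords=[],
    sources=[
        {"keywords": [("OBJECT", "from-ocm"), ("EXPTIME", 10.0)]},
        {"keywords": [("OBJECT", "from-detector"), ("INSTRUME", "X")],
         "extensions": ["DET1.DATA"]},
        {"extensions": ["META"]},
    ],
)
```

In this example the OBJECT keyword from the first source wins over the one from the second, and the extensions appear in the order in which their sources were listed.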

[2]

One of the source FITS files may be designated as the target to allow in-place merge, where that source file will act as the base for the subsequent sources to merge into.

If no source is designated as the target, an empty FITS file will automatically be created.