Common DFOS tools:
|
dfos = Data Flow Operations System, the common tool set for DFO |
Note:
- In this documentation, IDPs means Internal Data Products and stands for science data products as created by the phoenix process.
- MCALIBs is short for master calibrations, as created by the phoenix process.
- If nothing is mentioned in particular, the documentation applies to all kinds of products. |
In some parts this documentation splits into sections applicable to the ingestion of IDPs and others applicable to MCALIBs. The IDP part is then shaded light blue (like this cell), |
while the MCALIB part is shaded light yellow (like here). You can then ignore the respective other part. |
PHOENIX |
ingestProducts for phoenix |
The tool ingestProducts is enabled for standard DFOS and PHOENIX environments. The environment is recognized from the key THIS_IS_PHOENIX in .dfosrc which is set to YES for PHOENIX environments, and to NO for DFOS environments. |
Furthermore, if THIS_IS_PHOENIX=YES, the tool can recognize the MCALIB mode (ingestion of phoenix-generated master calibrations) via the key MCAL_CONFIG in config.phoenix. |
Supported modes for ingestProducts
environment | mode | enabled? | using ... | storage |
DFOS_OPS | CALIB | YES | dpIngest | NGAS |
DFOS_OPS | SCIENCE | NO | | |
PHOENIX | CALIB (MCALIBs) | YES** | dpIngest | NGAS |
PHOENIX | SCIENCE (IDPs) | YES* | ingestionTool | phase3 |
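The mode recognition described above can be sketched roughly as follows. This is only an illustration, not the code of ingestProducts: it assumes that .dfosrc uses shell-style assignments (e.g. `export THIS_IS_PHOENIX=YES`), that config.phoenix holds whitespace-separated key/value lines with `#` comments, and that a non-empty MCAL_CONFIG value signals the MCALIB mode. The function names and the returned labels are hypothetical.

```python
import re


def read_shell_var(text, name):
    """Return the last value assigned to `name` in shell-style text, or None."""
    pattern = re.compile(r"^\s*(?:export\s+)?" + re.escape(name) + r"=(\S*)")
    value = None
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            value = match.group(1).strip("\"'")
    return value


def read_config_key(text, key):
    """Return the first value found for `key` in 'KEY VALUE' lines, or None."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2 and fields[0] == key:
            return fields[1]
    return None


def ingestion_mode(dfosrc_text, phoenix_config_text=""):
    """Classify the environment (labels are illustrative, not DFOS names)."""
    if read_shell_var(dfosrc_text, "THIS_IS_PHOENIX") != "YES":
        return "DFOS_OPS"
    # Assumption: MCALIB mode is signalled by a non-empty MCAL_CONFIG key.
    if read_config_key(phoenix_config_text, "MCAL_CONFIG"):
        return "PHOENIX_MCALIB"
    return "PHOENIX_IDP"
```

A caller would pass the contents of the two files as strings, for example `ingestion_mode(open(dfosrc_path).read(), open(phoenix_cfg_path).read())`, where both paths are supplied by the user.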
In the following the details for the PHOENIX environment are described. The behaviour in the DFOS_OPS environment is documented here.
IDP ingestion means ingesting the science data products and their ancillary files into the phase3 database (for the header information) and into NGAS (for the files). Before starting to ingest an IDP stream, the phase3 environment needs to be defined in config.ingestProducts:
These configuration keys are defined together with ASG. The tool ingestProducts is effectively a wrapper that first calls a preparation tool (converter), and then the phase3 ingestion tool. The converter tool is either a dfs provided tool (for UVES: it adds header information for phase3 compliance and modifies the FITS file structure), or a shell-script customized to the instrument (adding header information for phase3 compliance). They are described in the following.
|
Conversion tool: idpConvert and idp2sdp
UVES. For the UVES IDPs, there is the wrapper idpConvert around the special DFI-provided converter tool idp2sdp. The conversion tool is needed to transform the pipeline-delivered output files into the SDP (science data products) standard format, which is a binary table for spectroscopic data. The tool is instrument-specific and is provided by DFI. For the current installation it is called idp2sdp. For convenience it is wrapped in the helper tool idpConvert.
Other IDPs. For the other IDPs, all structural conversion is done by the pipelines, and only header keys need to be added. This is done by customized header-conversion tools like idpConvert_xs and idpConvert_gi, which are created and maintained by QC. Find their description below.
Installation (UVES only!)
idp2sdp comes as part of the phase3 software installation. idpConvert is installed on sciproc@muc08:$HOME/UVES/bin.
Config file (UVES only!)
The idp2sdp tool has a config file in $DFO_CONFIG_DIR, idp2sdp.cfg (note its special syntax, due to its non-DFOS nature):
How to call (UVES only!)
|
IDP conversion (other instruments)
The corresponding wrapper scripts always follow the naming pattern idpConvert_<xx> (e.g. idpConvert_xs, idpConvert_gi). Their name is configured in config.ingestProducts under CONVERTER. |
IDP output (all instruments)
The converted products are found in the subdirectory $DFO_SCI_DIR/<date>/conv. The converter log file (from idp2sdp or from the conversion scripts) is found in $DFO_SCI_DIR/<date>/CONVERTED. This log file is also exported to the qc@qcweb site and can be found in http://qcweb/~qc/<RELEASE>/logs/<date>/CONVERTED. |
Ingestion: call_IT and IngestionTool
The ingestion tool (call_IT as a wrapper, /opt/dfs/share/IngestionTool.jar as the main component) provides the same kind of functionality as the dpIngest tool for DFOS master calibrations. It is a Java package that is provided by DFI. It takes all files from a specified directory, ingests them into NGAS, and extracts the header keys into the data repository and from there into the phase3 database. For convenience the ingestion tool is wrapped in the helper tool call_IT. This helper tool and the ingestion tool itself are used in the same way for all IDP projects.
Installation
The IngestionTool comes as part of the phase3 software installation. The wrapper tool call_IT is installed in the local $DFO_BIN_DIR.
Config file
The IngestionTool has a config file in $DFO_CONFIG_DIR, ingestiontool.properties. It is filled and maintained by the developer.
How to call
The tool expects the converted IDPs in the subdirectory $DFO_SCI_DIR/<date>/conv. The IngestionTool log is found in $DFO_SCI_DIR/<date>/INGESTED. This log file is also exported to the qc@qcweb site and can be found in http://qcweb/~qc/<RELEASE>/logs/<date>/INGESTED. call_IT adds some information to the tool log file: a bit of statistics (number of new files ingested, already existing files) and of performance (time needed for ingestion). The tool performance is about 1 sec per (UVES) IDP. |
The log file of the ingestion tool reports on the successful execution of the three main steps:
The IngestionTool log also has entries for each single IDP ingestion. Note that the tool log lists every single entry in the database as a "file", which is misleading. If an ingested FITS file is registered as an ANCILLARY file in 5 other FITS files, each such record is counted as a "file" by the tool. This inflates the statistics in the log file. Don't get confused! |
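The difference between the number of log entries and the number of distinct files can be illustrated with a small sketch. It assumes that the file names have already been extracted from the log, one per database entry; the log format itself is not described here, so no parsing is attempted. The function name and the file names in the example are hypothetical.

```python
def log_statistics(entries):
    """Compare the raw entry count with the number of distinct file names.

    `entries` holds one file name per database record listed in the log,
    so a file referenced several times appears several times.
    """
    entries = list(entries)
    return {
        "entries": len(entries),              # what the tool log calls "files"
        "distinct_files": len(set(entries)),  # actual number of different files
    }


# Hypothetical example: one ancillary file recorded three times, one product once.
stats = log_statistics(["anc_001.fits"] * 3 + ["idp_001.fits"])
print(stats)  # {'entries': 4, 'distinct_files': 2}
```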
Configuration of ingestProducts for IDP ingestion: special config keys for phoenix
The tool uses the standard DFOS config.ingestProducts. For the PHOENIX environment, there exist the following special keys:
Section 1: general | ||
# special config keys not needed for DFOS, only for phoenix: | ||
PATH_TO_IT | <full pathname of IngestionTool installation> | # needed for the time being |
CONVERTER | <full pathname of converter tool installation> | # needed for the time being |
PROGRAM_NAME | phase3 program name | |
COLLECTION_NAME | phase3 collection name | |
RELEASE_TAG | phase3 release tag |
In PHOENIX mode the tool reads a few configuration keys from config.phoenix:
config.phoenix | |||
RELEASE | <PROC_INSTRUMENT>, e.g. UVES | <RELEASE_TAG>, e.g. UVESR_2 | read for updating the statistics in daily_idpstat |
INSTR_MODE | <PROC_INSTRUMENT>, e.g. UVES | <INSTR_MODE>, e.g. UVES_ECH | read for updating the statistics in daily_idpstat |
phoenix 2.0 supports not only the creation of IDPs but also of master calibrations. This process has many similarities with the IDP production: it is project-driven (neither bound to nor triggered by daily operations) and comes as a batch (many processing jobs). The main differences from the IDP production are that the (selected) pipeline products are ingested without modifications, do not constitute a phase 3 project (i.e. do not require coordination with ASG), and do not constitute a stream.
Otherwise many aspects of their production and ingestion are very similar to the production and ingestion of operational master calibrations. In particular, the underlying ingestion tool (dpIngest) and ingestion storage (NGAS) are exactly the same. Once ingested, phoenix-created master calibrations are identical to the ones created by the daily workflow. Their main motivation comes from reprocessing after pipeline changes or improvements.
Master calibrations are ingested as they are created by the pipelines. Hence no conversion is needed. Nevertheless the ingestion process has two steps, in formal analogy to the IDP ingestion:
Before calling the ingestion, the phoenix tool already checks for the proper file names to be used upon ingestion (see there). |
Deletion of previous instances
Depending on the configuration, the tool ingestProducts will decide before the ingestion whether any pre-existing master calibrations should be deleted. By default, only those master calibrations that have a new instance (by name) get deleted and overwritten. This might however result in an unwanted mix of old and new master calibrations. In the operational environment, many if not all calibrations are processed and ingested, no matter if actually used for science reduction:
Therefore it might be reasonable not only to overwrite older instances but also to delete the ones which get no new version. The tool supports this by configuration. The user may want to delete (hide) certain master calibrations always, no matter whether they get replaced or not. Several cases could occur:
In general, only those pre-existing master calibrations get deleted which are configured as PHX_DELETE. The deletion of pre-existing master calibrations is a critical step and can be fine-tuned by calling ingestProducts in DEBUG mode, which is interactive and offers file lists for review before actual hiding. The listings are done for the following cases:
|
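One possible reading of the deletion rules is sketched below as plain set operations. This is an interpretation under stated assumptions, not the logic of ingestProducts: it assumes that pre-existing and new master calibrations are compared by name, and that `always_delete` is the subset of names flagged for deletion through the PHX_DELETE configuration. The function name, the result keys and the file names in the example are hypothetical.

```python
def plan_deletions(existing, new, always_delete):
    """Split pre-existing master calibrations into three review lists."""
    existing, new, always_delete = set(existing), set(new), set(always_delete)
    # Pre-existing files that get a new instance with the same name.
    replaced = existing & new
    # Pre-existing files flagged for deletion that get no new instance.
    deleted_without_replacement = (existing & always_delete) - new
    # Everything else stays untouched.
    kept = existing - replaced - deleted_without_replacement
    return {
        "replaced": sorted(replaced),
        "deleted_without_replacement": sorted(deleted_without_replacement),
        "kept": sorted(kept),
    }


# Hypothetical file names, for illustration only.
plan = plan_deletions(
    existing=["M.A.fits", "M.B.fits", "M.C.fits"],
    new=["M.A.fits"],
    always_delete=["M.B.fits"],
)
print(plan)
```

Keeping the three lists separate mirrors the idea of reviewing file lists in DEBUG mode before anything is actually hidden.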
Ingestion of master calibrations
The tool first creates these lists and executes the file DELETEs, calling dpDelete in -force mode. Then it calls dpIngest in the usual way (as for daily operations). All actions (deletion and ingestion) are listed in the standard list_ingest_CALIB file. If configured, the qc1_update part is executed, using the QC1 database table names configured in the section QC1_TABLE of the configuration file. |
Configuration of ingestProducts for MCALIB ingestion: special config keys for phoenix
The tool uses the standard DFOS config.ingestProducts. For the PHOENIX environment, there exist the following special keys:
|
To call the tool in PHOENIX mode for IDPs, make sure to call it in the PHOENIX environment: $THIS_IS_PHOENIX must be YES. This is controlled in $HOME/.dfosrc. |
To call the tool in PHOENIX mode for MCALIBs, you must
|
You can call the DEBUG mode of ingestProducts in the PHOENIX MCALIB environment:
The tool will then ask you for confirmation about the important steps of instance deletion (used only for MCALIB environment). |
Any other call mode is documented in the main page.
The tool writes (both for IDP and MCALIB mode) into the statistics file $DFO_MON_DIR/PHOENIX_DAILY_<RELEASE>, updating the columns for number and size of ingested IDPs. It also calls qc1Ingest to load those entries into the DFO database tables daily_idpstat and monthly_idpstat (see also the WISQ workflow statistics). For MCALIBs, the corresponding parameters need to be interpreted as applicable to MCALIB products. Since the focus of the QC1 tables is to monitor the creation and ingestion process (in terms of performance, disk space etc.), this mixing of IDPs and MCALIBs seems justified.
The last execution of the tool is written into the log file $DFO_LST_DIR/list_ingest_SCIENCE_$DATE.txt. All executions of the tool are logged into $DFO_SCI_DIR/<date>/INGESTED which is also exported to qcweb as http://qcweb/~qc/<RELEASE>/logs/<date>/INGESTED.
The monitor tool phoenixMonitor displays whether or not a certain night with IDPs has already been converted and ingested, by checking for the files $DFO_SCI_DIR/<date>/CONVERTED and INGESTED. For MCALIBs, the tool checks if the files have been properly renamed and ingested.
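The file-based status check for IDP nights can be sketched as follows. This is a minimal illustration and not the code of phoenixMonitor: it only tests for the existence of the two marker files CONVERTED and INGESTED under <sci_dir>/<date>, where `sci_dir` stands for the value of $DFO_SCI_DIR. The function name and the returned dictionary are hypothetical.

```python
from pathlib import Path


def idp_night_status(sci_dir, date):
    """Report whether the CONVERTED and INGESTED files exist for one night."""
    night = Path(sci_dir) / date
    return {
        "converted": (night / "CONVERTED").is_file(),
        "ingested": (night / "INGESTED").is_file(),
    }
```

Called as `idp_night_status(os.environ["DFO_SCI_DIR"], "<date>")` (with `os` imported), it returns two booleans that a monitor page could render as flags.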
For IDPs:
|
For MCALIBs only:
|
For IDPs or MCALIBs: