Common Trending and QC tools:
Documentation

tqs = Trending and Quality Control System
 

v2.2.1:
- option -n (needed for dfosCron)

v2.3:
- $WEB_SERVER replaced by $DFO_WEB_SERVER

closely collaborating with trendPlotter
[ used databases ]    observations..data_products; reads opslog text files; reads headers for setup keys and MJD-OBS
[ used dfos tools ]   qcdate; dataclient for headers
[ output used by ]    trendPlotter, autoDaily
[ upload/download ]   up: LAST_FILES, ftpWatcher
[ javascript ]        jCheckHealth (stored inside the tool), evaluating the ftpWatcher timestamps on the web page

qc1Parser

Description

This tool parses the OPSLOG files ($DFO_OPS_DIR/QC1_<instr>.<date>.ops.log) which contain information from the pipeline processing on Paranal and are delivered by the Data Handling Administrators (DHA) to the QC machines (under $DFO_OPS_DIR). They are parsed for QC1 parameters which are then fed into the Health Check monitor.

Since generally not all required information is contained in the OPSLOG files, header files are read by qc1Parser. To be as up-to-date as possible, the tool calls dataclient to download the most recent headers (from today), unless this is disabled by either HDR_DOWN=NO (configuration) or option -n (no header update).

The tool is designed to feed the trendPlotter tool. It is embedded in, and called by autoDaily. The trendPlotter calls to update the HC reports are executed by autoDaily.

Short excursion on OPSLOG data. The Paranal pipelines produce QC1 parameters, just like the pipelines used by QC. The Paranal pipelines use a static set of master calibrations and a default set of recipe parameters, often optimized for speed but not for ultimate accuracy. In that sense the Paranal QC1 parameters are preliminary. The ones produced by QC Garching are considered final and are stored in the QC1 database (QC1DB). With the fast data transfer and the incremental processing scheme, the Paranal QC1 values are usually not important anymore, but represent a fall-back solution in case of delivery delays. Therefore they are still handled in the dfos system.

OPSLOG data are delivered once per hour by DHA via ftp to the QC Garching machines.

Structure of QC1 OPSLOGs:

ops log                          parameter set           QC keys
QC1_<instr>_2024-11-11.ops.log   set1 (-START...-STOP)   QC key 1
                                                         QC key 2
                                                         QC key 3
                                                         QC key 4
                                 set2 (-START...-STOP)   QC keys ...
                                 set3 (-START...-STOP)   QC keys ...
                                 etc.

The tool checks for '-START' content. Each time such a line is found, all subsequent content is collected into a data set, which gets a unique identifier (SETn, where n is incremented by one for each new set). A data set is finished when the '-STOP' tag is detected.
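The set-collection logic can be sketched with awk (a minimal illustration; file name, tags and sample content are invented for this example, not taken from a real opslog):

```shell
# Sketch: split an opslog into data sets delimited by -START/-STOP tags.
# The sample content below is illustrative, not a real Paranal opslog.
opslog=/tmp/qc1parser_demo.ops.log
cat > "$opslog" <<'EOF'
-START
QC BIAS MASTER MEAN = 200.1
QC BIAS MASTER RMS = 4.3
-STOP
-START
QC FLAT MASTER MEAN = 12000.0
-STOP
EOF

# Each -START opens a new set (unique id SETn); -STOP closes it.
sets=$(awk '
    /-START/ { n++; inset=1; next }   # open a new set, increment the counter
    /-STOP/  { inset=0; next }        # close the current set
    inset    { print "SET" n " " $0 } # tag collected content with its set id
' "$opslog")
echo "$sets"
rm -f "$opslog"
```

Once tagged this way, each set can be analyzed independently in the second pass described below.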

Next, all data sets are analyzed. They are characterized by a configured unique string which could be PRO.CATG, TPL.ID or recipe ID. This depends on the instrument pipeline, is not standardized and therefore kept configurable. Examples could be 'uves_ech_bias' marking a UVES BIAS data set, or 'MASTER_LOC_FRAME' being the identifier for a VIMOS IFU localization data set.

Once a data set is identified, the tool searches its lines for configured key names (like e.g. QC BIAS MASTER RMS) and catches their content. These keys are in most cases QC1 parameter keys but in principle could be any key in the ops logs.

Typically, QC1 opslogs are not complete: they lack information about MJD-OBS or setup keys which are crucial for trending. Hence MJD-OBS is always read from the obs_metadata..data_products database, and headers are scanned by the tool in order to feed missing keys. The user configures the data source per key (OPSLOG or HEADER).

HC plots and trendPlotter. The tool is designed to deliver input data for the Health Check (HC) process, which is served by the trendPlotter tool. Typically various HC reports exist per instrument, e.g. BIAS. For each such HC report, there is one report type defined for the qc1Parser which has its own unique identifier. Normally the OPSLOG data for the BIAS report are collected into $DFO_TREND_DIR/par_BIAS.dat .

trendPlotter reads from ...   LOCAL    bias.dat, flat.dat etc.
                              QC1DB    giraffe_bias, giraffe_flat etc.
                              OPSLOG   extracted files, e.g.
qc1Parser fills ...                    par_BIAS.dat
                                       par_FFLAMP.dat

Scanning strategy. The tool parses the opslogs in chronological order, from $TODAY backwards over $N_PARSE days (configured). The range is tied to the civil date scale, meaning gaps in the opslog deliveries still count against the window. For instance, if $TODAY is 2024-01-17 and $N_PARSE is configured as 7, all opslogs earlier than 2024-01-11 are not considered, even though no opslogs exist for 2024-01-16, 2024-01-15 and 2024-01-12. In the example below, there are cases where no data were acquired and hence no opslog exists (2024-01-16), and also cases where data were acquired but no opslog entry was created (perhaps because the pipeline was not working; 2024-01-15 and 2024-01-12).

For instruments with data delivery on disks, N_PARSE should cover the typical gaps between data acquisition and data processing by QC (typically 10-12 days). For instruments with fast data delivery via the internet, the opslog data are a fallback solution and need to cover only the past few days.

civil date   ops log   headers   output table
2024-01-17   yes       yes       yes
2024-01-16   no        no        no (no ops logs)
2024-01-15   no        yes       no (no ops logs)
2024-01-14   yes       yes       yes
2024-01-13   yes       yes       yes
2024-01-12   no        yes       no (no ops logs)
2024-01-11   yes       yes       yes
2024-01-10   yes       yes       no (out of range)
2024-01-09   yes       yes       no (out of range)
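The cutoff of the civil-date window can be computed with GNU date; a sketch using the example numbers above (variable names follow the text, the actual dfos implementation may differ):

```shell
# Sketch of the civil-date window: with TODAY=2024-01-17 and N_PARSE=7,
# the window covers 7 civil dates, so the earliest date still considered
# is 2024-01-11. GNU date (Linux) is assumed for the -d syntax.
TODAY=2024-01-17
N_PARSE=7
CUTOFF=$(date -d "$TODAY -$((N_PARSE - 1)) days" +%Y-%m-%d)
echo "$CUTOFF"    # opslogs earlier than this date are ignored
```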

Incremental scanning and output. Per default, the tool works incrementally in two respects: incrementally per date, and incrementally per file.

Incremental per date: qc1Parser parses, per default, only OPSLOG files from the 3 latest dates. Outdated entries in the output tables (older than $N_PARSE days, as configured) are detected and removed automatically. This strategy performs much better than a full scan. For situations when a full scan is desirable, it can be enforced with qc1Parser -f.

Incremental per file: qc1Parser also employs a differential scheme per file. The $DFO_OPS_DIR has two subdirectories, XDIFF and XREF. The OPSLOG files from the last 3 dates are copied from $DFO_OPS_DIR to $DFO_OPS_DIR/XREF. The tool then runs a unix diff. If a difference is found, that difference (and not the whole file) is written into $DFO_OPS_DIR/XDIFF; if no difference is found, nothing is written there. The tool then works against the content of XDIFF only, and thereby does not repeat unnecessary scans. After the scans have finished, the difference files are deleted, the latest version of each OPSLOG file is copied as the new reference file, and the same procedure is repeated the next time. With the hourly delivery pattern, qc1Parser effectively scans only the content that has been added to the latest opslog file during the last hour.
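A minimal sketch of the XREF/XDIFF scheme, assuming standard unix diff output (directory layout from the text; file names and content are invented):

```shell
# Sketch of the per-file differential scan. A temp directory stands in
# for $DFO_OPS_DIR; the opslog content is illustrative.
OPS=$(mktemp -d)
mkdir -p "$OPS/XREF" "$OPS/XDIFF"
log="QC1_GIRAFFE_2024-01-17.ops.log"

printf 'old line\n' > "$OPS/XREF/$log"           # reference: state at last run
printf 'old line\nnew line\n' > "$OPS/$log"      # hourly delivery appended a line

# unix diff against the reference; only the added lines go into XDIFF
diff "$OPS/XREF/$log" "$OPS/$log" | sed -n 's/^> //p' > "$OPS/XDIFF/$log"
added=$(cat "$OPS/XDIFF/$log")
echo "$added"                                    # only this content is scanned

cp "$OPS/$log" "$OPS/XREF/$log"                  # latest version becomes new reference
rm -rf "$OPS"
```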

trendPlotter jobs. The tool collects all new calls of trendPlotter jobs in a file called $DFO_JOB_DIR/opslog_execHC. These calls are evaluated by autoDaily at the end of its execution.

Mail notification. The user may want to configure MAIL_NOTIF to YES, to get notified whenever a new $DFO_JOB_DIR/opslog_execHC has been created.

ftpWatcher. Two components of the HealthCheck process are monitored here:

The timestamp of the last opslog file in the QC system is evaluated against the current timestamp (NOW) by the javascript jCheckHealth. Likewise, the timestamp of the last execution of qc1Parser is evaluated against NOW. Both differences are constantly checked to be smaller than HOUR_DIFF, a hard-coded variable set to 2 hours. The possible outcomes are:
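The freshness check performed by jCheckHealth can be illustrated in shell (the real check runs as javascript on the web page; the timestamps here are invented):

```shell
# Sketch of the jCheckHealth freshness test, transposed to shell.
# An update older than HOUR_DIFF hours flags the process as stale.
HOUR_DIFF=2                       # hard-coded threshold (2 hours)
now=$(date +%s)
last_update=$((now - 3*3600))     # illustrative: last opslog arrived 3 hours ago

age_hours=$(( (now - last_update) / 3600 ))
if [ "$age_hours" -lt "$HOUR_DIFF" ]; then
    status=OK
else
    status=STALE                  # would turn the corresponding box red
fi
echo "$status"
```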

ftp OK  / parser OK    both the ftp and the parser processes are ok
ftp RED / parser OK    the ftp needs your attention
ftp RED / parser RED   the ftp and the parser process need intervention (or the parser only, this cannot be distinguished!)

Note: this monitor process is set up such that both boxes will eventually turn red if no update occurs ("page is frozen").

Output

The LAST_FILES_<instr>.html files are transferred to the QC web site, to /qc/<instrument>/reports/ (as configured). There they are linked to the Health Check trending reports.

How to use

Type qc1Parser -h for on-line help, and qc1Parser -v for the version number.

You can call the tool anytime from the command line:

qc1Parser parses the OPSLOG files for the 3 latest dates, updates the par_<...>.dat tables, deletes outdated entries, and exits (incremental mode).

qc1Parser -f [full] parses all OPSLOG files in the configured time range, and creates the result tables from scratch.

qc1Parser -j as above, plus creates $DFO_JOB_DIR/opslog_execHC in the end (mostly for use within autoDaily).

qc1Parser -n enforces that dataclient is not called (this is reasonable only within other tools); the HDR_DOWN configuration is overridden.

qc1Parser is normally called by the workflow tool autoDaily. This is the recommended way.

Configuration file

The tool configuration file (config.qc1Parser) defines:

Section 1: general parameters
N_PARSE      8               number of most recent nights of OPSLOG files to be parsed
HDR_DOWN     YES|NO          YES: download header files (not necessary if already provided by cron job); optional, default: YES
MAIL_NOTIF   YES|NO          notify $OP_ADDRESS if a new trendPlotter job has started
POST_PLUGIN  pgi_qc1Parser   optional plugin, expected under $DFO_BIN_DIR, executed just before scp and exit
Section 2: Definition of report types
Each report type defined here will have a corresponding trendPlotter report type.
NAME        BIAS                  name of the report
IDENTIFIER  MASTER_BIAS           string used as unique identifier (could be PRO.CATG, TPL.ID etc.)
FILTER      YES|NO                optional flag, indicating whether a filter has to be applied
DP_TYPE     BIAS; FLAT,LAMP etc.  DPR.TYPE of the corresponding raw files

Section 3: Definition of QC1 keys to be parsed
Here all keys to be extracted are defined. Two data sources exist: OPSLOG and HEADER; all keys for all reports need to be defined here. ARCFILE and MJD-OBS are always scanned in addition and need not be defined here.

REPORT_TYPE    see section 2
SOURCE         OPSLOG | HEADER (data source of the key)
FITS_KEY_NAME  QC.BIAS.MASTER.MEDIAN etc.
Section 4: Definition of filters
If the FILTER tag has been set to YES for a report, the filter needs to be defined here. It is specified by valid shell code, enclosed by &&...&& (end of line).
REPORT_TYPE BIAS &&egrep "Medusa1|Medusa2" | grep "L543.1"&&
(This example finds all entries containing the string Medusa1 or Medusa2, and L543.1.)
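Putting sections 2-4 together, a hypothetical config.qc1Parser fragment for a BIAS report could look as follows (all values are illustrative and not taken from a real instrument configuration; the exact layout of your config file may differ):

```
# Section 2: report types
NAME          BIAS
IDENTIFIER    MASTER_BIAS
FILTER        NO
DP_TYPE       BIAS

# Section 3: QC1 keys to be parsed
REPORT_TYPE   BIAS
SOURCE        OPSLOG
FITS_KEY_NAME QC.BIAS.MASTER.MEDIAN
```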

Operational hints.


Send comments to <rhanusch@eso.org>
Last update: November 4, 2013