Common DFOS tools:
dfos = Data Flow Operations System, the common tool set for DFO
new: v1.3: FILE_MODE always HDR
see also: workflow description to correct headers; OCA syntax
This tool analyses the headers of raw files for unusual entries. The purpose is to automatically detect anomalies in the meta-data of the data stream (FITS header), find files which go unclassified with the current OCA configuration, and filter data with unwanted properties.
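Three different kinds of file attributes can be detected by the tool: UNCLASSIFIED (raw files that are not classified by the current OCA configuration), EMPTY (files with FITS keys that are empty but should not be), and FILTER (files matching user-configured properties).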
The tool uses the DFS tools ABbuilder for classification, and fitsreport for reporting. It can be run in three different modes:
SCAN. In this mode, all headers for the specified date are scanned. The tool creates ABs for them, using the standard OCA classification configuration for the daily workflow. However, the association and organisation configuration files are replaced by dummies, so that the created ABs cannot be used operationally (they are stored in $DFO_AB_DIR/FILTER only temporarily and are deleted after runtime). The RAWFILE section of all created ABs is then scanned, to obtain the set of classified raw files. This is checked against the list of all raw files. The difference is the set of UNCLASSIFIED raw files. This set, if not empty, is displayed and fitsreported. No specific configuration is required for this first step.
Then, the header data are analyzed for empty FITS keys which should not be empty. The user defines in a fitsreport configuration file which keys should not be empty. Then fitsreport is executed against the input files. All files returning an empty key are reported as EMPTY files.
These actions are performed on headers, meaning they can normally be executed as soon as new headers have been downloaded. Any anomaly found can then be investigated, reported to Paranal, or to dbcm. This would give an opportunity to fix the database entries so that the fits files, to be downloaded later, can already be fixed.
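The UNCLASSIFIED check in SCAN mode boils down to a set difference between the list of all raw files and the list of files found in the RAWFILE sections of the dummy ABs. A minimal sketch of that comparison, assuming both lists are already available as text files with one file name per line (the function name `unclassified` and the sample file names are made up for illustration and are not part of filterRaw):

```shell
# Sketch only: list files and file names are hypothetical.

# unclassified ALL_LIST CLASSIFIED_LIST
# Prints the names present in ALL_LIST but absent from CLASSIFIED_LIST.
unclassified() {
    all_sorted=$(mktemp)
    cls_sorted=$(mktemp)
    sort -u "$1" > "$all_sorted"
    sort -u "$2" > "$cls_sorted"
    # comm -23 keeps only the lines unique to the first (sorted) input
    comm -23 "$all_sorted" "$cls_sorted"
    rm -f "$all_sorted" "$cls_sorted"
}

workdir=$(mktemp -d)
printf '%s\n' RAW.001.hdr RAW.002.hdr RAW.003.hdr > "$workdir/all.txt"
printf '%s\n' RAW.003.hdr RAW.001.hdr > "$workdir/classified.txt"

unclassified "$workdir/all.txt" "$workdir/classified.txt"   # prints RAW.002.hdr
```

An empty result corresponds to the case where every raw file is classified and nothing needs to be reported.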
FILTER. In this mode the tool scans the headers of the files. It is intended to detect files with certain properties and remove them from the standard processing workflow. The ${FILE_MODE} parameter is read from config.createAB; depending on its value, the headers (if FILE_MODE=HDR) or the headers and fits files (if FILE_MODE=FITS) are hidden in ${HIDE_DIR}/$DATE. The filter mode can handle two kinds of properties: RAW_MINNUM thresholds, and the OCA rules configured in ${RLS_CONFIG}.
RAW_MINNUM (minimum number of raw files) is an optional classification criterion which is, strictly speaking, not OCA-compatible: it is not based on FITS key content, and cannot be expressed in OCA syntax. Nevertheless it is operationally important. It is therefore configured in the standard ${DFO_INSTRUMENT}_organisation.h as a line of the form
//RAW_MINNUM <raw_type> <value>
This line can be put anywhere in the configuration file. For good reasons it is usually placed in the same <raw_type> section as the OCA rules.
If found, the <value> is checked against the actual number of raw files in the (dummy) AB. If that number is lower than the threshold, the headers, or headers and raw files, are hidden. Hence, when createAB later creates the real ABs, these files are no longer visible and do not disturb the workflow (no ABs, no VCALs).
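The RAW_MINNUM check can be pictured as a lookup of the threshold followed by a numeric comparison. The following is only a sketch under stated assumptions: the raw type names (FLAT, WAVE, BIAS), the sample file content and the function names `minnum_for` and `below_minnum` are hypothetical, and the real tool derives the number of raw files from the dummy ABs rather than from an argument.

```shell
# Sketch only: raw type names and the sample file are hypothetical.

# minnum_for ORG_FILE RAW_TYPE
# Prints the threshold configured as "//RAW_MINNUM <raw_type> <value>", if any.
minnum_for() {
    awk -v t="$2" '$1 == "//RAW_MINNUM" && $2 == t { print $3; exit }' "$1"
}

# below_minnum ORG_FILE RAW_TYPE N_RAW
# Succeeds (exit status 0) if a threshold exists and N_RAW is below it.
below_minnum() {
    min=$(minnum_for "$1" "$2")
    [ -n "$min" ] && [ "$3" -lt "$min" ]
}

org=$(mktemp)
cat > "$org" <<'EOF'
// stand-in for the real organisation rules
//RAW_MINNUM FLAT 3
//RAW_MINNUM WAVE 1
EOF

for entry in FLAT:1 FLAT:3 BIAS:0; do
    rtype=${entry%%:*}
    n=${entry##*:}
    if below_minnum "$org" "$rtype" "$n"; then
        echo "$rtype ($n raw files): hide"
    else
        echo "$rtype ($n raw files): keep"
    fi
done
rm -f "$org"
```

A raw type without a RAW_MINNUM line (BIAS in this sample) is never hidden by this check, which mirrors the optional nature of the criterion.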
${RLS_CONFIG} is configured in the tool configuration (see below). It contains, in OCA syntax (to be ready for ABbuilder execution, no macros, no gcc pre-compilation!), a set of OCA 'if' statements, like
if ( DPR.CATG=="CALIB" and DPR.TYPE=="LAMP,FLAT" and TPL.NEXP==1)
or
if (rule set #2) or if (rule set #3) etc.
and some dummy entries to make this an OCA-valid statement. The user may list arbitrary rules (based on FITS header keys), connected by 'or'. All files matching these rules are either listed (the default), or listed and hidden (option -H). 'Listed' means a short fitsreport is created and the data are linked to a user-configured $HIDE_DIR/<date>. 'Hidden' means they are moved to $HIDE_DIR/<date>, which removes them from the daily workflow. These data are finally deleted by finishNight. Note that as of version 1.1, both fits and header files are moved to $HIDE_DIR/<date>, to provide consistent handling for both the fits and hdr file_mode of createAB. If the filterRaw call is embedded in createAB (the normal case), that tool moves the headers back into $DFO_HDR_DIR/$DATE at the very end, so the content of the header directory is always complete and consistent.
If FILTER_SWITCH is configured as "INV_FILTER", the rules are used in the reverse sense: the configured rules select the files which are *not* hidden, and all other files are hidden. This makes sense when, e.g. for spectroscopic modes, the number of standard setups is small and well-defined while the total number of possible setups is too high to configure.
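The difference between the two filter senses can be illustrated with plain file lists. This is a sketch only, assuming the list of all files and the list of files matching the configured rules are available as text files with one name per line; the function name `files_to_hide` and the file names are invented for the example, and the real tool obtains the matching files from ABbuilder.

```shell
# Sketch only: list files and file names are hypothetical.

# files_to_hide SWITCH ALL_LIST MATCHED_LIST
#   ALL_LIST     : every file of the night, one name per line
#   MATCHED_LIST : files matching the configured rules, one name per line
files_to_hide() {
    case "$1" in
        FILTER)     grep -Fx  -f "$3" "$2" || true ;;  # hide the matching files
        INV_FILTER) grep -Fxv -f "$3" "$2" || true ;;  # hide all non-matching files
        *) echo "unknown FILTER_SWITCH value: $1" >&2; return 1 ;;
    esac
}

lists=$(mktemp -d)
printf '%s\n' SETUP_A.fits SETUP_B.fits SETUP_C.fits > "$lists/all.txt"
printf '%s\n' SETUP_B.fits > "$lists/matched.txt"

echo "FILTER hides:"
files_to_hide FILTER "$lists/all.txt" "$lists/matched.txt"
echo "INV_FILTER hides:"
files_to_hide INV_FILTER "$lists/all.txt" "$lists/matched.txt"
rm -rf "$lists"
```

With FILTER only SETUP_B.fits is selected for hiding; with INV_FILTER the same rule keeps SETUP_B.fits and selects the other two files instead.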
Report. The execution report is written as "filter_${DFO_INSTRUMENT}_<date>.txt" into $DFO_LST_DIR.
Hiding. If files have been hidden, a file SOME_FILES_HAVE_BEEN_HIDDEN is written into the original directory, to indicate that the directory content is no longer complete.
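The distinction between 'listed' (linked) and 'hidden' (moved, with a marker file left behind) can be sketched as follows. This is an illustration under assumptions, not the tool's actual code: the function name `list_or_hide`, the temporary directory layout and the file names are hypothetical, and the fitsreport step is left out.

```shell
# Sketch only: the directory layout and file names are made up for illustration.

# list_or_hide ACTION SRC_DIR HIDE_ROOT DATE FILE...
#   ACTION=list : link the files into HIDE_ROOT/DATE (they stay in SRC_DIR)
#   ACTION=hide : move the files to HIDE_ROOT/DATE and flag SRC_DIR as incomplete
list_or_hide() {
    action=$1
    src=$2
    dest=$3/$4
    shift 4
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        case "$action" in
            list) ln -sf "$src/$f" "$dest/$f" ;;
            hide) mv "$src/$f" "$dest/$f" &&
                  : > "$src/SOME_FILES_HAVE_BEEN_HIDDEN" ;;
            *)    echo "unknown action: $action" >&2; return 1 ;;
        esac
    done
}

root=$(mktemp -d)
mkdir -p "$root/raw/2025-12-30"
touch "$root/raw/2025-12-30/A.fits" "$root/raw/2025-12-30/B.fits"

list_or_hide list "$root/raw/2025-12-30" "$root/HIDE" 2025-12-30 A.fits
list_or_hide hide "$root/raw/2025-12-30" "$root/HIDE" 2025-12-30 B.fits
ls "$root/raw/2025-12-30"
rm -rf "$root"
```

After the two calls, A.fits is still in the source directory (and linked from the hide directory), while B.fits has been moved and the marker file has been created next to A.fits.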
Workflow for fixing header problems. One of the main purposes of the tool is to detect header problems. If you know the fix (the correct key values), it is extremely useful to get this correction into the header database (SAFIQ). The workflow is:
detect header problem | suggest correction to dbcm@eso.org | wait for confirmation email | update header |
filterRaw | hideFrame | ngasClient -H <hdr> |
How to use. Type filterRaw -h | -v for on-line help and version;
filterRaw -d 2025-12-30 -m SCAN
to scan the header data for date 2025-12-30, and
filterRaw -d 2025-12-30 -m FILTER -H
to filter the fits files and hide them if any matching files are found.
Below is an overview of the different possible modes.
mode | data source | properties checked | action | place in workflow |
SCAN | hdr files in $DFO_HDR_DIR/<date> | UNCLASSIFIED | list; email if configured | after header download, together with createReport and checkConsist |
SCAN | same | EMPTY | list; email if configured | same |
FILTER | fits files in $DFO_RAW_DIR/<date> | FILTER | list, hide if configured; email if configured | after fits file download and checkDownload, as part of createAB |
ALL | SCAN plus FILTER |
Installation
Use dfosExplorer, or type dfosInstall -t filterRaw .
The tool requires the DFS tools ABbuilder and fitsreport.
All configuration goes to $DFO_CONFIG_DIR/OCA: config.filterRaw (tool configuration), filter_$DFO_INSTRUMENT.RLS (the OCA configuration), and the fitsreport cfg file(s) (up to three are possible).
config.filterRaw
The file config.filterRaw is the tool configuration:
Section 1 |
TOOL_MODE | INTER or AUTO | INTER: interactive; AUTO: automatic mode |
HIDE_DIR | /data03/data/HIDE | root for directories with hidden files |
EMAIL_NOTIF | YES or NO | YES: reports are sent to $OP_ADDRESS |
FILTER_SWITCH | FILTER or INV_FILTER | FILTER: configured filtering rules are used to select and hide files; INV_FILTER: inverse behaviour (configured rules are used to find files *not* to be hidden, all others are hidden) |
Section 2: OCA configuration (for FILTERing only) |
RLS_CONFIG | filter_<ins>.RLS | name of the OCA config file for filtering, in compiled OCA syntax |
Section 3 |
FITS_REPORT_UNCLASS | e.g. fitsreport_unclassified.cfg | names of fitsreport config files for the three reports (could be the same) |
FITS_REPORT_EMPTY | e.g. fitsreport_empty.cfg | same |
FITS_REPORT_FILTER | e.g. fitsreport_filter.cfg | same |
These files have the structure of fitsreport config files.
filter_<ins>.RLS
See the example being delivered with the package.
Option SCAN:
1. Check if data have already been hidden; warning message if yes
2. scan header files for UNCLASSIFIED files:
2.1 call ABbuilder with stripped-off standard configuration to classify files, create dummy ABs
2.2 extract RAWFILEs from created ABs
2.3 check against list of all raw files, difference is list of UNCLASSIFIED files
2.4 run fitsreport on these files (configured under FITS_REPORT_UNCLASS) to display their properties
3. scan header files for files with empty keys (EMPTY)
3.1 use fitsreport to check for empty keys (configured under FITS_REPORT_EMPTY)
3.2 report the ones found
4. Exit
Option FILTER:
1. scan headers to find RAW_MINNUM violations:
1.1 call ABbuilder with stripped-off standard configuration to classify files, create dummy ABs
1.2 check ABs against configured RAW_MINNUM values
1.3 extract RAWFILE names from ABs which violate the threshold, move these files to configured $HIDE_DIR/<date>
2. filter headers, or headers and fits files, with configured properties (FILTER)
2.1 call ABbuilder with OCA configuration filter_<ins>.RLS to classify
2.2 extract the classified files from the ABs
2.3 if option -H has been used, move these files to configured $HIDE_DIR/<date>, and the corresponding headers as well
2.4 create a report for filtered files
or (INV_FILTER)
2.1 call ABbuilder with OCA configuration filter_<ins>.RLS to classify
2.2 extract the classified files from the ABs
2.3 move all other files to configured $HIDE_DIR/<date>, and the corresponding headers as well
2.4 create a report for filtered files
3. write report $DFO_LST_DIR/filt_${DFO_INSTRUMENT}_<date>.txt
How to embed in regular workflow
Call 'filterRaw -d <date> -m SCAN' as soon as headers are available.
Set FILTER_RAW in config.createAB to YES; this will call 'filterRaw -d <date> -m FILTER -H' just before operational ABs are created.
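For the SCAN call, a small wrapper could check that headers have actually arrived before invoking the tool. The sketch below is an assumption-laden illustration: the function names `headers_available` and `scan_if_ready` are hypothetical, header files are assumed to carry the extension .hdr and to sit in <header root>/<date>, and the command is only echoed when filterRaw is not found on the PATH.

```shell
# Sketch only: function names and the .hdr extension are assumptions.

# headers_available HDR_ROOT DATE
# Succeeds if at least one *.hdr file exists in HDR_ROOT/DATE.
headers_available() {
    for f in "$1/$2"/*.hdr; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# scan_if_ready HDR_ROOT DATE
# Runs the SCAN mode for DATE once headers are present.
scan_if_ready() {
    if ! headers_available "$1" "$2"; then
        echo "no headers yet for $2"
        return 0
    fi
    if command -v filterRaw >/dev/null 2>&1; then
        filterRaw -d "$2" -m SCAN
    else
        echo "would run: filterRaw -d $2 -m SCAN"
    fi
}

# On an operational account the header root would be $DFO_HDR_DIR;
# a temporary directory stands in for it here.
hdr_root=$(mktemp -d)
mkdir -p "$hdr_root/2025-12-30"
scan_if_ready "$hdr_root" 2025-12-30
rm -rf "$hdr_root"
```

The FILTER call needs no such wrapper, since createAB issues it itself when FILTER_RAW is set to YES.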
Last update: April 26, 2021 by rhanusch |