Common Trending and QC tools:
tqs = Trending and Quality Control System
new: v4.1: NOTE: any reference to the THRESHOLD methods PER and OFF has been removed; these methods are discouraged, will be decommissioned, and shall not be used on the HC monitor.
see also: trendPlotter configuration (standard; non-standard); pet.py: the python plot engine for trendPlotter; matplotlib documentation here
v4.0:
topics: Reports and types | data sources | hierarchies | output | closeup plots | statistics | downloads | web transfer | HISTORY | comments | usage | configuration | FAQ | hints and pitfalls |
enabled for parallel execution
enabled for condor execution
trendPlotter is the standard QC tool for generating trending plots. It is designed to provide, in a standard way, all plots required for the Health Check monitor, one of the flagships of QC operations. Supported features are:
The tool supports classical trending plots (with mjd-obs as abscissa), as well as correlation plots (arbitrary numerical X and Y axes, parametrized by time). Trending plots are supported for a pre-defined epoch (either current or history), for the FULL available time, or for a (short) monitoring period without history.
The tool is designed for parallel execution.
Comparison to qc1Plotter. The other common tool for generating trending plots is the qc1Plotter, the web-based interactive plotting tool which creates simple, dynamic trending reports. It is designed to create single X-Y plots, with basic statistics. It is a great tool for quick-look analysis. The trendPlotter provides much more functionality and can generate complex plots, optimized for information display.
Like the qc1Plotter graphs, it can be configured to contain statistical information, to mark outliers etc. The trendPlotter creates a report, combining multiple graphs, a tutorial text, and statistical results in a web page. The result page also offers direct links for data downloads, using the qc1Browser. The product plot is embedded in an image map which can be used to view closeup versions of each individual plot. All product graphics come in png format which offers good display quality at reasonable size.
trendPlotter uses the python package matplotlib (documentation here) to generate the report plots, and the Highcharts library for interactive close-up plots. The python procedure pet.py is called by the tool to create the plots, both for the full report and in closeup mode. The user does not need to be fluent in python (although some level of familiarity is useful for the configuration).
Each report has its own configuration. The structure of the report configuration file is detailed here.
Interactive close-up plots. With version 4.0, static close-up plots have been replaced by interactive close-up plots that use the Highcharts JavaScript library. The necessary JavaScript code is created by pet.py. The interactive plot offers the following features:
Reports. A report comes as a single HTML page, with a single embedded graph (a trending report is a "single sheet of paper"). Specified by -r <report_name>, each report has its unique name (per instrument). All information about its content is coded in the configuration file config.tp_<report_name>, to be provided under $DFO_CONFIG_DIR/trendPlotter (!).
Format. The report is a single plot over which an HTML image map is overlaid. Each plot has an associated active area on that map which reacts to two events:
Moving your mouse over any active area displays a formatted "tooltip" that contains plot information like statistics (average, thresholds, number of points), QC1 parameter name and table, and also includes the corresponding scores. Clicking on the active area will display a closeup version of the plot.
Types. The reports can have the following types:
Data sources. trendPlotter can use the following data sources:
The QC1DB data source is the standard.
Report organisation I: a report contains n trending plots (here: plot1-4). One plot contains m data sets. Data sets can be fed from QC1DB table records or from LOCAL data files. See an example here.
A report consists of one or several trending plots. A plot displays one or several data sets. A data set consists of:
The latter case applies if you want to plot a correlation (X vs. Y) where the time parameter does not explicitly display but is still required to match the data by time.
A data set must be contained in a single table; you cannot combine, e.g., column X from table 1 and column Y from table 2. A LOCAL table, like all other tables, must always contain mjd-obs.
Report groups.
[Figure: group navigation bar] Report organisation II: Reports can be grouped into report groups. See an example here.
Reports can be grouped into a report group. This is highly recommended. If, for example, a flat file report exists for each of three standard filters, these three reports should be grouped together. The output HTML file then has an additional horizontal navigation bar for easy navigation between the members of the group:
same group: | bias | dark | linearity | fp-noise | contamination |
trendPlotter marks reports configured as REPORT_FREQ = DAILY with a little orange ball to highlight them as particularly important.
Output. The tool output is an image file embedded in an HTML page with tabular result information. The image file comes in png format. Each plot has a closeup version which is linked to the main page through the HTML image map mechanism. When moving the mouse over a plot, tooltips are displayed for that plot (for plots created by versions 2.5 and higher). The tooltips contain an extract of the result table and a slightly reduced version of the closeup plot.
The format of the HTML page slightly differs for the two report types:
[Figure panels: HEALTH | HISTORY]
The HEALTH-type report HTML page has a headline with the following information:
The NEWS section has three parts:
The DATE table is the same as on the calChecker interface. It has, for the last 7 days, the following links:
In the same line, behind the DATE table, there are the two status flags for data transfer and for NGAS access. Finally there is the button to launch the php script for the forced, external refresh of the HC monitor, which launches an autoDaily session. More about the process behind this 'forced refresh' button here. Find more information about php scripts under "other".
The HISTORY-type plots have a HISTORY navigation bar designed for easy navigation (see below).
The vertical navigation bar is created by the TQS tool webNavBar. More ...
The QUICK-LOOK version of the Health Check plot is produced by the (plugged-in) tool scoreHC. All QUICK-LOOK pages are linked to each other with the usual group and vertical navigation bars: you can navigate within the QUICK-LOOK domain just like you can within the HEALTH domain. Toggling between QUICK-LOOK and HEALTH is possible with the "scores&comments..." and "HC plots..." links.
Closeup plots (new with v4). Clicking on any of the plots brings you to the dynamic closeup version (click here for an example). You can interact with the data by zooming in (drag with the left mouse button), displaying data values, and enabling or disabling data sets (click the data set descriptor at right). Printing and downloading as PNG and PDF is possible, as well as downloading the data values. These functionalities are provided by the Highcharts JavaScript library.
Statistics. If configured, a plotted data set is statistically analyzed. Certain average and error options are available. The numbers are collected in a table as part of the HTML output page. This table also accommodates the caption, links, comments etc.
Here is an example of the results table:
The statistics also display as tool tips if you mouse over the respective plot. This works only for the first QC1DB parameter set per plot.
Download links. The output HTML page offers links to download the data displayed in the reports (only if the configured data source is QC1DB). There are three links which differ only in time range (this: the same time range as the plots; last_yr: the last 365 days of data; all: the full time range). For a trending plot, mjd-obs, civil_date and the Y parameter are queried by default; other columns can be configured to be added. A correlation plot downloads mjd-obs, civil_date, X and Y, plus other columns as configured.
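The three time ranges can be sketched as follows. This is only an illustration, not trendPlotter code: the function name download_ranges, its arguments, and the choice to count the "last 365 days" back from the latest available data point are assumptions.

```python
def download_ranges(plot_start, plot_end, first_mjd, last_mjd):
    """Return the MJD interval behind each of the three download links.

    plot_start, plot_end: time range shown in the report plots (MJD)
    first_mjd, last_mjd:  earliest and latest mjd-obs of the data set (MJD)
    All names are hypothetical; only the three link labels come from the text.
    """
    return {
        "this": (plot_start, plot_end),            # same range as the plots
        "last_yr": (last_mjd - 365.0, last_mjd),   # last 365 days (assumed: counted from last_mjd)
        "all": (first_mjd, last_mjd),              # full time range
    }
```

For example, download_ranges(59000.0, 59090.0, 55000.0, 59090.0)["last_yr"] gives the interval starting 365 days before the latest data point.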
By default, all png and html output remains under $DFO_TREND_DIR/reports until the next run of trendPlotter. This may be useful for report development.
Calling trendPlotter with option -f will transfer all products (HTML page, png main and closeups) to the QC web server, into
The path is configured under QC_DIR in the main configuration file.
The directory structure on the QC web server is maintained by trendPlotter. The tool does a direct secure copy (scp) to the web server (w1/w2 alias www.eso.org). It calls a script 'qcDircheck' on w1/w2 which checks for the existence and creation of $QC_DIR.
HISTORY plots. This report type has some special features for maintenance. The task to create (or recreate) all HISTORY plots can become challenging when the whole data set is already several years deep. Also, you do not want to create all HISTORY plots from scratch if a new report needs to be added to the report group, or if a new period has started. Therefore, some components of the output page are dynamically added by the browser from a central source through the "virtual include" mechanism:
These files can be modified, and the modifications are then updated in all related HTML pages immediately. Virtual includes are logically like a symbolic link being evaluated by the browser. Their content is included in the HTML page just like any other valid code, although it is represented in the original page as a line starting with "<!-- #virtual include " (therefore 'virtual').
Navigation bar. The HISTORY-type plots have a HISTORY navigation bar designed for easy navigation:
The HISTORY plots open in a separate window. Each available history report is linked with a graphical symbol. Reports which are older or younger than the currently selected one by more than a year are not linked one by one but only once per year (to avoid over-crowding). The most comfortable way for history navigation is using the red arrows.
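The thinning of the navigation bar can be sketched as below. This is a guess at the rule, not the tool's implementation: reports are represented as hypothetical (year, number) pairs, and "more than a year away" is approximated by a calendar-year difference greater than one.

```python
def history_links(reports, current):
    """Select which HISTORY reports get a link in the navigation bar.

    reports: iterable of (year, number) pairs (hypothetical representation)
    current: the (year, number) pair of the currently selected report
    Reports close to the current one are all linked; for more distant
    years only the first report of each year is linked.
    """
    current_year = current[0]
    links = []
    distant_years_seen = set()
    for year, number in sorted(reports):
        if abs(year - current_year) <= 1:
            links.append((year, number))          # nearby: link one by one
        elif year not in distant_years_seen:
            distant_years_seen.add(year)          # distant: one link per year
            links.append((year, number))
    return links
```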
Check out here how to create all HISTORY plots for a report.
Full range. This is another possible value for the RANGE parameter (instead of the usual 60/90 etc. days). This value is used for reports which display all historical data points, starting with START_DATE. A FULL report makes sense when data are taken only every month or even more rarely, or when one wants to display the total evolution over time in one single report. If configured, a FULL report may display as additional link in the HISTORY navigation bar:
More ...
MONITOR range. Finally there is a special range called MONITOR which can be configured as 'MON_<n>' where n is between 1 and 20 (days). A MONITOR report has the special feature that its timescale covers the last <n> days, as a closeup to see the trend for the last few days. (Actually, MON_1 covers less than 1 day: 18 hours.) Find more info here, and the PWV monitor as an example. A MONITOR report can be called in mode HEALTH only. MONITOR reports are reserved for special cases.
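The MON_<n> convention can be illustrated with a small parser. It is a sketch under the rules stated above (n between 1 and 20, MON_1 covering 18 hours); the function name monitor_hours is hypothetical and not part of trendPlotter.

```python
import re


def monitor_hours(range_value):
    """Return the time span, in hours, covered by a 'MON_<n>' range value."""
    match = re.fullmatch(r"MON_(\d+)", range_value)
    if match is None:
        raise ValueError(f"not a MONITOR range: {range_value!r}")
    n = int(match.group(1))
    if not 1 <= n <= 20:
        raise ValueError("n must be between 1 and 20 (days)")
    # MON_1 is documented as covering 18 hours rather than a full day
    return 18 if n == 1 else 24 * n
```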
You can create, update and delete comments related to a HC report. For the QC scientist, there are various ways to interact with the comment editor:
The Paranal daytime or nighttime astronomer can access the same buttons on the HC report or the score overview page.
The comments are managed by a php script provided by scoreHC. Find more information about the php script under "Other".
The comment interface has a mandatory DATE field to be filled out. This field is used by 'trendPlotter' to check if the comment is still valid. After 14 days, the comment is considered outdated, and the tool starts to send an email to the QC scientist with this information. It is up to the QC scientist to decide what to do: most likely delete, or update. The tool does not delete outdated comments by itself. Report comments with historical interest should go to the "HISTORY" part of the tutorial pages. General comments should go to documentation.
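The validity check on the DATE field amounts to a simple age test. The sketch below is illustrative only: the function name is hypothetical, and it assumes that a comment counts as outdated once it is strictly more than 14 days old.

```python
from datetime import date, timedelta

MAX_COMMENT_AGE = timedelta(days=14)  # validity period stated in the text


def comment_is_outdated(comment_date, today):
    """True if the comment's DATE field is more than 14 days before 'today'."""
    return today - comment_date > MAX_COMMENT_AGE
```

An outdated comment only triggers a notification; deleting or updating it remains a manual decision.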
Type trendPlotter -h for on-line help, trendPlotter -v for the version number.
Operational options:
tool call | function called | products |
trendPlotter -r <report_type> [-t HEALTH] | create report <report_type> in HEALTH mode (which is the default); the report starts with the latest available date ("current") and goes back by the configured time range | <report>_HC.html |
trendPlotter -r <report_type> -t HEALTH -f | as before, with transfer of all products to the web server | as before, transferred to the web server |
trendPlotter -r <report_type> -t HISTORY [-s 2015-10] [-f] | create report <report_type> in HISTORY mode; the start time is specified by -s <YYYY-MM>; if unspecified, it is calculated to be the latest possible one; optional flag -f for transfer to the web server | <report>_<yyyy>_<no>.html |
trendPlotter -r <report_type> -N [-f] | create all HISTORY navigation bars; optional flag -f for transfer to the web server | navbar_<report>_<yyyy>_<no>.html |
trendPlotter -r <report_type> -A | create jobs for creating the complete HISTORY | executable job file $TMP_DIR/list_history |
trendPlotter -T | create all jobs for complete HISTORY | executable job file $TMP_DIR/list_history |
The following combination of parameters can be called:
-t | -r <report> | -s <start> | -f | -N | -A |
HEALTH | x! | | x | | |
HISTORY | x! | x | x | x | x |
x!: mandatory
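The documented HEALTH and HISTORY calls can be composed programmatically, for instance when writing job files. The helper below is a sketch, not part of the tool: build_call and its arguments are hypothetical, the report name in the usage note is invented, and the restriction of -s to HISTORY mode is read off the tool-call table above.

```python
def build_call(report, mode="HEALTH", start=None, transfer=False):
    """Compose a trendPlotter argument list for a HEALTH or HISTORY report.

    report:   report name passed to -r (mandatory)
    mode:     "HEALTH" (default) or "HISTORY", passed to -t
    start:    optional start month "YYYY-MM" for -s (HISTORY only)
    transfer: add -f to transfer the products to the web server
    """
    if not report:
        raise ValueError("-r <report> is mandatory")
    if mode not in ("HEALTH", "HISTORY"):
        raise ValueError("mode must be HEALTH or HISTORY")
    if start is not None and mode != "HISTORY":
        raise ValueError("-s <YYYY-MM> is documented for HISTORY mode only")
    args = ["trendPlotter", "-r", report, "-t", mode]
    if start is not None:
        args += ["-s", start]
    if transfer:
        args.append("-f")
    return args
```

With an invented report name BIAS, build_call("BIAS", "HISTORY", "2015-10", transfer=True) yields the argument list for trendPlotter -r BIAS -t HISTORY -s 2015-10 -f.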
For certain technical reports (but not for the HC monitor reports!) there
is the option
trendPlotter -c <my_special_config> ...
See here for more.
How to operate
Operationally you should run trendPlotter jobs from job files $DFO_JOB_DIR/JOBS_HEALTH and $DFO_JOB_DIR/JOBS_TREND.
You can run several trendPlotter instances at the same time. Each instance has its own $JTMP_DIR created as $TMP_DIR/$$ and deleted after execution. If called with option -f, the result files are scp'ed from there to the web server. Without option -f, they are moved to $DFO_TREND_DIR/reports.
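The reason parallel instances do not collide is the per-process scratch directory. The same pattern can be sketched in python as follows; this mirrors the $TMP_DIR/$$ idea only and is not trendPlotter code (the name instance_tmp_dir is hypothetical).

```python
import os
import shutil
from contextlib import contextmanager


@contextmanager
def instance_tmp_dir(tmp_dir):
    """Create <tmp_dir>/<pid> for this process and delete it afterwards.

    The process ID plays the role of $$ in the shell, so two instances
    running at the same time never share a working directory.
    """
    path = os.path.join(tmp_dir, str(os.getpid()))
    os.makedirs(path)
    try:
        yield path
    finally:
        # remove the scratch directory even if the job failed
        shutil.rmtree(path, ignore_errors=True)
```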
For execution of a trendPlotter queue by condor you have to specify the option -u <user>, e.g. -u vircam, in order to make condor find the home directory and source the environment properly.
trendPlotter has, due to its configuration complexity, its own configuration directory: $DFO_CONFIG_DIR/trendPlotter. The following configuration files exist:
 | name | required/optional | comment |
trendPlotter | config.trendPlotter | required | tool configuration |
 | <any> | optional | non-standard tool configuration, for technical reports |
per report: | config.tp_<report_name> | required | report configuration |
 | <report_name>.txt | optional | text file with some description of the report, to show up on the HTML result page |
 | <report_name>.inf | optional | information file to be included in the report HTML page, to contain links and information useful for this report |
 | <report_name>.msg | required | local copy of the news file on the web server; displayed in HEALTH reports only, it is managed by the hcComments.php script (maintained by scoreHC). |
 | <report_name>.fdf | required if format is CUSTOM | format definition file, used for non-standard, user-defined report formats |
 | delta.tp_<report_name> | optional | "delta" configuration file, defines configuration which differs from the standard configuration for a certain period in time; evaluated for HISTORY reports only |
per report group: | <group_name>.grp | optional | report group file for the definition of a report group |
The typical report REPORT will require one configuration file, config.tp_<REPORT>, and the two text files <REPORT>.txt and <REPORT>.inf. For non-standard formats, the format definition file <REPORT>.fdf is required. For HISTORY reports, you may also need a delta configuration file. Finally, a report group is defined in a group definition file.
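As a checklist, the files named in this paragraph can be listed per report. The helper is a hypothetical sketch that follows the paragraph above; it does not include the <report_name>.msg news file, which is managed by the hcComments.php script.

```python
def report_config_files(report, custom_format=False, history_delta=False, group=None):
    """List the file names expected under $DFO_CONFIG_DIR/trendPlotter.

    report:        report name
    custom_format: True if the report uses a CUSTOM format (.fdf needed)
    history_delta: True if a delta configuration is used for HISTORY reports
    group:         report group name, if the report belongs to a group
    """
    files = [f"config.tp_{report}", f"{report}.txt", f"{report}.inf"]
    if custom_format:
        files.append(f"{report}.fdf")          # format definition file
    if history_delta:
        files.append(f"delta.tp_{report}")     # period-specific differences
    if group is not None:
        files.append(f"{group}.grp")           # report group definition
    return files
```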
Report files. There exist several optional text files which can be added to the report:
The tool configuration file is described below. The report configuration is described separately.
The tool reads its generic configuration file (config.trendPlotter), plus one specific for the report. All configuration files for trendPlotter are collected under $DFO_CONFIG_DIR/trendPlotter.
1. config.trendPlotter
Section 1: general parameters |
BROWSER_DISPLAY | YES|NO | display the new plot in browser |
CGI_URL | http://archive.eso.org/bin/qc1_cgi | URL to qc1 cgi script |
QC_URL | http://www.eso.org/observing/dfo/quality/GIRAFFE/reports | URL for HTML reports |
QC_DIR | /home/qc/qc/GIRAFFE/reports | corresponding path on $WEB_SERVER |
HELP_URL | http://www.eso.org/qc/ALL/HC_help.html | URL for help page (for HC plots) |
qc_giraffe@eso.org | your QC ("functional") e-mail address. A "contact" link now appears on each Health Report plot page | |
INCLUDE_QUALITY_EVENTLOG | YES|NO | include the link to the event log in the horizontal navbar on the HC monitor pages |
QUALITY_EVENTLOG_URL | www.eso.org/eventlog/home | link to the quality event log |
1b. Non-standard tool configuration: is described here
2. config.tp_<report>: the report configuration files are described here.
Q: What is the easiest way to configure a new trendPlotter report?
A: The easiest way is to use the configuration file of one of the already operational reports as a template.
Q: What is the maximum number of plots in a report?
A: In principle it is unlimited, if you use a custom format. But there are good reasons
for limiting the template formats to a maximum of 10 plots: beyond that, fonts and labels become
unreadable and the plot is too crowded. If you need many plots in your report, split it into
a report group.
Q: I am using a correlation plot, have set MARK_LAST to YES and now
the plot is overcrowded with large pinkish symbols.
A: For correlation plots with many input data, the MARK_LAST option may indeed mark
too many data points scattered around in the diagram. Switch this option off then.
Q: Can I connect my data points by a line?
A: Yes, but use this option wisely. It makes sense for oversampled data sets only, but most if not all trending plots are undersampled, meaning there are fewer data points than necessary to cover the full time evolution of the underlying parameters.
Connecting lines should be used only when a real trend, without scatter, should be visualized, or e.g.
in a correlation plot.
Q: How can I create a set of HISTORY plots?
A: You need 3 steps for creating a completely new set of HISTORY plots.
Q: How can I maintain a set of HISTORY plots?
A: The following maintenance is needed:
Q: The current Y range in my plots is inappropriate for data earlier
than 2011. How can I change the configured Y range for those HISTORY plots?
A: Use the delta_config mechanism: define in a delta file those lines containing the differing values, and add a final column specifying the applicable time range. More ...
Q: Why is there a need to define a CONDITION in (up to) four different
syntaxes?
A: Sorry for that ... A condition is applied to the data set plotted and to the data set offered
for download.
These data sets are retrieved from different sources: plot data from the QC1 database, requiring
SQL syntax and QC1 db names; data for downloads from the same database but using the cgi tool
qc1Browser with its own syntax. Finally, for the score support, the CONDITION is also required for a joined query so that the database keys are prefixed by 'QC.'
Q: Can I add non-standard graphical artwork to a plot, like marking a special mjd, plot a square or a special label?
A: Yes, there is the FIT_RULE mechanism. This "fitting rule" was
introduced to offer simple functionality for plotting a user-defined straight line. But it
can actually be used to contain arbitrary python code (one line; commands separated by ';'),
or calls of a python command file (execfile('...')).
See here for details.
There are some examples on the web with "advanced FIT_RULEs":
Check out the config files here to see how this was configured.
Q: I need a special plot format not being offered as a template. Is this supported, and how do I proceed?
A: You have complete flexibility with the plot format. The offered templates are considered to simplify your configuration, and also to have a certain common look and feel for all trending plots. If you have a good reason, you can define your own format, using a format definition file (.fdf). More ...
Q: I need to modify the data set before plotting. What is supported by trendPlotter?
A: A simple modification can be done using a COMPUTE_RULE. E.g.,
you can multiply the data column by a factor. In general, you cannot do more sophisticated
things like combining several columns from a QC1 db table. If you need this, it is better to
define a new database column, feed it and use it directly for the trending plot. More ...
Q: The number and order of scoring boxes in my html QUICK-LOOK plot is wrong.
A: This is probably due to the fact that you are using scoreQC
with the thresholding option "HC", and your trendPlotter reports use a CUSTOM (<report_name>.fdf)
format. To fix this you will need to add a sequential list of all cells in the html table.
An example of such a modified CUSTOM format can be found here: STABILITY.fdf
Q: I have a set of trendPlotter jobs in a cronjob file, and want to call
a job from the command line. Do I need to care about avoiding conflicts?
A: No. trendPlotter is enabled to run many instances in parallel,
only limited by CPU performance.
Hints and Pitfalls (thanks to Mark for this section)
Then, in terms of scoring, there is only one data set.
In all other cases you will find that the QUICK-LOOK version will have no score results. Instead, a flag displays "multiple data sets".
If you have such plots and want to use HC thresholding you have the following options to fix this issue:
- split your reports into one data set per plot (this is strongly recommended, to avoid confusion!)
- apply your "HC" scoring to a report with one data set per plot, but do not publish this report to the web (i.e. never invoke the "-f" option when trendPlotter is run on this report). In contrast, your multiple data set per plot report will not be scored, but will be published for external web access. The advantage is that you can have both plots that can be easily scored, and plots that are visually more intuitive. The disadvantage is that you must maintain two trendPlotter configurations for a single report type that must always be
As an alternative, you could also publish both reports to the web but link only one to the navigation bars.
Last update: April 26, 2021 by rhanusch