Logging System
Logging is a general mechanism used to store any kind of status
and diagnostic information in an archive, so that it is possible to
retrieve and analyze it at a later time.
The ACS Logging System is based on the CORBA Telecom Log Service [RD14]
and on the ACS Event and Notification System architecture.
Applications can log information at run time according to
specific formats in order to record [RD01 - 6.2.1 Logging]:
The execution of actions
The status of the system
Anomalous conditions
Logging includes for example:
Device commands - reception and execution of commands from devices [RD01 - 14.1.1 Logging of commands]
Debugging - optional debugging messages, like notification of entering/leaving specific code sections
Unrecoverable programmatic errors
Alarms - change of status in alarm points
Miscellaneous log messages. Applications can log events
regarded as important to archive, for example receivers changing
frequency, antennas set to new targets etc.
Each log consists of an XML string with a timestamp [RD01 - 6.2.1
Logging], information on the object sending the log and its location
in the system, and a formatted log message. Context-specific
information is entered as (name, value) pairs. The XML format is
defined in the Log Markup Language (logMl) XML Schema.
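As an illustration of the structure just described, the following minimal Python sketch assembles one log entry as an XML string. The element and attribute names (`TimeStamp`, `SourceObject`, `Host`, `Data`, `Name`) and the function `build_log_entry` are assumptions made for this example; they are not taken from the logMl schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_log_entry(level, source, host, message, context, timestamp=None):
    """Return one log entry as an XML string.

    All element and attribute names are hypothetical, not the logMl schema.
    """
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()
    # The log level is used as the element name; sender and location are attributes.
    entry = ET.Element(level, {
        "TimeStamp": timestamp,
        "SourceObject": source,
        "Host": host,
    })
    entry.text = message
    # Context-specific information is attached as (name, value) pairs.
    for name, value in context.items():
        data = ET.SubElement(entry, "Data", {"Name": name})
        data.text = str(value)
    return ET.tostring(entry, encoding="unicode")


xml_string = build_log_entry(
    "Info", "ANTENNA_01", "control-host", "Antenna set to new target",
    {"azimuth": 120.5, "elevation": 45.0},
    timestamp="2024-01-01T00:00:00+00:00",
)
print(xml_string)
```

Passing the timestamp explicitly, as in the usage above, keeps the output reproducible; in normal use it would default to the current UTC time.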
High level logs (for example logs directed to operators) are
"type safe logs" defined in XML files, in a way analogous
to error code specifications.
The XML specification files allow context (name, value) pairs
to be defined.
Code generation is used to generate
type safe helper classes in all supported languages from the XML
specification files.
Getters and setters are generated
for all context (name, value) pairs.
Each type safe log
specification is associated in the XML specification files with
documentation providing a help description of the error and of the
recovery procedures to be taken [RD01 - 6.3.5 Configuration].
Help handling is implemented as links to XML help pages.
Low level logs can be free format and are not required to be
specified in XML type safe definitions.
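To make the idea of a type safe log more concrete, here is a hand-written Python sketch of the kind of helper class a code generator could produce from an XML specification. The class name `AntennaTargetChangedLog`, its fields and its methods are hypothetical; the output of the actual ACS generator is not shown in this document and may look quite different.

```python
import logging


class AntennaTargetChangedLog:
    """Hypothetical helper for one type safe log definition."""

    NAME = "AntennaTargetChanged"
    LEVEL = logging.INFO
    SHORT_DESCRIPTION = "Antenna set to a new target"

    def __init__(self, logger):
        self._logger = logger
        self._antenna = None
        self._azimuth = None

    # One setter/getter pair per context (name, value) pair.
    def set_antenna(self, value):
        self._antenna = str(value)
        return self

    def get_antenna(self):
        return self._antenna

    def set_azimuth(self, value):
        self._azimuth = float(value)  # conversion enforces the declared type
        return self

    def get_azimuth(self):
        return self._azimuth

    def log(self):
        """Send the log with its context pairs and return the context."""
        context = {"antenna": self._antenna, "azimuth": self._azimuth}
        self._logger.log(self.LEVEL, "%s %s", self.SHORT_DESCRIPTION, context)
        return context


logger = logging.getLogger("typesafe_demo")
entry = AntennaTargetChangedLog(logger).set_antenna("ANT1").set_azimuth("120.5")
sent = entry.log()
```

The point of the sketch is that the caller cannot misspell a context name or pass a value of the wrong type without an error being raised at the call site.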
The logging system is centralized so that eventually the logs are
archived in a central log.
Log clients can subscribe to the Log Notification Channel. The
permanent Log Archive [RD01 - 6.2.2 Persistency] (an RDBMS) is
essentially such a client and its implementation is left to the
application.
Applications log data using the API provided by ACS. This API
provides methods for logging information of the different kinds
specified above. C++, Java and Python APIs are provided. The APIs
are based on the standard logging facilities provided by the
implementation language, when available:
the C++ API is based on the ACE Log
the Java API is based on the J2SE Java Logging
the Python API uses a generic ACS CORBA Logging service
Logs can be cached locally on the machine where they are
generated by a logging proxy and transmitted to the central log on
demand or when the local buffer reaches a predefined size. High
priority logs are not cached but are transmitted to the central log
immediately. The main purpose of caching is to reduce network
traffic.
Figure 3.9: Logging System
In particular the C++ API is optimized for performance and
reduced network traffic and implements caching.
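The caching policy described above (buffer locally, flush when the buffer is full or on demand, send high priority logs at once) can be sketched with the `MemoryHandler` class from Python's standard `logging.handlers` module. This is only an illustration of the policy, not the ACS logging proxy; `CentralLogStub` is a hypothetical stand-in for the central log, and the capacity of 3 is arbitrary.

```python
import logging
import logging.handlers


class CentralLogStub(logging.Handler):
    """Hypothetical stand-in for the central log: stores what it receives."""

    def __init__(self):
        super().__init__()
        self.received = []

    def emit(self, record):
        self.received.append(record.getMessage())


central = CentralLogStub()
# Buffer up to 3 records; any record at ERROR or above forces an immediate flush.
cache = logging.handlers.MemoryHandler(
    capacity=3, flushLevel=logging.ERROR, target=central
)

logger = logging.getLogger("cache_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(cache)

logger.info("first")
logger.info("second")
held_back = list(central.received)    # nothing transmitted yet

logger.info("third")                  # buffer reaches its capacity and is flushed
after_full = list(central.received)

logger.error("failure")               # high priority: transmitted immediately
after_error = list(central.received)

logger.info("fourth")
cache.flush()                         # transmission on demand
after_demand = list(central.received)
```

Note that `MemoryHandler` flushes the whole buffer when a high priority record arrives, so earlier cached records are sent along with it and ordering is preserved.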
In ACS, logs are directly generated in XML format on the
publishing side and transported as XML strings inside the logging
service. During the ACS 7.x development cycle, we experimented with a
binary transport based on IDL structures, hoping that this would
improve performance. To our surprise, however, we discovered that the
CORBA-encoded structures occupied 3 times as much space as their XML
counterparts, and that this produced performance degradations
of a factor of 3 in C++ and 15 in Java. Needless to say, this
avenue is now closed.
The log API allows for filtering based on the log level, so that
log entries with low priority do not get logged. The filter can for
example be set to log or not log debug messages. The filter level is
determined at run time and can be set through a command or through
external configuration files. This filtering is done at the level of
the container, so that only log messages that pass the filter are
sent out over the network.
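Run-time level filtering of this kind can be illustrated with Python's standard `logging` module. The sketch below is not the ACS API; `ListHandler` is a hypothetical handler that stands in for "sent out over the network", and the logger name is arbitrary.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for the network: collects records that pass the filter."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


sent = ListHandler()
logger = logging.getLogger("filter_demo")
logger.propagate = False
logger.addHandler(sent)

# Initial filter level: debug messages are dropped before reaching the handler.
logger.setLevel(logging.INFO)
logger.debug("entering section")
logger.info("receiver changed frequency")

# The level is changed at run time, for example after a command or after
# re-reading a configuration value given as a string.
logger.setLevel("DEBUG")
logger.debug("leaving section")

print(sent.messages)
```

Because the check happens before the handler is called, a filtered message costs little more than a level comparison.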
The API includes support classes (logging guards) to prevent
message flooding. They allow application developers to control the
logging of repeated error messages. We have decided NOT to implement
an automatic mechanism capable of detecting whether the same message
is being logged at too high a rate, because we think that a reliable
implementation is very difficult to achieve and would in any case
heavily impact performance. But this means that careless developers
can flood the system with logs. We will have to assess later whether
to implement a more sophisticated mechanism.
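A logging guard can be as simple as a small object that the developer consults before logging a message that may repeat. The following Python sketch lets a given message through at most once per time interval. `RepeatGuard` and `log_guarded` are hypothetical names, and the ACS guard classes may follow a different policy.

```python
import logging
import time


class RepeatGuard:
    """Allow a given message at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self._interval = interval
        self._clock = clock          # injectable so the guard can be tested
        self._last_sent = {}

    def allow(self, key):
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._interval:
            return False             # same message logged too recently
        self._last_sent[key] = now
        return True


def log_guarded(logger, guard, level, message):
    """Log `message` only if the guard allows it; return whether it was logged."""
    if guard.allow(message):
        logger.log(level, message)
        return True
    return False


# Usage with a fake clock, so the timing is explicit.
fake_time = [0.0]
guard = RepeatGuard(interval=10.0, clock=lambda: fake_time[0])
logger = logging.getLogger("guard_demo")
logger.propagate = False
logger.addHandler(logging.NullHandler())

results = []
for t in (0.0, 1.0, 5.0, 10.0, 12.0):
    fake_time[0] = t
    results.append(log_guarded(logger, guard, logging.ERROR, "encoder timeout"))
```

The guard is applied explicitly by the developer at the call site, in line with the decision above not to detect flooding automatically.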
A generic CORBA ACS Logging Service is available for any CORBA
application to log even if no specific API is available. The IDL
interface of the CORBA ACS Logging Service provides, on top of the
standard CORBA Telecom Logging Service, methods for each type of log
defined in ACS. Values for the relevant log fields are passed as
parameters and the Service takes responsibility for formatting
them into appropriate transport structures and sending them to the
CORBA Telecom Logging Service.
A user interface (logging GUI client) allows users to monitor the logs
while they are produced, in quasi-real-time, and to browse the
historical database off-line. It provides filtering and sorting
capabilities [RD01 - 6.2.3 Filtering] as well as searching
capabilities. Logs can be filtered by inclusion, by exclusion and by
logical combinations based on any logging field.
The log monitor is implemented as a Java application and it uses
the ALMA Archive interfaces for accessing the historical logs in the
ALMA Archive.
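Filtering by inclusion, exclusion and logical combination on arbitrary log fields can be modeled as small composable predicates. The Python sketch below shows only the principle; the log monitor itself is a Java application, and the field names (`level`, `source`, `message`) and the sample entries are invented for the example.

```python
def field_equals(name, value):
    """Include entries whose field `name` equals `value`."""
    return lambda entry: entry.get(name) == value


def field_contains(name, text):
    """Include entries whose field `name` contains `text`."""
    return lambda entry: text in str(entry.get(name, ""))


def all_of(*predicates):
    return lambda entry: all(p(entry) for p in predicates)


def any_of(*predicates):
    return lambda entry: any(p(entry) for p in predicates)


def exclude(predicate):
    return lambda entry: not predicate(entry)


logs = [
    {"level": "Error", "source": "ANT1", "message": "encoder timeout"},
    {"level": "Debug", "source": "ANT1", "message": "entering move()"},
    {"level": "Error", "source": "RECEIVER", "message": "frequency out of range"},
    {"level": "Info", "source": "ANT2", "message": "antenna set to new target"},
]

# (level is Error OR message mentions "target") AND source is not RECEIVER
wanted = all_of(
    any_of(field_equals("level", "Error"), field_contains("message", "target")),
    exclude(field_equals("source", "RECEIVER")),
)
selected = [entry["message"] for entry in logs if wanted(entry)]
print(selected)
```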
If the Logging System crashes or cannot keep up with log
requests, subsystem performance should not be affected but the log
system should record that logs have been lost. The log system will
run with lower priority than normal application processes, in
particular on the control system.
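One way to meet both requirements (never slow down the caller, but record that logs were lost) is a bounded buffer that drops entries when full and counts the drops. The Python sketch below is a minimal illustration under that assumption; `LossyLogBuffer` is a hypothetical name and not part of ACS.

```python
import queue
import threading


class LossyLogBuffer:
    """Bounded buffer that never blocks the caller and counts dropped entries."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self._lock = threading.Lock()
        self.lost = 0

    def submit(self, entry):
        """Called by the application; returns immediately in all cases."""
        try:
            self._queue.put_nowait(entry)
            return True
        except queue.Full:
            with self._lock:
                self.lost += 1
            return False

    def drain(self):
        """Called by the (lower priority) logging side to collect entries."""
        drained = []
        while True:
            try:
                drained.append(self._queue.get_nowait())
            except queue.Empty:
                break
        with self._lock:
            if self.lost:
                # Record the loss itself as a log entry, then reset the counter.
                drained.append(f"{self.lost} log entries were lost")
                self.lost = 0
        return drained


buffer = LossyLogBuffer(capacity=2)
accepted = [buffer.submit(msg) for msg in ("a", "b", "c", "d")]
drained = buffer.drain()
```

In this sketch the loss notice is appended after the buffered entries, since the dropped entries were submitted after them.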
The logging system is NOT meant to allow observing sequences to be
"re-run" from log messages, but to allow the exact behavior of the
system to be analyzed a posteriori. Since the system will
never be the same twice, it will never be possible to execute twice
exactly the same set of actions. The task of making observing
sequences reproducible must be assigned to the higher level
observation scheduler, while the logging system is to be used to
understand what, if anything, did not go as planned in the observing
sequence and to fix the identified problems.
Debugging logs are always present in code. Multi-level output of
debugging logs can be switched on/off by setting the priority for log
filtering. No compile time debugging flags are necessary, except when
strict real time concerns apply.