Error System
The Error System provides the basic mechanism for applications to
handle internal errors and to propagate information about the failure
of a request from server to client. An error can be reported to the
end user and shown in a GUI window when an action initiated by the
user fails (for example, a command activated from a GUI).
The Error System propagates error messages making use of OO
technology. The basic error reporting mechanism is to throw an
exception.
The error reporting procedure must be the same regardless of
whether the communication is intra-process (local) or inter-process
(remote), synchronous or asynchronous.
Errors can be propagated through the call chain as an ACS
Exception or an ACS Completion to be reported to the action requester
[RD01 - 6.3.6 Scope].
An ACS Completion is used to report both successful and
unsuccessful completion of asynchronous calls. More generally, ACS
Completions are used as a return value or out-parameter wherever it
is not possible or meaningful to throw an ACS Exception.
The ACS Error System provides a means to chain consecutive error
conditions into a linked list of error objects. This linked list is
also called an Error Trace [RD01 - 6.3.2 Tracing]. Not every level is
required to add an entry to the trace, only those that provide useful
information.
Both ACS Exceptions and ACS Completions contain an Error Trace.
A Completion may have no Error Trace when it reports successful
completion.
Representation (see class diagram for details):
- An ACS Error Trace is represented as a recursive CORBA structure,
  i.e. as a structure that contains information about the error and a
  reference to another structure of the same type (the previous Error
  Trace), which builds the linked list. The Error Trace contains all
  information necessary to describe the error and to identify uniquely
  the place where it occurred. In particular it contains an error
  message and context-specific information as a list of (name, value)
  pairs. For example, in the typical case of a file that cannot be
  opened:
  - the error message would be: “File not found”
  - the context information would be: (“filename”, “<name of
    non-existing file>”)
- ACS Exceptions are represented as CORBA Exceptions and contain an
  Error Trace member.
- An ACS Completion is represented as a CORBA structure and contains
  an Error Trace member, which can be NULL in case of successful
  completion.
Figure 3.7: Error System data entities
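The three data entities described above can be sketched in a few lines of Python. This is only an illustration of the relationships, not the ACS IDL or the ACS helper classes: the class names, field names and numeric type/code values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ErrorTrace:
    """One node of the linked list; `previous` is the earlier error level."""
    error_type: int
    error_code: int
    message: str
    context: List[Tuple[str, str]] = field(default_factory=list)
    previous: Optional["ErrorTrace"] = None


@dataclass
class Completion:
    """Result of a call; `trace` stays None for successful completion."""
    completion_type: int
    completion_code: int
    trace: Optional[ErrorTrace] = None

    def is_error(self) -> bool:
        return self.trace is not None


class ACSException(Exception):
    """Exception that always carries an Error Trace member."""

    def __init__(self, trace: ErrorTrace):
        super().__init__(trace.message)
        self.trace = trace


# Hypothetical numeric values, chosen only for this illustration.
FILE_ERROR_TYPE, FILE_NOT_FOUND = 10, 1

file_error = ErrorTrace(
    error_type=FILE_ERROR_TYPE,
    error_code=FILE_NOT_FOUND,
    message="File not found",
    context=[("filename", "/tmp/missing.cfg")],
)
ok = Completion(completion_type=0, completion_code=0)
failed = Completion(FILE_ERROR_TYPE, FILE_NOT_FOUND, trace=file_error)
```

In this sketch a Completion reports an error exactly when it carries a trace, which mirrors the NULL Error Trace member used for successful completion.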
Inside a process, an ErrorTrace, ACSException or Completion is
typically encapsulated in an instance of a Helper class written in
the native language used to implement the process (C++, Java and
Python helper classes are provided directly by the ACS Error System).
Each Helper class implements all operations necessary to manipulate
the corresponding CORBA structure: Error Trace, ACSException or
Completion.
For inter-process CORBA communication, the ACSException and
Completion Helper classes transparently serialize the corresponding
CORBA structure and transfer it to the calling process [RD01 - 6.3.1
Definition]. There it is de-serialized into a new Helper class
instance, if a Helper class is available.
With this approach only CORBA structures are transferred between
servants and clients, and it is not necessary to use Object by Value
to transfer a full-fledged object. For languages where a Helper
class is available, the encapsulation of the Error Trace structure
into the Helper is equivalent to the use of Object by Value. On the
other hand, languages where no Helper class is available can still
manipulate the Error Trace structure directly. Also, Object by Value
support from the ORB is not required.
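The idea of transferring only plain structures and rebuilding a helper on the receiving side can be illustrated as follows. JSON is used here purely as a stand-in for CORBA marshalling, and `TraceHelper`, `to_struct` and `from_struct` are hypothetical names, not part of the ACS API.

```python
import json


class TraceHelper:
    """Hypothetical helper wrapping one Error Trace level."""

    def __init__(self, message, context=None, previous=None):
        self.message = message
        self.context = dict(context or {})
        self.previous = previous

    def to_struct(self):
        # Plain nested data only: no methods travel with it.
        return {
            "message": self.message,
            "context": [[name, value] for name, value in self.context.items()],
            "previous": self.previous.to_struct() if self.previous else None,
        }

    @classmethod
    def from_struct(cls, struct):
        previous = struct["previous"]
        return cls(
            struct["message"],
            dict(struct["context"]),
            cls.from_struct(previous) if previous is not None else None,
        )


server_side = TraceHelper(
    "Cannot load settings",
    {"component": "loader"},
    previous=TraceHelper("File not found", {"filename": "a.cfg"}),
)
wire_data = json.dumps(server_side.to_struct())  # stand-in for CORBA marshalling

received = json.loads(wire_data)                 # a client without a helper stops here
client_side = TraceHelper.from_struct(received)  # a client with a helper rebuilds it
```

A client without a helper class can still read `received` directly as nested data, which corresponds to manipulating the Error Trace structure without Object by Value support.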
In a distributed and concurrent environment, an error condition
is propagated as an Error Trace over a series of processes until it
reaches a client application, which either fixes the problem or logs
it. At any level it is possible to:
- Fully recover from the error. The Error Trace is destroyed and no
  trace remains in the system.
- Close the Error Trace, logging the whole Trace in the Logging
  System and then destroying the Error Trace. The error could not be
  recovered at the higher level responsible for handling it, so it
  goes into the log of anomalous conditions for operator notification
  and later analysis. This typically happens for unrecoverable errors
  or when an error has to be recorded for maintenance or debugging
  purposes. This option is also used to log errors that have been
  recovered but should still be recorded in the log system, for
  example to perform statistical analysis on the occurrence of certain
  recoverable anomalous conditions.
- Propagate the error to the higher level, adding local
  context-specific details to the Error Trace (including the object
  and the line of code where the error occurred). The higher level is
  delegated the responsibility of handling the error condition. This
  strategy implies that important error logs might be lost if the
  higher-level application does not or cannot fulfill its own
  obligation to log error messages (for example because it crashes).
  It is therefore allowed at any time to log an important Error Trace
  (typically at debug level) as a precautionary measure to make sure
  it does not get lost. Logging it at debug level ensures that the
  redundant information is not visible unless the logs are viewed in
  debug mode.
Figure 3.8: propagation of Error Trace
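The three options listed above (recover, close by logging, propagate with added context) can be sketched with ordinary Python exceptions and the standard `logging` module. `TraceError` and the function names are hypothetical and stand in for the ACS helper classes and Logging System.

```python
import logging
from typing import Optional

logger = logging.getLogger("error_system_sketch")


class TraceError(Exception):
    """Hypothetical exception: one Error Trace level plus a link to the previous one."""

    def __init__(self, message, context=None, previous: Optional["TraceError"] = None):
        super().__init__(message)
        self.message = message
        self.context = dict(context or {})
        self.previous = previous

    def levels(self):
        node = self
        while node is not None:
            yield node
            node = node.previous


def open_config(path):
    # Lowest level: the error is detected and the trace is started.
    raise TraceError("File not found", {"filename": path})


def load_settings(path):
    # Propagate: add local context as a new trace level and re-raise.
    try:
        return open_config(path)
    except TraceError as low:
        # Precaution: log at debug level in case the caller never logs it.
        logger.debug("propagating: %s", low.message)
        raise TraceError("Cannot load settings", {"component": "load_settings"}, previous=low)


def start_with_defaults(path):
    # Fully recover: the trace is discarded and nothing is logged.
    try:
        return load_settings(path)
    except TraceError:
        return {"mode": "default"}


def start_strict(path):
    # Close the trace: log every level, then drop it.
    try:
        return load_settings(path)
    except TraceError as top:
        for depth, level in enumerate(top.levels()):
            logger.error("[%d] %s %s", depth, level.message, level.context)
        return None
```

Which of the three handlers applies is a decision of each level; the sketch only shows that the same trace object supports all of them.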
Error Trace, Completion and Exception Helper classes offer
operations for:
- handling the addition of new error levels to the Error Trace while
  an instance is passed from one application layer to the next higher
  one or from a servant to its client
- converting to/from Error Trace, Completion and Exception
- navigating the elements of the Error Trace, i.e. querying for a
  specific error in the chain or for error details
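A minimal sketch of the helper operations listed above (adding a level, converting to and from an exception, and navigating the chain) might look like this. The names `Trace`, `TraceHelper`, `add_level` and `find` are assumptions for illustration and do not reproduce the ACS helper API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass
class Trace:
    error_type: int
    error_code: int
    message: str
    context: Dict[str, str] = field(default_factory=dict)
    previous: Optional["Trace"] = None


class TraceException(Exception):
    def __init__(self, trace: Trace):
        super().__init__(trace.message)
        self.trace = trace


class TraceHelper:
    """Hypothetical helper wrapping a Trace structure."""

    def __init__(self, trace: Trace):
        self.trace = trace

    def add_level(self, error_type, error_code, message, **context) -> "TraceHelper":
        # The new level becomes the head; the old head becomes `previous`.
        self.trace = Trace(error_type, error_code, message, dict(context), self.trace)
        return self

    def __iter__(self) -> Iterator[Trace]:
        # Navigate from the most recent level down to the original error.
        node = self.trace
        while node is not None:
            yield node
            node = node.previous

    def find(self, error_type, error_code) -> Optional[Trace]:
        # Query the chain for a specific (type, code) pair.
        for node in self:
            if (node.error_type, node.error_code) == (error_type, error_code):
                return node
        return None

    def to_exception(self) -> TraceException:
        return TraceException(self.trace)

    @classmethod
    def from_exception(cls, exc: TraceException) -> "TraceHelper":
        return cls(exc.trace)


helper = TraceHelper(Trace(10, 1, "File not found", {"filename": "a.cfg"}))
helper.add_level(20, 3, "Cannot load settings", component="loader")
```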
Each Completion and each Error Trace is identified uniquely by a pair
(Completion/Error Type, Error Code). If a Completion contains an
Error Trace, it represents an error and therefore its Completion
Type is the same as the Error Type of the top-level Error Trace
structure.
Errors and Completions are specified through XML specification
files:
- The XML specification files allow defining context value pairs.
- Code generation is used to generate type-safe helper classes in all
  supported languages from the XML specification files.
- Getters and setters are generated for all context value pairs.
- Each (Completion/Error Type, Error Code) pair is associated in the
  XML specification files with documentation that describes the error
  and the recovery procedures to be taken [RD01 - 6.3.5
  Configuration].
- Help handling is implemented as links to XML help pages.
- The Error System provides an API to access error documentation.
  (Not implemented yet)
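The generation of helper classes with getters and setters from an XML specification can be illustrated with a small sketch. The XML element and attribute names below are invented for this example and are not the actual ACS specification schema; the ACS generator also produces source code per language, whereas this sketch builds classes at run time.

```python
import xml.etree.ElementTree as ET

# Hypothetical specification format, invented for this sketch.
SPEC = """
<ErrorType name="FileErrors" type="10">
  <Error name="FileNotFound" code="1" description="File not found">
    <Member name="filename"/>
  </Error>
</ErrorType>
"""


def _make_accessors(member):
    # Build one getter/setter pair bound to a single context member.
    def getter(self):
        return self.context[member]

    def setter(self, value):
        self.context[member] = value

    return getter, setter


def _init(self):
    Exception.__init__(self, type(self).description)
    self.context = {}


def generate_helpers(xml_text):
    """Return a dict mapping each error name to a generated exception class."""
    root = ET.fromstring(xml_text.strip())
    error_type = int(root.get("type"))
    helpers = {}
    for error in root.findall("Error"):
        namespace = {
            "__init__": _init,
            "error_type": error_type,
            "error_code": int(error.get("code")),
            "description": error.get("description"),
        }
        for member in error.findall("Member"):
            name = member.get("name")
            getter, setter = _make_accessors(name)
            namespace["get_" + name] = getter
            namespace["set_" + name] = setter
        helpers[error.get("name")] = type(error.get("name"), (Exception,), namespace)
    return helpers


FileNotFound = generate_helpers(SPEC)["FileNotFound"]
error = FileNotFound()
error.set_filename("a.cfg")
```

Because each context member gets its own accessor, a misspelled member name fails immediately with an `AttributeError` instead of silently creating a new context entry.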
An Error Editor GUI tool is used by application developers to define
new (Completion/Error Type, Error Code) pairs and to manage them
(insert, delete, edit the description) without having to edit the XML
specification files by hand. The tool ensures that (Completion/Error
Type, Error Code) pairs are unique in the system and generates one
XML specification file for each Completion/Error Type. (Partially
implemented)
User interface tools allow navigating through the error logs and
through the levels of the Error Trace [RD01 - 6.3.4 Presentation].
The logging display GUI is used to browse errors and a specific Error
display tool is used to browse the Error Trace corresponding to a
single error. (Partially implemented)
It is important to take into account that exceptions can skip
several layers in the call tree if there is no adequate catch clause.
In this case the Error Trace will not contain a complete record of
the call stack.
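The effect of a missing catch clause on the trace can be seen in a short sketch. The three layer names and the `LayerError` class are hypothetical; the point is only that a layer which does not catch and re-throw leaves no entry in the trace.

```python
class LayerError(Exception):
    """Hypothetical exception recording which layer added this trace level."""

    def __init__(self, layer, previous=None):
        super().__init__(layer)
        self.layer = layer
        self.previous = previous


def trace_layers(error):
    # Walk the chain from the newest level to the oldest.
    layers = []
    while error is not None:
        layers.append(error.layer)
        error = error.previous
    return layers


def device():
    raise LayerError("device")


def driver():
    # No catch clause here: the exception passes straight through,
    # so "driver" never appears in the trace.
    return device()


def controller():
    try:
        return driver()
    except LayerError as low:
        raise LayerError("controller", previous=low)


try:
    controller()
except LayerError as top:
    recorded = trace_layers(top)
```

Here `recorded` holds the controller and device levels only, although the call stack went through `driver` as well.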
Particular attention must go into properly declaring exceptions
in the signature of methods and in catching them; in C++, specifying
exceptions in the signature of a method which then throws an
exception not included in that specification can cause the process to
abort at run time. The alternative approach, i.e., not
specifying any exceptions at all in the signature, cannot be used for
CORBA methods. Therefore the ACS error system documentation contains
guidelines to follow in implementing proper and safe error/exception
handling, in particular in C++.