3 APPLICATION SOFTWARE REQUIREMENTS

This chapter defines requirements applicable to the development of all VLT application software. Unless otherwise stated they are applicable to WS and LCU software.

For software development, ESO provides VLT Common Software and VLT Standards. The software shall make use of the provided VLT Common Software and Standards to the largest possible extent. Equivalent but different implementations of the functionality provided will not be accepted.

Due to the long elapsed time and considering that the VLT project is still under development, neither the VLT Common Software nor the VLT Standards can be frozen at the beginning of this project. The project shall make use of the most recent concepts and tools as ESO makes them available, provided they are still compatible with the project phase: behaviour definition for analysis, interface definition for design, code for implementation, etc. For instance, it will be mandatory to use:

• during implementation, a newer coding standard that was not available at the beginning of the project but became available before implementation started

• during design, a library whose code is not yet available but whose callable interface has been defined.

If not explicitly defined, methodologies and development rules shall be derived from the existing VLT Standards, keeping the overall development as homogeneous as possible.

ESO will try to keep backward compatibility on released software. There are normally two releases per year.

The VLT Common Software is described by User Manuals. For the parts that are not yet available and for future extensions to the existing parts, ESO will provide in due time, ahead of the release, other documents (functional specifications, etc.) that describe the functionality to come.

Exceptionally and with ESO approval, the adoption of a newer release can be temporarily delayed if critical activities are ongoing.

3.1 General Requirements

1. All motions of mechanical devices, e.g. moving a motor, or other important actions on a hardware device, e.g. integration of a detector, switching on a lamp etc., shall be logged when the action has finished. Example of log:

"Guideprobe has moved to position 2000".

Exceptions to this rule shall be made for devices moving continuously, e.g. a tracking axis, as this would saturate the logging system. In this case the start and stop of the continuous motion shall be logged, e.g. start tracking/stop tracking.
If discrete motions occur frequently, e.g. more often than every 10 seconds, it must be possible to disable/enable the logging of these particular events.
Additionally, processes shall log when they start executing and when they terminate. Exceptions to this rule are tasks that perform a specific short action and then terminate. In this case it is better to log the action performed. A typical example is a task executed by the LCU command interpreter that moves a piece of hardware, waits for the movement to finish and then terminates.

2. To fulfil the previous requirement, logs shall be grouped into categories defined by log id:

• log ids below 100 are reserved for CCS/LCC internal use, see [1] and [2].

• logs shall be grouped into one of the following types:

    • Normal user logs as specified above

    • Normal user logs which can occur frequently (more often than every 10 seconds)

    • Maintenance logs

    • Test/debug logs used during testing/debugging

A future release of LCC will provide a possibility to enable/disable logs with a certain log id or within a certain range of log ids.
The log ids shall be specified in such a way that it is simple to change a log from one group to another.

3. For processes or procedures waiting for some event, time-out shall be used to avoid hang-ups and dead-lock situations.

4. Processes or procedures activated periodically shall be monitored to check that the process or procedure is still alive. It can be done in the following ways:

a. LCU processes shall use a VxWorks watchdog (VxWorks function wdStart). This watchdog shall be configured to call a routine when the time-out expires. By calling wdStart again from the monitored process with a period shorter than the time-out, the watchdog timer is restarted and prevented from expiring. The routine called by the watchdog when it expires shall inform the system that the monitored process is not performing as expected.

b. WS processes shall use RTAP watchdogs to perform the same functionality.

5. Values returned from subroutine and procedure calls shall be checked for error.

6. The access to processes shall be restricted using the booking system to prevent unauthorized access.

7. Applications must support reconfiguration of the booking configuration. For example, part of a VLT environment may be reassigned to another user during operation, e.g. to a maintenance engineer, while the observer continues working in a reduced mode, e.g. taking darks or calibrating. When the observer ends the session, the software must cope with the fact that part of the environment is no longer available, and must not block when it tries to access and shut down the part that has been reassigned to another user.

8. Each software module shall supply one or more branch configuration files, as defined in the CCS On Line Database Loader User Manual [3], defining the database branch it uses.

9. With each WS or LCU environment a DATABASE.db file defining its database shall be supplied, see also section 4.5.

10. Each software module shall supply an error definition file listing all errors reported by the module, see section 3.2.2 Error/Failure Reporting on page 27.

11. With each WS/LCU system all alarms referring to the subsystem shall be documented, see section 3.2.2 Error/Failure Reporting on page 27.

12. Function interfaces shall not be too complex, e.g. not have too many parameters. In particular LCU functions shall not have more than 10 parameters to make it possible to call them from the VxWorks shell.
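The log id grouping of requirement 2 can be sketched in C as follows. This is a minimal illustration, not the LCC interface: the group size, the layout of the id ranges, the example log names and all identifiers (LOG_ID, logGroupOf, logSetGroupEnabled, logIsEnabled) are assumptions made for the example. Only the reserved range below 100 and the four log types come from the text above.

```c
#include <stdbool.h>

/* Ids below 100 are reserved for CCS/LCC (from the requirement).
 * Everything else here is a hypothetical layout for illustration. */
#define LOG_FIRST_USER_ID  100
#define LOG_GROUP_SIZE     1000   /* assumed number of ids per group */

typedef enum {
    LOG_GROUP_NORMAL = 0,    /* normal user logs */
    LOG_GROUP_FREQUENT,      /* user logs occurring more often than every 10 s */
    LOG_GROUP_MAINTENANCE,   /* maintenance logs */
    LOG_GROUP_TEST,          /* test/debug logs */
    LOG_GROUP_COUNT
} LogGroup;

/* A log id is built from its group and an offset inside the group, so
 * moving a log to another group means changing one argument. */
#define LOG_ID(group, offset) \
    (LOG_FIRST_USER_ID + (int)(group) * LOG_GROUP_SIZE + (offset))

/* Hypothetical application logs. */
#define LOG_GUIDEPROBE_MOVED  LOG_ID(LOG_GROUP_NORMAL, 1)
#define LOG_CHOPPER_STEP      LOG_ID(LOG_GROUP_FREQUENT, 1)

/* Test/debug logs are assumed to be disabled by default. */
static bool groupEnabled[LOG_GROUP_COUNT] = { true, true, true, false };

/* Return the group of a log id, or -1 for reserved or unknown ids. */
int logGroupOf(int logId)
{
    int group;

    if (logId < LOG_FIRST_USER_ID)
        return -1;
    group = (logId - LOG_FIRST_USER_ID) / LOG_GROUP_SIZE;
    return (group < LOG_GROUP_COUNT) ? group : -1;
}

/* Enable or disable all logs of one group. */
void logSetGroupEnabled(LogGroup group, bool enabled)
{
    groupEnabled[group] = enabled;
}

/* True if an application log with this id should currently be written. */
bool logIsEnabled(int logId)
{
    int group = logGroupOf(logId);
    return group >= 0 && groupEnabled[group];
}
```

With this layout, reclassifying the hypothetical LOG_CHOPPER_STEP as a normal log only requires changing its group argument, and frequent logs can be silenced with a single call to logSetGroupEnabled.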

3.1.1 General Requirements only applicable to LCU Software

1. Each LCU shall be able to work autonomously, i.e. it shall not depend on any other LCU or workstation to be able to work within the scope of its normal, local control. An exception to this is the boot and setup phase, e.g. when the LCU boots and applications are started. During this phase it accesses the database definition files, CDT and CIT files and files in the environment directory.

2. All relevant signals connected to an LCU (voltage levels, switch positions etc.) must be available as input signals, easy to read via standard I/O functions and LCC/CCS commands.

3. All moving or adjustable devices shall be available for individual, independent movement (adjustment) commands for test and maintenance purposes.

4. Each LCU software module shall supply a script file for the downloading and the necessary initialisation of the software on the LCU, see also [8]. The module script shall be installed in the LCU environment boot script with the tool vccConfigLcu. This boot script shall be configured to install the software required by the LCU subsystem.

5. With each LCU subsystem all safety events and interlocks connected to the subsystem shall be documented, see section 3.3 Safety on page 28.

6. With each LCU subsystem there shall be a device file listing all software devices controlled by the LCU. This file is used by the LCU Management functions described in the LCC User Manual [2].

7. At least 30% (up to 1 MByte) of the available memory on the LCU shall be reserved for future extensions.

3.1.2 Application Data

1. All configuration parameters shall be stored in the local database. It is not acceptable to have hardcoded values.

2. Parameters set by a command, like position command to a motor, command state of a shutter (open, close), integration time of a detector etc., shall be stored in the database and shall not be modified until a new command to this process is received. This is called the 'set value' of the parameter. Parallel to this parameter there shall be a parameter in the database for the real status of the unit (position, remaining integration time etc.). This is the 'actual value' of the parameter.

3. Parameters and status values shall be stored in the local database. Application programs having local copies of database parameters must make sure that there is no inconsistency between the database and local variables. If the local copy is updated, then the corresponding database attribute must be updated as well. To avoid such inconsistencies it is recommended not to use local copies.

The LCU local database shall have a complete status of the subsystem the LCU is controlling, to enable monitoring of the system through the database. This implies that the database is updated when the hardware status is read or the software status changes.

4. For actions like moving a motor, opening a shutter, making an exposure etc., there shall be a status parameter indicating the state of the action, e.g. moving/initialising/ready.

5. Direct addresses to the local database, see CCS/LCC User Manual [2], must be resolved again with the CCS/LCC function dbGetDirAddr when the database has been reloaded, after a reboot of the LCU, or after a restart of the WS environment.
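The relation between 'set value', 'actual value' and the status parameter of an action (requirements 2 and 4) can be illustrated with a small C sketch. The structure below is an in-memory stand-in for the corresponding database attributes; a real application would keep these values in the local database. All type and function names are hypothetical.

```c
/* State of the action, as asked for by requirement 4. */
typedef enum { DEV_INITIALISING, DEV_MOVING, DEV_READY } ActionState;

/* Stand-in for three database attributes of one motor. */
typedef struct {
    double      setValue;     /* last commanded position */
    double      actualValue;  /* position read back from the hardware */
    ActionState state;        /* state of the current action */
} MotorRecord;

/* A new command is the only place where the set value changes. */
void motorCommandMove(MotorRecord *m, double target)
{
    m->setValue = target;
    m->state = DEV_MOVING;
}

/* Called whenever the hardware status is read: updates the actual value
 * and the action state, but never touches the set value. */
void motorUpdateFromHardware(MotorRecord *m, double readback, double tolerance)
{
    double diff = m->setValue - readback;

    if (diff < 0.0)
        diff = -diff;
    m->actualValue = readback;
    if (m->state == DEV_MOVING && diff <= tolerance)
        m->state = DEV_READY;
}
```

Keeping the two values separate lets a monitoring panel show both where the device was asked to go and where it currently is.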

3.1.3 Communication Protocol

1. All process to process communication shall use the Message System. Data exchange can also be done through the database. UNIX/VxWorks message queues and signals shall not be used. VxWorks semaphores can be used for mutual exclusion.

2. The destination process must be prepared to start executing any command within a "command response time" (similarly to an interrupt response time) of 100 milliseconds (maximum value for VxWorks, typical on Unix). This value has to be considered as a generic requirement when no more demanding specific requirements for higher command rates exist.

3. The command response time defines also implicitly the maximum time for the execution of a synchronous action following a command. To execute actions which take longer time, techniques like spawning and/or forking shall be used, so that the main process is ready to accept in time the next command.

4. For every command at least one reply shall be returned, when the requested action has been completed. In case of failure, the reply shall report back an error (error reply). It shall not be possible for the sender to ask the destination not to get a reply.

5. Periodic or multiple replies can also be a consequence of a command. There shall always be an indication of the last reply. Sender programs have to be prepared to accept a variable number of replies, up to when the last reply arrives.

6. Timeouts on replies to commands should have values greater than 1 second, typically double the normal execution time of the destination process (for actions up to about 20 seconds) or the normal execution time plus 20-30 seconds (for longer actions). For commands sent over a network the network time-out must also be considered (up to 10 seconds for the TCP/IP protocol).
Documentation of destination processes must include the necessary time-out values for their commands.

7. Sender programs shall normally enforce timeouts when waiting for a reply, to avoid hanging. (There might be exceptions, e.g. when time is passed as a parameter and can be modified like in case of an exposure.)

8. Periodic replies will typically be used for start-monitoring commands, where for example some kind of status could be returned from the time the command is given onwards.

9. One reply only is the guideline for other commands, except when Verbose mode is enabled.

10. The normal way for a user to follow the development of an action following a command, is to monitor the effects of it via a user interface linked to the real-time database. For example a user wanting to move the telescope will monitor telescope coordinates on a specific panel.

11. Additionally it is recommended that the main destination programs support the VERBOSE command (see common commands). In this mode intermediate replies and/or additional log messages shall be inserted before the last reply. These conditional intermediate replies are meant to give a user who chooses to work in verbose mode additional information about the current activity, and they should only occur every few seconds. They therefore only make sense for long commands. Typically a first intermediate reply should be given after the command activation time, as soon as a command has been accepted for execution. All these additional replies shall start with 'VERBOSE', so that it is clear to the receiver that they are not normal replies.

12. The use of a Verbose command in a high level module shall not imply that this command is propagated to lower level modules. So the meaning of Verbose is local to the process to which it is applied. In other words testing a WS software module in verbose is one activity and one gets certain intermediate replies at that level. This is different from testing an LCU software module in Verbose mode, which assumes different kind of needs at a lower level. So sending a Verbose command to the WS software, should not result in having it sent also to the (possibly related) LCU software.

13. Each received command shall be checked for formal correctness (e.g. syntax of command, range of parameters) and checked against the present status and all relevant conditions (including input signals and local database parameters of the subsystem) before it is accepted for execution.
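The rule of thumb for reply timeouts in requirement 6 can be written down as a small C function. This is only a sketch of one possible reading: the floor of 2 seconds (the text only asks for more than 1 second), the choice of the lower end of the 20-30 second margin, and adding the full 10 second network allowance are assumptions, as is the function name.

```c
#define MIN_REPLY_TIMEOUT_S    2.0   /* assumed floor; the rule asks for > 1 s */
#define SHORT_ACTION_LIMIT_S  20.0   /* "actions up to about 20 seconds" */
#define LONG_ACTION_MARGIN_S  20.0   /* lower end of the 20-30 s margin */
#define NETWORK_TIMEOUT_S     10.0   /* upper bound quoted for TCP/IP */

/* Suggest a reply timeout from the normal execution time of a command. */
double replyTimeoutSeconds(double normalExecSeconds, int overNetwork)
{
    double timeout;

    if (normalExecSeconds <= SHORT_ACTION_LIMIT_S)
        timeout = 2.0 * normalExecSeconds;          /* double for short actions */
    else
        timeout = normalExecSeconds + LONG_ACTION_MARGIN_S;

    if (timeout < MIN_REPLY_TIMEOUT_S)
        timeout = MIN_REPLY_TIMEOUT_S;
    if (overNetwork)
        timeout += NETWORK_TIMEOUT_S;               /* allow for network time-out */
    return timeout;
}
```

For a local command that normally takes 5 seconds this gives 10 seconds; for a 60 second action it gives 80 seconds. The values actually used must still come from the documentation of the destination process.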

3.1.4 Commands

1. For each process accepting commands, a Command Definition Table (CDT) containing a list of all commands with the corresponding parameters with their range shall be provided. The list shall also indicate which parameters are optional. For each command there shall also be a reference to a help text explaining the command and its parameters. The file shall be used by processes to make a static check of the parameter values. The syntax of this file is defined in the LCC User Manual [2].

2. Repetition of commands shall not give errors, e.g. a command to open a door shall be accepted with no action and not return an error if the door is already open. The same applies for state transition commands, e.g. the command ONLINE shall be accepted with no action if the system is already in the state ONLINE.
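Requirement 2 amounts to making commands idempotent. A minimal C sketch for the door example follows; the types, the actuation counter and the function name are hypothetical and only serve to show that a repeated command is accepted without acting on the hardware again.

```c
typedef enum { DOOR_CLOSED, DOOR_OPEN } DoorState;
typedef enum { REPLY_OK, REPLY_ERROR } ReplyCode;

typedef struct {
    DoorState state;
    int       actuations;   /* how often the hardware was actually driven */
} Door;

/* Handle an "open door" command. */
ReplyCode doorOpen(Door *door)
{
    if (door->state == DOOR_OPEN)
        return REPLY_OK;     /* already open: accept, take no action */

    door->actuations++;      /* stand-in for driving the hardware */
    door->state = DOOR_OPEN;
    return REPLY_OK;
}
```

Sending the command twice returns a normal reply both times, while the hardware is driven only once.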

3.1.4.1 Commands on LCUs

1. The LCU shall have both elementary commands (e.g. move motor to encoder value) and complex commands where SI units (meters, kilograms and amperes and derived units and subunits e.g. volt, mm etc.) are used. A complex command can execute one or several elementary commands.
Units of parameters shall be indicated in the Command Definition Table (CDT).
Units of database attributes shall be indicated as a comment in the database definition files, dbl class definitions etc.

2. Normally no direct commands can be exchanged between LCUs. There are exceptions, which have to be agreed upon case by case with ESO, as normally synchronization problems can be solved with the use of the Time Reference System, see section 3.1.5.

3. All functions available as a command shall normally also have a programmatic interface (procedure call). Exceptions to this have to be agreed with ESO.

4. Normally an LCU is not supposed to have any knowledge about other LCUs or systems. If an LCU needs such information it should be provided from a supervising node. Normally all configuration parameters shall be initialized at startup and other parameters shall be sent down to the LCU with a command using the Message System. No direct access shall be made from the LCU to the database of another node.
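The split between elementary and complex commands in requirement 1 can be sketched as follows: the complex command takes a position in millimetres and executes the elementary command that works in encoder counts. The scale factor and limits are held in a configuration structure rather than hardcoded, in line with section 3.1.2. All names and the example values are hypothetical.

```c
/* Axis configuration and state; in a real LCU these values would be
 * configuration parameters in the local database. */
typedef struct {
    double countsPerMm;       /* encoder resolution */
    long   minCounts;         /* lower software limit */
    long   maxCounts;         /* upper software limit */
    long   commandedCounts;   /* last accepted encoder target */
} Axis;

/* Round to the nearest integer without needing the math library. */
static long roundToLong(double x)
{
    return (long)(x >= 0.0 ? x + 0.5 : x - 0.5);
}

/* Elementary command: move to an encoder value. Returns 0 or -1. */
int axisMoveEncoder(Axis *axis, long counts)
{
    if (counts < axis->minCounts || counts > axis->maxCounts)
        return -1;            /* out of range: reject before acting */
    axis->commandedCounts = counts;
    return 0;
}

/* Complex command: move to a position given in mm (SI subunit). */
int axisMoveMm(Axis *axis, double positionMm)
{
    return axisMoveEncoder(axis, roundToLong(positionMm * axis->countsPerMm));
}
```

With an assumed resolution of 100 counts per mm, a complex command for 20 mm results in the elementary command for encoder value 2000.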

3.1.4.2 LCU Standard Commands

The commands defined in this section are mandatory and shall be accepted by each process accepting commands.

Each LCU process accepting commands (software device) shall accept the standard commands, including one parameter containing the name of the software device to control (one process can control one or several software devices):

• INIT: initialize the software and the hardware device(s) it is controlling. The command must be accepted in the state LOADED and can also be accepted in the states STAND-BY and ON-LINE. This command shall be followed by the command STANDBY or ONLINE to terminate in the state STAND-BY or ON-LINE.

• STATE: return the main state of the software device. The state of a software device shall be one of:

OFF: the software device is not operational (e.g. software not loaded, hardware not available). This state cannot be returned by the software device as it is not running.

LOADED: software is loaded and running but some software and/or hardware is in an undetermined state (= not initialised). In this state no activity can take place, except starting the system (= go to a higher state). It is, however, possible to perform test and maintenance activities in the state LOADED, e.g. moving hardware in maintenance mode.

STAND-BY: software loaded and initialized, hardware initialized but in stand-by (applicable parts will be switched off, brakes clamped etc.).

ON-LINE: software loaded and initialized, hardware initialized and fully operational. The software is ready to receive and act immediately on commands. This is the normal state for operation.

A software device or an LCU can on its own initiative only change its state from higher to lower, where ON-LINE is the highest state and OFF is the lowest. It can only change its state to a higher state as a consequence of a command, e.g. the commands STANDBY or ONLINE.

• STATUS: return the relevant status of the software device and the hardware device(s) it is controlling. Examples of reported status are positions of motors, limit switches etc.

• STANDBY: set the software device in stand-by state; possible actions are: go to stand-by position, switch off lamps, close shutters, engage brakes etc.; the proper state of a software device in stand-by state has to be specified for each software device of a subsystem. Initial state can be LOADED or ON-LINE. If the initial state is LOADED, the software device has to be initialized if not already done.

• ONLINE: take the software device from LOADED or STAND-BY state to normal operating (ON-LINE) state. If the initial state is LOADED, the software device has to be initialized if not already done.

• OFF: switch off the software device; the hardware controlled by the software device is switched off (power off or disable) and if it has a brake, the brake is engaged. Puts the software device in the state LOADED.

• SELFTST: perform a self-test of the software device. The test shall be performed on level (a) as defined in section 3.4.1 requirement 2.

• TEST: perform a complete test of the software device and the hardware devices it is controlling. The test shall be performed on level (b) as defined in section 3.2.6 requirement 2.

• SIMULAT: start simulation of the hardware device controlled by the software device (hardware simulation)

• STOPSIM: stop simulation of the software device; stop simulation shall put the software device in non-initialized state (= LOADED)

• STOP: stop the current action performed by the software device; no change of main state.

• VERBOSE: put the software device in verbose mode, see section 3.1.3, requirement 11.

• VERSION: retrieve the version of the application software. Shall return the software version and the date of the last compilation of the module. If RCS is used for version control the RCS header shall be used for the version. It can be implemented with the following code:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

static char *rcsId="@(#) $Id$";

static void *use_rcsId = (&use_rcsId,(void *) &rcsId);

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

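ccsCOMPL_STAT xxxVersion(
    OUT char     *version,    /* buffer receiving the version string */
    OUT ccsERROR *error       /* error structure, not used here */
)
{
    /* __DATE__ and __TIME__ are not expanded inside a string literal, */
    /* so they are concatenated with the RCS revision keyword instead. */
    sprintf(version, "@(#) xxx $Revision$, " __DATE__ ", " __TIME__);

    return SUCCESS;    /* assumes the SUCCESS value of ccsCOMPL_STAT */
}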

• EXIT: terminate the process.

and additionally the special commands:

• KILL: terminate the process

• BREAK: stop the current action performed by the process.

The BREAK and KILL commands are special in the sense that they are not delivered as commands to the destination process. Instead they trigger signal handlers for BREAK and KILL. They are analogous to the commands STOP and EXIT.
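The state rules of this section (ordered states, upward transitions only on command, autonomous transitions only downward) can be sketched as a small state machine in C. This is an illustration under assumptions, not the LCC implementation: the type and function names are hypothetical, INIT and the simulation commands are left out for brevity, and initialisation is reduced to a flag.

```c
/* Main states in ascending order: OFF is lowest, ON-LINE highest. */
typedef enum { ST_OFF = 0, ST_LOADED, ST_STANDBY, ST_ONLINE } DevState;

typedef enum { STD_STANDBY, STD_ONLINE, STD_OFF } StdCommand;

typedef struct {
    DevState state;
    int      initialised;
} SwDevice;

/* Handle a state-changing standard command. Returns 0 if accepted. */
int devHandleCommand(SwDevice *dev, StdCommand cmd)
{
    if (dev->state == ST_OFF)
        return -1;                    /* software not running */

    switch (cmd) {
    case STD_STANDBY:
    case STD_ONLINE:
        if (!dev->initialised)
            dev->initialised = 1;     /* initialise first when coming from LOADED */
        dev->state = (cmd == STD_STANDBY) ? ST_STANDBY : ST_ONLINE;
        return 0;                     /* a repeated command is accepted as well */
    case STD_OFF:
        dev->state = ST_LOADED;       /* OFF puts the device in state LOADED */
        dev->initialised = 0;
        return 0;
    }
    return -1;
}

/* State change on the device's own initiative, e.g. after a failure:
 * only a change to a lower state is allowed. */
int devAutonomousChange(SwDevice *dev, DevState target)
{
    if (target >= dev->state)
        return -1;                    /* not a lowering: needs a command */
    dev->state = target;
    if (target <= ST_LOADED)
        dev->initialised = 0;
    return 0;
}
```

A device that has dropped from ON-LINE to STAND-BY on its own stays there until it receives an ONLINE command.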

3.1.4.3 WS Standard Commands

Workstation software units shall accept the standard commands:

• INIT: to initialize the software unit

• STATUS: return the relevant status of the software unit

• STATE: returns the state of the software unit. The state of a software unit shall be one of:

OFF: the software unit is not operational, e.g. software not running. This state cannot be returned by the software unit as it is not running.

LOADED: software is running but some software and/or hardware is in an undetermined state (= not initialised). In this state no activity can take place, except starting the system (= go to a higher state). It is, however, possible to perform test and maintenance activities in the state LOADED, e.g. moving hardware in maintenance mode.

STAND-BY: all units controlled by this unit shall be in STANDBY. Monitoring activities can take place.

ON-LINE: the software unit is fully operational and ready to receive and act immediately on commands. This is the normal state for operation.

• STANDBY: to set the software unit in STAND-BY state

• ONLINE: to set the software unit in ON-LINE state

• OFF: to set the software unit in LOADED state

• SELFTST: to perform a self-test of the software unit

• SIMULAT: to start simulation of the software controlled by this software unit (software simulation)

• STOPSIM: to stop the simulation of software

• STOP: stop the current action performed by the software unit

• VERBOSE: put the software unit in verbose mode, see section 3.1.3, requirement 11.

• VERSION: to retrieve the version of the application software

and additionally the special commands:

• KILL: terminate the process

• BREAK: stop the current action performed by the process.

The BREAK and KILL commands are special in the sense that they are not delivered as commands to the destination process. Instead they trigger signal handlers for BREAK and KILL.

3.1.4.4 Optional LCU Commands

A process controlling hardware device(s) shall also accept the following specific commands for each device if the corresponding functionality is required. The commands are optional and must only be accepted if the corresponding functionality is needed.

• CHECK: check if the last action has finished

• WAIT: wait for the last action to finish

• DISABLE: disable hardware access of the device, can be used if the hardware device is not available (disconnected or dismounted). If the software device receives a command to act on the hardware device in a way which is not possible, e.g. move it, it shall return an error.
The 'higher level' process using the device must know that the device is disabled and shall not send any command to the device. It must then decide if it can continue without that device or not.

• FIXED: to fix a hardware device in a fixed position. If the software device receives a command to act on the hardware device in a way which is not possible, e.g. move it, it shall return an error. It shall still return status like position etc.
The 'higher level' process using the device must know that the device is fixed in a certain position and shall not send any command to move the device. It must then decide if it can continue with the device in that position or not.

• ENABLE: enable hardware access of the device (this is the normal state)

• MOVE: move the device to a given absolute position

• MOVEREL: move the device to a position relative to the current position

• SET: set a device in a specified state; examples: set lamp on, set shutter open etc.

• HANDSET: set the device in handset mode

• HSETOFF: exit handset mode.
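The behaviour required for DISABLE, FIXED and ENABLE can be sketched in C as follows. The names are hypothetical, and treating a status request on a disabled device as an error is an assumption: the text only states that a fixed device shall still return its status.

```c
typedef enum { ACC_ENABLED, ACC_DISABLED, ACC_FIXED } AccessMode;

typedef struct {
    AccessMode mode;
    double     position;
} HwDevice;

/* MOVE: only possible when hardware access is enabled. */
int hwMove(HwDevice *dev, double position)
{
    if (dev->mode != ACC_ENABLED)
        return -1;            /* disabled or fixed: return an error */
    dev->position = position;
    return 0;
}

/* Status request: a fixed device still reports its position. */
int hwGetPosition(const HwDevice *dev, double *position)
{
    if (dev->mode == ACC_DISABLED)
        return -1;            /* assumed: no hardware access at all */
    *position = dev->position;
    return 0;
}

/* DISABLE, FIXED and ENABLE only switch the access mode. */
void hwSetMode(HwDevice *dev, AccessMode mode)
{
    dev->mode = mode;
}
```

The error return is a safety net only; the higher level process is still expected to know the mode of the device and not to send such commands in the first place.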

3.1.5 Time System

1. Synchronization between processes shall be achieved by using the Time Reference System. Synchronization between two LCU or WS processes on the same CPU or on different CPUs can be done by sending them a specific command to start an action at a particular time, e.g. schedule a process to run at a particular time.

2. All time information required shall be obtained from the Time Reference System through Time Handling. The purpose of this is to have all activities synchronized to UTC.

3.1.6 LCU Drivers

1. All LCU hardware shall always be accessed via a driver. For standardized hardware interfaces there will normally be a driver supplied by ESO. Standard interface boards are defined in the Electronic Design Specification [10]. If there is no driver available for a specific interface then one should be developed. Drivers shall be developed according to the conventions defined in the Driver Development Guide [4].

2. A driver shall conform to the VxWorks driver interface, that is the driver shall supply the calls create, delete, open, close, read, write and ioctl if applicable. Additionally there shall be a routine to install the driver and one routine to install the I/O devices. I/O device configuration parameters shall be stored in the device descriptor table.

3. The driver shall control the access of the hardware in such a way that two processes can't access it simultaneously. This shall be implemented with the open call in such a way that only one process can open an I/O device in write or read/write mode and only a process which has opened the device in write or read/write mode has the permission to write to the device. To open the device in read mode and to read from the device is not restricted.

4. There must however be a way to bypass the write restriction. If for example the process which has opened the I/O device in write mode is hanging, it is not possible to write to the device any more. To overcome this there shall be a command to the driver to force closing of the device. After that it is possible to reopen the device in write mode again. This command must only be used to recover from such a deadlock.

5. The driver shall control the access, provided by the open/close calls giving the following possibilities (read accesses are always permitted for a process which has opened the I/O device in read or read/write mode):

• Exclusive read/write access: write access is granted only to the process that performed the open with this option. Requesting Exclusive access to a device already opened in Exclusive or Shared access mode gives back an error.

• Shared read/write access: write access is granted to the process that performed the open and to any other process that later on will request a similar open. Requesting Shared access to a device already opened in Exclusive access mode gives back an error.

• Test read/write access: write access is granted to the process that performed the open with this option without limitation to the present opening status. Requesting Test access to a device already opened in Test access mode gives back an error, i.e. Test access mode is only granted to one process. This mode shall only be used by test and maintenance software. The idea is that test and maintenance software can access the driver without interfering with other processes using the driver.

• Read access: only ioctl read commands are granted to the process that performed the open with this option. Requesting read access to a device can be done without limitation to the present opening status.

6. The driver shall check all relevant error conditions and return errors to the calling process.

7. The drivers shall normally interface to one piece of hardware (one VME board) on a fairly low level, mirroring the hardware functionality. On top of the driver there shall be intermediate routines with a general hardware independent interface (examples of this are LCC I/O handling, motor control module etc.). These intermediate routines are called from high-level software implementing the functionality of the LCU subsystem.
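The open arbitration of requirement 5, together with the forced close of requirement 4, can be sketched in C as below. This only models the bookkeeping: a real VxWorks driver would implement it inside its open and close routines and would also record which process holds each access. All names are hypothetical.

```c
typedef enum { OPEN_READ, OPEN_SHARED, OPEN_EXCLUSIVE, OPEN_TEST } OpenMode;

/* Current opening status of one I/O device. */
typedef struct {
    int exclusive;   /* 1 if opened in Exclusive mode */
    int shared;      /* number of Shared opens */
    int test;        /* 1 if opened in Test mode */
    int readers;     /* number of Read opens */
} DevOpenState;

/* Request access. Returns 0 if granted, -1 if refused. */
int devOpen(DevOpenState *s, OpenMode mode)
{
    switch (mode) {
    case OPEN_EXCLUSIVE:
        if (s->exclusive || s->shared > 0)
            return -1;        /* already opened Exclusive or Shared */
        s->exclusive = 1;
        return 0;
    case OPEN_SHARED:
        if (s->exclusive)
            return -1;        /* already opened Exclusive */
        s->shared++;
        return 0;
    case OPEN_TEST:
        if (s->test)
            return -1;        /* Test access is granted to one process only */
        s->test = 1;
        return 0;
    case OPEN_READ:
        s->readers++;         /* read access is never restricted */
        return 0;
    }
    return -1;
}

/* Release an access previously granted with devOpen. */
void devClose(DevOpenState *s, OpenMode mode)
{
    switch (mode) {
    case OPEN_EXCLUSIVE: s->exclusive = 0; break;
    case OPEN_SHARED:    if (s->shared > 0) s->shared--; break;
    case OPEN_TEST:      s->test = 0; break;
    case OPEN_READ:      if (s->readers > 0) s->readers--; break;
    }
}

/* Forced close: drop all write accesses to recover from a hanging
 * writer. Only to be used to resolve such a deadlock. */
void devForceClose(DevOpenState *s)
{
    s->exclusive = 0;
    s->shared = 0;
    s->test = 0;
}
```

Note that a Test open succeeds even while the device is held in Shared or Exclusive mode, which is what allows maintenance software to work alongside the normal application.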

3.2 Failure Mode Operation

This section describes how the software shall react to errors and abnormal situations. It also defines maintenance software, which can be used to reduce the probability of errors.

3.2.1 Error/Failure Modes

The basis of error handling is that all error conditions are checked and that all errors and abnormal events are reported to the error handling node.

Errors and abnormal events are subdivided into 3 categories:

• Fatal
No recovery is possible. The application shall shut down all hardware activities in a controlled way and abort all software activities. The hardware shall also be able to shut down the system in case of CPU failure. Example: CPU power supply failure, memory error etc.

• Serious
The system cannot perform what it was supposed to do, but there is no need to shut down. Devices causing or affected by the error shall be stopped.
Attempts to recover can be performed from the controlling node either automatically by software or manually.
Example: a device goes on hardware limit, moving parts are close to collision, temperature is too high for operation etc.

• Warning
An anomalous situation is reported, but the system can continue to work.
Example: temperature is high, but not too high to continue operation.

For all applications all errors and abnormal events shall be grouped in one of the above three categories. See also next section Error/Failure Reporting.

The following table gives some examples of different error categories:

Error type                  Category   Examples

General System Errors       Fatal      System crash, network failure, Rtap failures.
Communication Errors        Fatal      Subsystem processes not available or crashed. Message system failure.
Database Errors             Fatal      Database inaccessible.
Access Errors               Serious    Missing files, unset environment, invalid path, privileges.
System Management Errors    Serious    Disk space full, e.g. due to logging data.
User Errors                 Warning    Invalid commands, invalid parameters, illegal state, etc.

3.2.2 Error/Failure Reporting

1. Errors shall be reported and logged according to what is defined in the CCS/LCC User Manual, [1] and [2], section Error System.

2. Each software module shall supply an Error Definition File containing the error message corresponding to each error. The format of this file is defined in the LCC User Manual [2]. The error messages shall give a clear, unambiguous and unique description of the error. The error message shall also be detailed enough to make it clear what caused the error.

3. Alarm handling. Alarms are triggered through the database. For alarms which are not directly assigned to signals or database attributes reaching limits or illegal values, special attributes have to be defined in the database to handle them. The alarm shall be configured to trigger when the attribute changes to a specific value or exceeds a certain limit. The alarm system is described in the CCS User Manual, [1].

Alarms from an LCU shall be handled differently during the engineering phase and when the LCU subsystem is integrated into the VLT software:

• During the engineering phase, i.e. before the LCU subsystem is integrated with the CCS software, [1], the database alarm attributes shall be configured to give abnormal events. Abnormal events will be sent to all processes including the engineering user interface (lccei) requesting them with the function evtAttachAlarm or the command EVTATTA.
There shall be an Alarm Definition File defining all abnormal events. The format of this file is defined in the LCC User Manual [2].

• When the LCU subsystem has been integrated with the CCS software, [1], the alarms will be handled by the CCS On-Line Database. To enable this, the scan system of the WS has to be configured to copy all database alarm attributes to the CCS On-Line Database, see the CCS User Manual [1]. Delivered with the subsystem shall also be a list with all possible signals and database parameters to be reported to the workstation and a list of possible alarms from the LCU.
Abnormal events can still be used locally on the LCU to inform local processes of abnormal events.

4. It shall be possible to trigger any error or alarm message from any application through software, to give the possibility to simulate any error or alarm condition interactively.

5. Each application must be designed in such a way that it avoids flooding the system with alarms, error messages and logs.

The CCS User Manual [1] defines the general error and alarm handling of the VLT.
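One way to meet requirement 5 is to suppress repeats of the same report within a hold-off window and to count what was dropped. The following is a minimal sketch only: the class `LogThrottle`, its methods, the message key and the 10 s hold-off are hypothetical and are not part of the CCS or LCC interfaces.

```python
class LogThrottle:
    """Suppress repeats of the same message inside a hold-off window (hypothetical helper)."""

    def __init__(self, holdoff_s: float = 10.0):
        self.holdoff_s = holdoff_s
        self._last_emit = {}    # message key -> time of the last emitted report
        self._suppressed = {}   # message key -> repeats dropped since that report

    def should_emit(self, key: str, now: float) -> bool:
        """Return True if the report identified by `key` may be sent at time `now`."""
        last = self._last_emit.get(key)
        if last is not None and now - last < self.holdoff_s:
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return False
        self._last_emit[key] = now
        return True

    def take_suppressed(self, key: str) -> int:
        """Return and reset the number of repeats dropped for `key`."""
        return self._suppressed.pop(key, 0)


# Example: the same error raised at t = 0, 1, 2 and 11 seconds.
throttle = LogThrottle(holdoff_s=10.0)
key = "motor1: encoder error"   # assumed message key, for illustration only
results = [throttle.should_emit(key, t) for t in (0.0, 1.0, 2.0, 11.0)]
dropped = throttle.take_suppressed(key)
```

Reporting the dropped count together with the next emitted message keeps the reliability information without flooding the system.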

3.2.3 Recovery

1. The system shall try to recover from failures. Retries shall be done when appropriate on failing devices. If it is not possible to recover, it shall shut down the failing device, parts of the subsystem, or the whole subsystem. The recovery shall take place before the device changes state due to the failure. Once the state has changed, there is no recovery possible at LCU level, because a device or LCU can never change its state to a higher state autonomously (higher meaning going from e.g. LOADED to STAND-BY or ON-LINE).

2. Retries must also be included in the local command handling, so that recovery from the most common time-out and no-answer situations is attempted before an error is reported.

3. Retries shall be an exception in a system under normal operation and be limited to abnormal situations caused by failing hardware.

4. The warnings produced at every trial shall be logged, to be able to measure reliability.
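The retry rules above can be sketched as follows. This is an illustration under stated assumptions, not the LCC implementation: the function `call_with_retries`, the logger name and the use of `TimeoutError` to represent a time-out or no-answer situation are all hypothetical, and real code would log through the VLT error and logging system.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("lcu.retry")   # hypothetical logger name


def call_with_retries(action: Callable[[], object], max_tries: int = 3, delay_s: float = 0.0):
    """Run `action`, retrying on TimeoutError and logging a warning for every failed trial."""
    for attempt in range(1, max_tries + 1):
        try:
            return action()
        except TimeoutError as exc:
            # Every failed trial is logged so that reliability can be measured later.
            log.warning("trial %d of %d failed: %s", attempt, max_tries, exc)
            if attempt == max_tries:
                raise   # recovery failed: the caller reports the error and shuts the device down
            time.sleep(delay_s)


# Example: a device that answers only on the third attempt.
calls = {"n": 0}

def flaky_command():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("no answer")
    return "OK"

result = call_with_retries(flaky_command, max_tries=3)
```

The number of retries is kept small and fixed, since retries are meant to be an exception limited to failing hardware.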

3.2.4 Performance

Failure mode performance has to be defined for every system in the corresponding functional specification.

1. Normally no operation is possible after a Fatal error, apart from shut-down and restart.

2. In case of Serious errors, various degraded operation modes might be possible after an unsuccessful recovery attempt.

3. Normally there shall be no Warnings during normal operation. A high number of Warnings when no hardware failure is involved can only indicate a mismatch or marginal behaviour of the software, and the need to tune it (e.g. by adjusting hardware-related timeouts, number of retries etc.).

4. Failing units should not affect working parts of the system, for example through errors or abnormal events that keep being reported and flood the system.

3.3 Safety

Many LCU subsystems will be connected to one or more hardware stops, see List of Service Connection Points [17] for details. See also the Electronic Design Specification [10] and the Service Connection Point Technical Specification [16] for more information. For the VLT the following hardware stops are defined:

• Emergency stop: a total stop and power-off of everything (including LCUs, LAN etc.) on and within ONE telescope enclosure. `Emergency stop' is activated via `protected' (under a glass cover) push-buttons.
No action is possible by the LCU software (the LCU is switched off).

• Motion stop: will stop a number of predefined axes, e.g. altitude, azimuth, dome rotation, rotators etc., by hardware; this can be done e.g. by switching off power amplifiers and putting an axis on brake. All `Motion stop' buttons in a given area will have identical action. An `area' in this sense is a physically separated working area, such as, for example, the telescope area (including the enclosure), the Coudé lab, the interferometry lab etc.

This system can be used during installation and maintenance to prevent accidents. `Motion stop' is activated via push-buttons (of different colour and shape than the `Emergency stop' buttons) and/or via key switches.

The LCU controlling a device connected to the `Motion stop' shall be informed of this stop via a status signal. When a `Motion stop' is activated, the LCU application software shall shut down this device. This is to prevent the device from moving when the `Motion stop' is deactivated, and before the subsystem has been restarted.

• Local disable switch: disables the motion of a specific unit, e.g. the altitude motion. This can be implemented, for example, by switching off a power amplifier, or by disconnecting the unit from its power supply, or in some other way (like `Motion stop', but selective). The switch can only be manually operated! If there is a need for computer controlled power switching, there has to be an additional switch to accommodate this.

An LCU controlling a device connected to a `Local disable switch' shall be informed of this stop via a status signal. When a `Local disable switch' is activated, the LCU application software shall shut down this device. This is to prevent the device from moving when the `Local disable switch' is deactivated, and before the subsystem has been restarted.

• Local power switch: switches off power to a unit, i.e. a rack, an LCU or a piece of equipment. Each such unit must have its own Local power switch.

The action of the LCU software on a `Local power switch off' depends very much on which equipment was actually switched off and has to be specified from case to case.

• Interlock: an automatic and immediate stop of an individual axis in case of a dangerous situation, or to avoid a dangerous situation. The signal triggering the interlock action is a hardware signal (e.g. a limit switch, a temperature sensor, a pressure sensor etc.), and the action is completely performed by hardware. The action is a very basic and `crude' action. There are two groups of interlocks:

• Interlocks local to a subsystem; e.g. azimuth axis stops if oil pressure to bearings is too low.

• Interlocks between subsystems; e.g. altitude axis stops if M1 support oil pressure is too low.

Some interlocks might need an override possibility (to allow for getting out of the dangerous situation). This is normally implemented by use of a spring loaded key switch.

An LCU controlling a device connected to an `Interlock' shall be informed of this stop via a status signal. When an `Interlock' is activated, the LCU application software shall shut down this device. This is to prevent the device from moving when the `Interlock' is deactivated, and before the subsystem has been restarted.

For any LCU subsystem, the following principles shall apply:

1. For each subsystem there shall be a description of all safety events and the corresponding actions to be taken by hardware and/or software.

2. Any subsystem software has to take the equipment of the subsystem to a safe state in any error, emergency or dangerous situation, without relying on other subsystems or computers.

3. All subsystems have to define:

• their own, internal interlocks

• interlock signals that are needed in other subsystems

• interlock signals needed from other subsystems.

4. The subsystem software shall continuously monitor the state of the equipment it is controlling (via input signals and software parameters) and if the equipment comes into a state which can endanger people or equipment, it shall immediately stop the equipment. Control outputs shall be validated so that they are not out of range. The monitoring of signals and parameters shall be done by using the event monitoring and abnormal event handling software described in the LCC User Manual [2].

5. The LCU software shall always be able to accept a stop or abort command and execute it, stopping the ongoing activity within 400 ms.

6. The LCUs shall report all stops and interlocks as abnormal events, see the LCC User Manual [2].
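Requirement 5 implies that no activity may block for longer than the 400 ms deadline without checking for a stop request. The sketch below shows one possible pattern, in which a long activity is executed in short slices. The 20 ms slice period and the names `run_motion` and `stop_requested` are assumptions made for illustration; they do not come from the LCC software.

```python
import threading
import time

STOP_DEADLINE_S = 0.400   # deadline taken from requirement 5
SLICE_S = 0.020           # assumed control-cycle period, well below the deadline


def run_motion(stop_requested: threading.Event, total_s: float) -> str:
    """Execute a long activity in short slices so that a stop is seen within one slice."""
    end_time = time.monotonic() + total_s
    while time.monotonic() < end_time:
        # wait() returns True as soon as the stop flag is set, otherwise after SLICE_S.
        if stop_requested.wait(timeout=SLICE_S):
            return "STOPPED"   # real code would bring the hardware to a safe halt here
        # one slice of the activity (e.g. one setpoint update) would run here
    return "DONE"


# Example: request a stop 50 ms after a 5 s activity has started.
stop = threading.Event()
outcome = {}

def worker():
    outcome["state"] = run_motion(stop, total_s=5.0)
    outcome["t_end"] = time.monotonic()

thread = threading.Thread(target=worker)
thread.start()
time.sleep(0.05)
t_stop = time.monotonic()
stop.set()
thread.join()
latency_s = outcome["t_end"] - t_stop   # time from stop request to end of activity
```

The measured latency covers only the software reaction; the time the hardware needs to come to rest has to be budgeted separately within the same deadline.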

3.4 Maintenance

The VLT Maintenance Concept is defined in VLT Maintenance Concept [11]. The LCU software shall follow this concept and additionally the following shall apply:

1. Preventive maintenance has to be carried out periodically for each subsystem. Preventive maintenance shall be supported by maintenance procedures and maintenance software.

2. Corrective maintenance shall be supported by maintenance procedures and software. The aim of this software is to quickly locate failing LRUs and to isolate the fault. See also section 3.4.1. Corrective maintenance shall also be supported by Troubleshooting software.
The maintenance support software shall be available for different kinds of users:

a. simple go/no go tests for operations staff

b. more in-depth test procedures for maintenance staff performed from the control room

c. for failures more difficult to locate it will be necessary to disconnect part of the system, to apply test adaptors or other test equipment and to interact locally on the system tested.

The tests might involve movements of the equipment and will require restarting of the system.

The tests shall be performed in maintenance mode (stand-alone state).

Actions requiring special care or lengthy initialisation procedures after the test has been completed shall only be available to authorized staff.

3. To support maintenance, all signals and local database parameters shall be accessible locally and from other nodes. This does not only include signals directly connected to the LCU. It must also be possible to access signals connected to systems below the LCU (e.g. on a field-bus or a programmable logic controller). All moving and adjustable parts have to be individually movable and adjustable through commands from another node. Similarly it must be possible to adjust configuration parameters for motors and to access the status of all movable parts (e.g. encoder position, temperature, current etc.).

4. ESO provides utilities supporting maintenance. Such utilities are the logging system, sampling software and engineering test tools for I/O boards and motors etc.

3.4.1 Self-checking Test Programs and Diagnostic Routines on the LCU

To facilitate test and maintenance at the LCU level, test and maintenance software shall be delivered. Three levels of test and maintenance software shall be supplied:

1. Monitor level.
LCC shall be configured to monitor all input signals in background, updating the database. When a signal is configured for events, either using the scan system or directly attaching an event to the signal, LCC automatically starts scanning the signal and updates the database image of the signal. The sampling period can be configured through the LCC function evtSetSampleRate.
Any process variables that cannot be monitored through this mechanism shall be monitored by a dedicated background task that continually carries out checks such as power supply levels, temperatures, pressures, hardware status signals (limit switches, power amplifier errors, encoder errors, emergency switches etc.), levels of analog signals, correct termination of executed commands (time-out) etc. An error report shall be sent to the control node on any detected irregularity.
Monitoring processes shall be supervised by a watch-dog or obituaries, so that the system is informed when the process dies.

2. Self-test level.
Each application process on the LCU accepting commands, e.g. each software device, shall accept a self-test command. It will be sent to the software device at start-up and on request, see the LCC User Manual [2]. The self-test shall be available at different levels:

a. checking hardware configuration and initialization procedures; it shall check that the correct configuration of boards with the correct setup is present. Mechanical devices shall NOT be moved. This test is always performed at boot time.

b. exercise mechanical equipment, electronic modules and assemblies used

c. perform a complete test of the electronic modules or mechanical unit (LRU) used. The purpose of this test is to check if the module/unit is working correctly. The test shall only cover one module/unit at a time.
It shall be a low level test, using as much as possible existing drivers, but if necessary also directly access registers on process interface boards. The test shall be automatic, giving the result OK or indicating a failure. The module/unit shall be tested without disconnecting any cables and without connecting any test equipment. This will make it possible to run the test remotely, e.g. from the VLT control room or from Europe.

The test at start-up or boot of the LCU shall be on level (a). It shall not move any mechanical functions. LCC performs this self-test on all devices registered in the deviceFile, i.e. it sends the command SELFTST to the devices.

The self-test will not restore the initial status after terminating.

3. Simulation level.
Each application process on the LCU accepting commands, e.g. each software device, shall accept the command enable/disable simulation. In simulation mode it shall simulate all hardware controlled by the device. The software shall respond to commands as in normal operation but no hardware shall be activated. Hardware status shall return default values set up at the configuration of each signal. Applications in simulation mode shall return reasonable values reflecting the state of the devices they are controlling and the commands previously received. Commands which normally take some time to execute shall preferably return after some reasonable time. The proper timing is not guaranteed in simulation.

When a software device or the whole LCU returns to normal operation, the software and hardware of that device or LCU will have to be reinitialized. The software device shall go to the state LOADED when exiting from simulation mode. The software device shall not be allowed to enter simulation mode if any hardware device is active, e.g. a motor is moving, a detector is integrating etc. The software device shall also enter/exit simulation mode when the LCU enters/exits simulation mode, see the LCC User Manual [2].
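The simulation rules above can be illustrated with a small state sketch. The class `SoftwareDevice` and its methods are hypothetical and only mirror the rules stated in the text; the actual command interface is defined in the LCC User Manual [2].

```python
class SoftwareDevice:
    """Hypothetical software device illustrating the simulation rules of this section."""

    def __init__(self):
        self.state = "ON-LINE"         # state names as used in this document
        self.simulation = False
        self.hardware_active = False   # e.g. a motor is moving or a detector is integrating
        self.position = 0.0            # last commanded position

    def enable_simulation(self) -> bool:
        """Enter simulation mode; refused while any hardware is active."""
        if self.hardware_active:
            return False
        self.simulation = True
        return True

    def disable_simulation(self) -> None:
        """Leave simulation mode; the device must then be reinitialized from LOADED."""
        if self.simulation:
            self.simulation = False
            self.state = "LOADED"

    def move(self, position: float) -> str:
        """Respond as in normal operation, but touch no hardware when simulating."""
        self.position = position
        if not self.simulation:
            self.hardware_active = True   # real code would drive the motor here
        return "OK"


# Example: simulation is refused during a motion and accepted once the motion has ended.
dev = SoftwareDevice()
dev.move(5.0)
refused = not dev.enable_simulation()
dev.hardware_active = False              # motion finished
accepted = dev.enable_simulation()
reply = dev.move(10.0)                   # simulated: only the commanded value is stored
dev.disable_simulation()
```

Status queries in simulation mode can then be answered from the stored command values, which gives replies consistent with the commands previously received.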

3.4.2 Control Loop Tuning

A set of functions shall be available to assist in the engineering job of tuning a cascaded control loop, for each axis where a combined hardware/software solution of torque/velocity and position is implemented. The following are examples of such functions:

a. make a specified position step

b. move axis with specified, constant velocity

c. move axis with specified, constant acceleration

d. make sinusoidal movement, with specified parameters.
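The four example functions correspond to simple position setpoint profiles. The sketch below generates them as functions of time; all function and parameter names are assumptions made for illustration, and units are left to the application (e.g. degrees and seconds).

```python
import math


def step(t: float, amplitude: float, t_step: float = 0.0) -> float:
    """Position step of `amplitude`, applied at time `t_step`."""
    return amplitude if t >= t_step else 0.0


def constant_velocity(t: float, velocity: float, p0: float = 0.0) -> float:
    """Position of an axis moving with constant velocity from start position `p0`."""
    return p0 + velocity * t


def constant_acceleration(t: float, acceleration: float, p0: float = 0.0, v0: float = 0.0) -> float:
    """Position of an axis under constant acceleration, starting at `p0` with velocity `v0`."""
    return p0 + v0 * t + 0.5 * acceleration * t * t


def sinusoid(t: float, amplitude: float, frequency_hz: float, offset: float = 0.0) -> float:
    """Sinusoidal movement around `offset` with the given amplitude and frequency."""
    return offset + amplitude * math.sin(2.0 * math.pi * frequency_hz * t)


# Example: setpoints for a constant-velocity move, sampled every 10 ms.
setpoints = [constant_velocity(k * 0.01, velocity=1.5) for k in range(5)]
```

Recording the axis response to such setpoints, for example with the sampling software mentioned in section 3.4, gives the data needed to adjust the loop parameters.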

Error class                 Severity   Examples
General System Errors       Fatal      System crash, network failure, Rtap failures.
Communication Errors        Fatal      Subsystem processes not available or crashed. Message system failure.
Database Errors             Fatal      Database inaccessible.
Access Errors               Serious    Missing files, unset environment, invalid path, privileges.
System Management Errors    Serious    Disk space full, e.g. logging data.
User Errors                 Warning    Invalid commands, invalid parameters, illegal state, etc.