Telemetry Republisher

Overview

The Telemetry Republisher RTC Component reads telemetry data in MUDPI format originating in the HRTC and forwards (republishes) it via reliable DDS multicast to one or more SRTC nodes.

Prerequisites

Telemetry data is published via FastDDS, which is provided by the ELT Development Environment. FastDDS QoS profiles have to be provided in telemDataPathDdsQos.xml, installed in $INTROOT/resource/config/rtctk/dds/ or in any other location under rtctk/dds/ in $CFGPATH. A different file name can be set in the component’s configuration (see the Configuration section below for details).

Note

The FASTRTPS_DEFAULT_PROFILES_FILE environment variable shall not be set. Setting it is particularly problematic if the file that the variable points to contains the same QoS profiles.
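The requirement above can be verified before the component is started. The following is a minimal sketch, not part of the toolkit; the function name profiles_env_is_set is hypothetical, and only the variable name is taken from the text above.

```python
import os

ENV_VAR = "FASTRTPS_DEFAULT_PROFILES_FILE"


def profiles_env_is_set(environ=None):
    """Return True if FASTRTPS_DEFAULT_PROFILES_FILE is present in the environment.

    `environ` defaults to the environment of the current process; a plain dict
    can be passed instead, which makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    return ENV_VAR in env


# Example: warn before launching the component from a wrapper script.
if profiles_env_is_set():
    print(f"{ENV_VAR} is set; unset it before starting the Telemetry Republisher")
else:
    print(f"{ENV_VAR} is not set")
```

Such a check is typically run in the same shell session (or wrapper script) that later starts the component, because the variable is inherited from that environment.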

The Telemetry Republisher works optimally for MUDPI traffic with an Ethernet MTU of 9000, but it also works with other sizes. It is therefore recommended to first exercise the Telemetry Republisher with the default MTU (as set by the OS), and only afterwards change the MTU size for the specific MUDPI traffic.

Customisation

The Telemetry Republisher component does not require any compile-time customization. It runs out of the box and only needs to be configured; please refer to the Configuration section for more details.

Running

The Telemetry Republisher can be started (after deployment) by invoking the command:

$ rtctkTelRepub tel_repub_1 file:$INTROOT/run/exampleTelRepub/service_disc.yaml

Mandatory command line arguments are the component instance name (first) and the service discovery endpoint (second). Service discovery information is retrieved from the specified service registry located in $INTROOT/run/exampleTelRepub/service_disc.yaml.

The component can be stopped either by sending the Exit command or by pressing Ctrl-C.

Commands

To initialise the Telemetry Republisher component, send Init using the Client application:

$ rtctkClient tel_repub_1 Init \
              -s file:$INTROOT/run/exampleTelRepub/service_disc.yaml

As a result, the component reads its configuration from the Run-time repository and creates the entities for listening to MUDPI traffic on a UDP socket and the entities for publishing the DDS agnostic topic. At this stage reading and publishing are not yet started, but subscriptions to the created DDS topic(s) can already be established.

This can be seen in the logs. E.g. the following log states that the “connection” to a particular DDS topic (TestTopic00), i.e. the corresponding DDS subscriber, has come up:

[10:26:06:580][INFO ][tel_repub_1] on_publication_match (TestTopic00)

When a DDS subscriber “disconnects” from a particular topic (TestTopic00), the Telemetry Republisher named tel_repub_1 logs a message of this kind:

[10:26:06:580][INFO ][tel_repub_1] on_publication_match (TestTopic00)

The Enable and Disable commands are used to transition between the states On:NotOperational:Ready and On:Operational:Idle; the Run command can be applied in On:Operational:Idle.

When the Run command is sent, the republisher starts listening to the MUDPI traffic on the sockets and publishes aggregated topics via DDS. The activity can be checked by looking at the Telemetry Republisher metrics (statistics), see Online Database (OLDB) Data Points. Similar information is also available in periodic DEBUG messages, for which the component log level needs to be set to DEBUG. By default such a log message is produced every 60s; it contains the number of received MUDPI frames, the number of frames per sample, the receiving rate, the estimated loop (sample) frequency, and the number of frames and samples skipped so far.

E.g. to set the log level to DEBUG:

$ rtctkClient tel_repub_1 SetLogLevel DEBUG \
              -s file:$INTROOT/run/exampleTelRepub/service_disc.yaml

E.g. DEBUG logs for the MUDPI topics with TopicId 3 and 2, each with 6 frames per sample:

[09:54:28:743][DEBUG][tel_repub_1] [3] MUDPI processor received 13434 frames (6 frames/sample) 990.58/s, estimated loop freq 165.10, errors (0 : 1)
[09:54:29:457][DEBUG][tel_repub_1] [2] MUDPI processor received 15174 frames (6 frames/sample) 971.40/s, estimated loop freq 161.90, errors (0 : 1)
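When many such lines have to be checked, the statistics can be extracted programmatically. The sketch below assumes that the DEBUG line always has the layout shown in the two examples above; the function name parse_stats_line is hypothetical, and interpreting the two numbers in "errors (a : b)" as frame and sample errors is an assumption based on the description of the message content.

```python
import re

NUM = r"\d+(?:\.\d+)?"  # integer or decimal number, without a trailing dot

# Layout assumed from the example log lines in this section.
STATS_RE = re.compile(
    r"\[(?P<topic_id>\d+)\] MUDPI processor received (?P<frames>\d+) frames "
    r"\((?P<frames_per_sample>\d+) frames/sample\) "
    rf"(?P<rate>{NUM})/s, estimated loop freq (?P<loop_freq>{NUM}), "
    r"errors \((?P<frame_errors>\d+) : (?P<sample_errors>\d+)\)"
)


def parse_stats_line(line):
    """Return the statistics of one DEBUG line as a dict, or None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "topic_id": int(fields["topic_id"]),
        "frames": int(fields["frames"]),
        "frames_per_sample": int(fields["frames_per_sample"]),
        "rate": float(fields["rate"]),            # frames per second
        "loop_freq": float(fields["loop_freq"]),  # samples per second
        "frame_errors": int(fields["frame_errors"]),    # assumed meaning
        "sample_errors": int(fields["sample_errors"]),  # assumed meaning
    }


line = ("[09:54:28:743][DEBUG][tel_repub_1] [3] MUDPI processor received 13434 frames "
        "(6 frames/sample) 990.58/s, estimated loop freq 165.10, errors (0 : 1)")
stats = parse_stats_line(line)
# In both examples the frame rate divided by frames/sample is close to the loop frequency.
print(stats["rate"] / stats["frames_per_sample"], stats["loop_freq"])
```

In the two example lines the receiving rate divided by the number of frames per sample agrees with the estimated loop frequency to two decimals, which gives a quick plausibility check of the reported values.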

Listening and publishing are stopped by sending the Idle command.

The Exit command stops the component, i.e. the process exits. At this stage some diagnostic messages are logged, such as the highest occupancy of the internal queue/buffer as an absolute number and as a percentage.

E.g. for topic TestTopic00 the maximum occupancy was just 10 slots, which is 0.033% of the whole queue size. A high percentage indicates a possible performance problem on the DDS publishing side of the Telemetry Republisher.

[13:06:36:405][DEBUG][tel_repub_1] [TestTopic00] Max TelRepub buffer occupancy 10 (0.03333 %).

Similar diagnostic log messages are produced for the DDS publishers.

E.g. the following log message means that the DDS publisher side has not detected any sample drop:

[13:06:36:405][INFO ][tel_repub_1] [TestTopic00] Received: 0 samples. No samples skipped

E.g. the following log message means that 35 samples out of 9345 were dropped. This indicates a likely problem with DDS publishing, a slow DDS subscriber, …

[17:12:45:745][INFO ][tel_repub_1] [TestTopic00] skipped: 35 samples out of 9345 ratio: 0.00374531835206. Last @: 7340
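The exit diagnostics can be cross-checked with a few lines of code. The sketch below assumes the two message layouts shown above (buffer occupancy and skipped samples); the function names are hypothetical, and the queue size derived from the percentage is only approximate because the logged percentage is rounded.

```python
import re

# Layouts assumed from the example log lines in this section.
OCCUPANCY_RE = re.compile(
    r"\[(?P<topic>[^\]]+)\] Max TelRepub buffer occupancy "
    r"(?P<slots>\d+) \((?P<percent>\d+(?:\.\d+)?) %\)"
)
SKIPPED_RE = re.compile(
    r"\[(?P<topic>[^\]]+)\] skipped: (?P<skipped>\d+) samples "
    r"out of (?P<total>\d+) ratio: (?P<ratio>\d+(?:\.\d+)?)"
)


def parse_occupancy(line):
    """Return topic, max occupied slots, percentage and the implied queue size."""
    match = OCCUPANCY_RE.search(line)
    if match is None:
        return None
    slots = int(match["slots"])
    percent = float(match["percent"])
    # percent = 100 * slots / queue_size, so queue_size = 100 * slots / percent
    implied = 100.0 * slots / percent if percent > 0 else None
    return {"topic": match["topic"], "slots": slots,
            "percent": percent, "implied_queue_size": implied}


def parse_skipped(line):
    """Return topic, skipped and total sample counts, and the logged ratio."""
    match = SKIPPED_RE.search(line)
    if match is None:
        return None
    return {"topic": match["topic"], "skipped": int(match["skipped"]),
            "total": int(match["total"]), "ratio": float(match["ratio"])}


occ = parse_occupancy("[13:06:36:405][DEBUG][tel_repub_1] [TestTopic00] "
                      "Max TelRepub buffer occupancy 10 (0.03333 %).")
skp = parse_skipped("[17:12:45:745][INFO ][tel_repub_1] [TestTopic00] "
                    "skipped: 35 samples out of 9345 ratio: 0.00374531835206. Last @: 7340")
print(occ["implied_queue_size"])            # roughly 30000 slots
print(skp["skipped"] / skp["total"], skp["ratio"])
```

For the examples above the implied queue size is about 30000 slots, and the logged ratio equals the number of skipped samples divided by the total.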

As all the configuration for the Telemetry Republisher is static, the Update command has no effect.

Configuration

The configuration of the Telemetry Republisher component is stored, as for other components, in a file in YAML format. The configuration file name has to correspond to the name of the component instance. The configuration contains only a static part, meaning that configuration changes cannot be applied at run time by invoking the Update command. If the configuration is changed, the component has to be restarted or reinitialised, i.e. Init has to be called.

The configuration can be divided into three groups:

  • common configuration

  • receivers configuration

  • DDS topics configuration

Common Configuration

The common configuration part specifies the DDS QoS Profile used to set the QoS of DDS entities such as the DDS participant and DDS publisher, and the network interfaces allowed for DDS.

Configuration Path | Type | Description
dds_qos_profile | RtcString | (optional) DDS QoS Profile to be used for setting QoS of DDS entities like DDS participant and DDS publisher. The specified QoS Profile needs to be contained in the telemDataPathDdsQos.xml file, or in the file specified in configuration (dds_qos_file). If an empty string is given then the default is used (RtcTk_Default_Profile). This string should correspond to the profile_name XML tag attribute of the desired elements that should be used from the QoS XML file.
dds_qos_file | RtcString | (optional) Name of DDS QoS XML file. The file should be found in $CFGPATH under rtctk/dds/. If not specified the default name is used: telemDataPathDdsQos.xml.
dds_interface_white_list | RtcVectorString | (optional) List of network interfaces to be used by DDS. The interfaces are for the local machine where the component is running. If given, this list will replace any settings under the <interfaceWhiteList> XML tag for the transport descriptors found in the DDS QoS Profile (dds_qos_file).

An example of a common configuration block:

static:
    dds_qos_profile:
        type: RtcString
        value: RtcTk_Default_Profile

    dds_qos_file:
        type: RtcString
        value: telemDataPathDdsQos.xml

    dds_interface_white_list:
        type: RtcVectorString
        value:
            - 127.0.0.1
            - 192.168.5.44
            - lo
    ...

Receivers Configuration

The Telemetry Republisher can listen on multiple receivers, which are specified in the mudpi_receivers section. Each receiver is specified as rcv_NN, where NN is a two-digit index running from 00 to the number of receivers minus 1. E.g. for two receivers we have rcv_00 and rcv_01.

Note

rcv_1 will not work because its index has just one digit; the correct form is rcv_01. The index has to start at 00 (not 01), and there should be no gaps in the numbering. E.g. rcv_00, rcv_01, rcv_09 will configure just two receivers.
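The naming rule can be illustrated with a small helper that lists which entries would be picked up. This is a sketch under the assumption that entries are looked up sequentially (prefix_00, prefix_01, ...) until the first missing index, which matches the behaviour described in the note; the function name effective_entries is hypothetical. The same rule applies to topic_NN entries.

```python
def effective_entries(keys, prefix):
    """Return the entries that form a gap-free sequence prefix_00, prefix_01, ...

    Entries after the first missing index, and entries whose index is not
    written with exactly two digits, are not included.
    """
    present = set(keys)
    found = []
    for index in range(100):  # two-digit indices: 00 .. 99
        name = f"{prefix}_{index:02d}"
        if name not in present:
            break
        found.append(name)
    return found


print(effective_entries(["rcv_00", "rcv_01", "rcv_09"], "rcv"))
print(effective_entries(["rcv_1"], "rcv"))
```

The first call returns only rcv_00 and rcv_01 because of the gap before rcv_09; the second returns an empty list because rcv_1 is not a two-digit index and rcv_00 is missing.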

For each receiver the following needs to be specified:

Configuration Path | Type | Description
ip | RtcString | IP address to listen on. One address corresponds to one receiver, and a receiver can only listen on one NIC. Important: it must be an IP address and not, for example, a host name.
port | RtcInt32 | Port to listen on.
thread_policies | NUMA Policies | Defines optional NUMA policies for the UDP receiver thread.

Example configuration for two receivers:

static:
    # ...
    mudpi_receivers:
        rcv_00:
            ip:
                type: RtcString
                value: 127.0.0.1
            port:
                type: RtcInt32
                value: 6000
            thread_policies:
                cpu_affinity:
                    type: RtcString
                    value: "1"
        rcv_01:
            ip:
                type: RtcString
                value: 127.0.0.1
            port:
                type: RtcInt32
                value: 6500
    # ...

DDS Topic Configuration

As for receivers, the Telemetry Republisher can operate on many (DDS) topics. DDS topics are configured in the topics section.

Each DDS topic is specified as topic_NN, where NN is a two-digit index running from 00 to the number of topics minus 1. E.g. for two topics we have topic_00 and topic_01.

Note

topic_1 will not work because its index has just one digit; the correct form is topic_01. The index has to start at 00 (not 01), and there should be no gaps in the numbering. E.g. topic_00, topic_01, topic_09 will configure just two topics.

For each topic the following needs to be specified:

Configuration Path | Type | Description
name | RtcString | Topic name. Important: names should be unique per Telemetry Republisher.
mudpi_topic | RtcInt32 | MUDPI topic id that this DDS topic maps to. Important: each MUDPI topic id needs a corresponding DDS topic, i.e. a topic whose configuration defines mudpi_topic with that specific MUDPI topic id.
rcv | RtcInt32 | Index of the receiver on which the topic specified in mudpi_topic is received. Important: the receiver with that index needs to be configured in the mudpi_receivers section.
queue_size | RtcInt32 | (optional) Size of the internal queue between the MUDPI receiver and the DDS publisher, in number of topic samples.
sim_freq | RtcInt32 | (optional) If specified and different from 0, the topic is generated with the specified frequency. As no corresponding MUDPI topic is needed in this case, mudpi_topic and rcv can be omitted. Important: the frequency should be kept reasonable so that the system does not become too busy.
thread_policies | NUMA Policies | Defines optional NUMA policies for the DDS publisher thread.

Example configuring three topics:

static:
    # ...
    topics:
        topic_00:
            name:
                type: RtcString
                value: "TestTopic00"
            mudpi_topic:
                type: RtcInt32
                value: 0
            queue_size:
                type: RtcInt32
                value: 100
            rcv:
                type: RtcInt32
                value: 0
        topic_01:
            name:
                type: RtcString
                value: "TestTopic01"
            mudpi_topic:
                type: RtcInt32
                value: 1
            rcv:
                type: RtcInt32
                value: 0
        topic_02:
            name:
                type: RtcString
                value: "TestTopic02"
            mudpi_topic:
                type: RtcInt32
                value: 1
            rcv:
                type: RtcInt32
                value: 0

The configuration can always be inspected using the Configuration Tool, e.g. to check whether a certain configuration datapoint exists.
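The constraints marked as important in the topic table (unique names, a configured receiver for each rcv index, mudpi_topic present unless the topic is simulated) can be checked before deployment. The following is a minimal sketch that operates on a plain dictionary mirroring the static section of the YAML file; the function name check_topics and the wording of the messages are hypothetical, and the component itself may perform different or additional checks.

```python
def check_topics(static):
    """Return a list of problems found in a dict mirroring the 'static' section."""
    problems = []
    receivers = static.get("mudpi_receivers", {})
    topics = static.get("topics", {})
    seen_names = set()
    for key, topic in sorted(topics.items()):
        name = topic.get("name", {}).get("value")
        if name in seen_names:
            problems.append(f"{key}: duplicate topic name {name!r}")
        seen_names.add(name)
        # A non-zero sim_freq means the topic is generated, so mudpi_topic
        # and rcv may be omitted.
        if topic.get("sim_freq", {}).get("value", 0):
            continue
        if "mudpi_topic" not in topic:
            problems.append(f"{key}: mudpi_topic is missing")
        rcv = topic.get("rcv", {}).get("value")
        if rcv is None:
            problems.append(f"{key}: rcv is missing")
        elif f"rcv_{rcv:02d}" not in receivers:
            problems.append(f"{key}: receiver index {rcv} is not configured")
    return problems


config = {
    "mudpi_receivers": {"rcv_00": {}},
    "topics": {
        "topic_00": {"name": {"value": "TestTopic00"},
                     "mudpi_topic": {"value": 0}, "rcv": {"value": 0}},
        "topic_01": {"name": {"value": "TestTopic01"},
                     "mudpi_topic": {"value": 1}, "rcv": {"value": 6}},
    },
}
for problem in check_topics(config):
    print(problem)
```

In this example topic_01 refers to receiver index 6 while only rcv_00 is configured, so one problem is reported.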

Errors

During initialization, i.e. in On:NotOperational:Initialising, several errors can occur:

If the MUDPI/UDP receiver cannot be created, an error message is logged and the component goes to the Error state.

[18:22:57:234][ERROR][tel_repub_1] Component tel_repub_1 problem to create MUDP receiver part: bind: Cannot assign requested address
[18:22:57:235][ERROR][tel_repub_1] Nested exceptions:

1. ActivityInitialising: failed
2. Component tel_repub_1 problem to create MUDP receiver part
3. bind: Cannot assign requested address

In the above case the problem is binding the UDP socket to a particular IP address.

If a topic refers to a MUDPI receiver that does not exist as its source of MUDPI topic data, we get an error:

[14:22:21:256][ERROR][tel_repub_1] Activity.Initialising: failed, exception: Component tel_repub_1 Receiver Index out of range: 6 not possible to assign MUDPI topic Id: 0
Source file: ../reusableComponents/telRepub/src/telRepubBusinessLogic.cpp
Line no.: 249
Function: CreateDdsPubs

This configuration would work only if (at least) 7 receivers, indexed from 0 to 6, were defined in the Configuration and created.

A warning message like:

[18:31:40:082][WARN ][tel_repub_1]  mlockall failed:  RUNNING WITHOUT LOCKED MEMORY

means that memory could not be locked. Locking prevents memory swapping, so without it performance can be reduced. Memory can be locked only if the Telemetry Republisher is run as root.

During publishing, i.e. in the On:Operational:Running state, the following error messages can be logged:

If a sample is lost for a particular MUDPI topic (TopicId: 3), the Telemetry Republisher reports it (and continues):

[10:08:21:297][ERROR][tel_repub_1] [3] sampleId: 374758624 (frameId: 1) expected sampleId: 374758626 skipped samples : 1

If the received sample Id is lower than expected, a warning is logged. This can happen when the source of MUDPI data is restarted (without reinitializing the Republisher). The WARN log message looks like:

[11:00:24:309][WARN ][tel_repub_1] [3] sampleId: 0 (frameId: 1)  expected sampleId 124. The received sampleId is lower than expected. This can result when the source of MUDPI data is restarted.

… and when it resynchronises we get a message like:

[10:15:36:032][WARN ][tel_repub_1] [3] sampleId: 374762680 frameId: 1 Synched again.

If there is not enough space (i.e. no free slot) in the internal queue/buffer to push the sample, we get an error message like:

[11:12:23:022][tel_repub_1] [9] ERROR: problem again to get free slot in repub buffer for *current* sampleId: 338

This can be a consequence of slow publishing of the DDS topic, which might indicate a problem with the network, the DDS QoS configuration, and/or a slow DDS subscriber.

If there is no mapping between a MUDPI topic Id and a DDS topic, a message like the following is logged:

TopicId: 1234 has no mapping to DDS topic

This message means that the topic with Id 1234 has no corresponding mapping. It might be that no DDS topic defined in the Configuration maps to 1234, i.e. none has a mudpi_topic datapoint with that value.

A timeout while sending (writing) to a topic is reported with the log:

[08:49:05:104][ERROR][tel_repub_1] [TestTopic2] SampleId: 374896008. DDS write timeout!

An overrun of the internal Telemetry Republisher queue, and thus the loss of samples for a particular topic (TestTopic1) at a particular SampleId, is reported as:

[08:49:10:178][DEBUG][tel_repub_1] [TestTopic1] SampleId: 374906361 overrun, so far skipped 1705 samples. Last @: 374906360

In both cases the cause could be slow subscribers or some other DDS problem.
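To see which topics are affected, the timeout and overrun messages can be summarised per topic. The sketch below assumes the two message layouts shown above; the function name summarise is hypothetical.

```python
import re
from collections import Counter

# Layouts assumed from the example log lines in this section.
TIMEOUT_RE = re.compile(
    r"\[(?P<topic>[^\]]+)\]\s*SampleId: (?P<sample_id>\d+)\. DDS write timeout!"
)
OVERRUN_RE = re.compile(
    r"\[(?P<topic>[^\]]+)\] SampleId: (?P<sample_id>\d+) overrun, "
    r"so far skipped (?P<skipped>\d+) samples"
)


def summarise(lines):
    """Return (timeouts per topic, latest 'so far skipped' count per topic)."""
    timeouts = Counter()
    skipped = {}
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match["topic"]] += 1
            continue
        match = OVERRUN_RE.search(line)
        if match:
            # The message reports a running total, so keep the latest value.
            skipped[match["topic"]] = int(match["skipped"])
    return timeouts, skipped


log = [
    "[08:49:05:104][ERROR][tel_repub_1] [TestTopic2] SampleId: 374896008. DDS write timeout!",
    "[08:49:10:178][DEBUG][tel_repub_1] [TestTopic1] SampleId: 374906361 overrun, "
    "so far skipped 1705 samples. Last @: 374906360",
]
timeouts, skipped = summarise(log)
print(dict(timeouts), skipped)
```

A topic that accumulates many timeouts or a quickly growing skipped count is a natural starting point for checking the corresponding subscriber and the DDS QoS settings.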

Online Database (OLDB) Data Points

Component Metrics

The Telemetry Republisher uses the Component Metrics Service to write the following metrics to the OLDB:

For each topic, the following performance counter metrics can be found under the path /<component-name>/metrics/counter/<topic-name>/:

Table 6 Topic Performance Counter Metrics

OLDB Path | Type | Description
mudpi_processor/frequency_estimate | RtcDouble | MUDPI samples receiving frequency estimate.
mudpi_processor/frames_received | RtcInt64 | MUDPI frames received.
mudpi_processor/frame_errors | RtcInt64 | MUDPI frame errors.
mudpi_processor/sample_errors | RtcInt64 | MUDPI sample errors.
dds_publisher/sent_samples | RtcInt64 | DDS samples published.

Information about Telemetry Republisher threads can be found under:

  • for DDS publisher threads - one per thread: /<component-name>/metrics/thread/dds_publishers/dpub<topic-name>/ (for simulated DDS publisher /<component-name>/metrics/thread/dds_publishers/spub<topic-name>/)

  • for UDP receiver threads: /<component-name>/metrics/thread/udp_receivers/udp_rcv<udp-receiver-idx>/

Note

For details about Data Points please refer to Component Metrics OLDB Data Points.

As the OLDB data path needs to be lowercase, topic names (<topic-name>) are converted to lower case.

Thread names (dpub<topic-name> / spub<topic-name>) are always truncated to 16 characters.
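The two naming rules can be sketched as follows. This only illustrates the lower-casing and the 16-character truncation described above; the function names are hypothetical, and how the component combines the two rules internally is not specified here.

```python
def counter_path(component, topic, metric):
    """Build the OLDB path of a topic counter metric; the topic name is lower-cased."""
    return f"/{component}/metrics/counter/{topic.lower()}/{metric}"


def publisher_thread_name(topic, simulated=False):
    """Build the DDS publisher thread name, truncated to 16 characters."""
    prefix = "spub" if simulated else "dpub"
    return (prefix + topic)[:16]


print(counter_path("tel_repub_1", "TestTopic00", "mudpi_processor/frames_received"))
print(publisher_thread_name("TestTopic00"))
print(publisher_thread_name("LoopDataTopic0007"))
```

Note that long topic names which differ only after the truncation point would end up with the same thread name, so keeping topic names short and distinct in their first characters avoids ambiguity.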

Limitations and Known Issues

The Data Wrangling mechanism is not yet implemented.

The payload size of the agnostic topic is limited to 2560000 bytes.

The performance also depends on the machine where the republisher runs.

In some cases, when a subscriber (Telemetry Subscriber or Generic DDS Subscriber) crashes or has some other problem, the Telemetry Republisher gets into trouble and generates messages like:

[14:59:14:347][ERROR][tel_repub_3] [LoopData07]SampleId: 14702879. DDS write timeout!
...
[14:59:14:568][ERROR][tel_repub_3] [7] sampleId: 14702901 frameId: 1. Again problem to get free slot in repub buffer for *current* sample.
[14:59:14:568][ERROR][tel_repub_3] Processing packet for topicId: 7, failed with error: Queue Overflow

In this case it is safest to restart the Telemetry Republisher.