SIL2LinuxMP

The SIL2LinuxMP project aims at the certification of the base components of an embedded GNU/Linux RTOS running on a single-core or multi-core industrial COTS computer board. The base components are the boot loader, the root filesystem, the Linux kernel and the C library bindings used to access the Linux kernel. With the exception of a minimal set of utilities (to inspect the system, manage files and start test procedures), user space applications are not included.

Traditionally, safety-critical systems isolate the safety-related functions into a single node or a small number of nodes that exclusively cover a minimal and simple functionality. Such nodes usually run on simple single-core processors and use a minimal software set [0]. But with the growing complexity of systems, e.g. network security requirements, complex control algorithms and even cognitive functions, many more nodes would be required than the policy of using small and simple nodes can handle. The main reason for this limitation is the steeply increasing need for inter-node communication. Thus, it may still be possible to keep the individual nodes simple - but at the cost of an excessive overall system complexity that can no longer be managed.

In our opinion, KISS [1] has always been a design principle aimed at ensuring that the requirements and design ideally fit into the minds of the developers and users; it is not necessarily about keeping the implementation simple. This is very nicely expressed by the NASA Software Safety Standard:

"This Standard does not discourage the use of software in safety-critical systems. When designed and implemented correctly, software is often the first, and sometimes the best, hazard detection and prevention mechanism in the system." [NASA NPR 8719.13B 1.2]

A further development of the past decade is that the target hardware for many safety-related systems and their software components has been single-core CPUs, which could be classified as a close-to-extinct species by mid-2014 [2]. Aggravating the issue, many traditional safe RTOSes simply do not support multi-core, and adding multi-core support often amounts to a redesign and re-implementation. Moving an OS/RTOS to a multi-core environment is a fundamental design change that will effectively invalidate most traditionally used software components rather than merely limit the benefit of re-using "proven-in-use" concepts and components.

Finally, we can observe a significant change in the dynamics of technical evolution: CPUs that would last for two decades are becoming rare, and while they do exist, using them decouples projects from mainstream technology and with that from know-how and vital experience. Keeping safety-related systems a step or two behind the hype is a matter of common sense, but letting them fall too far behind effectively puts safety-related systems in uncharted territory. In this case, "well-tested" processors and software components run in environments they were never designed for and run applications that were not envisioned at the time the OS/RTOS was designed.

One possible answer to these challenges - and maybe not the only one - is to utilize a mainstream, multi-core-ready open-source RTOS such as GNU/Linux with real-time preemption (based on PREEMPT_RT). Precisely this is the goal of the SIL2LinuxMP project.

Jailhouse Safe (JHS)

Safety-related systems are generally thought of as simple systems - ideally 8-bit microcontrollers without an operating system, utilizing boolean logic. For contemporary systems this approach is increasingly inadequate: full-featured, mixed-criticality, fully connected, high-performance systems are needed to address the challenges of contemporary safety-related systems.

Our take on functional safety is "Managed complexity that allows risk to be managed".

Keeping things simple is thus not a matter of "simple code" but a matter of approaching a problem with the right means for the given complexity of the system at hand while adhering to the KISS principle.

Contrast this with IEC 61508-3 Ed 2 7.4.2.6:

"As far as practicable the design shall keep the safety-related part of the software simple"

Our take on this "discrepancy": as much as necessary, as little as possible. A partitioning hypervisor is one building block in system decomposition and a high-level divide-and-conquer strategy. The primary problem JHS is intended to solve is providing this top-level fault-containment and logical-isolation capability to systems engineers building mixed-criticality systems on multi-core hardware.

[0] IEC 61508-3 Ed 2 7.4.2.6: "As far as practicable the design shall keep the safety-related part of the software simple."

[1] "Keep it simple, stupid" is a design principle introduced in the 1960s by the US Navy for battleships.

[2] Practically all hardware vendor announcements made in 2014 are for multi-core systems.