Runtime Verification

Runtime Verification (RV) is a lightweight (yet rigorous) method that complements classical exhaustive verification techniques (such as model checking and theorem proving) with a more practical approach for complex systems.

Instead of relying on a fine-grained model of a system (e.g., a re-implementation at instruction level), RV works by analyzing the trace of the system’s actual execution and comparing it against a formal specification of the system behavior.

The main advantage is that RV can give precise information on the runtime behavior of the monitored system, without the pitfalls of developing models that require a re-implementation of the entire system in a modeling language. Moreover, given an efficient monitoring method, it is possible to execute an online verification of a system, enabling reaction to unexpected events and avoiding, for example, the propagation of a failure on safety-critical systems.

Runtime Monitors and Reactors

A monitor is the central part of the runtime verification of a system. The monitor stands between the formal specification of the desired (or undesired) behavior and the trace of the actual system.

In Linux terms, the runtime verification monitors are encapsulated inside the RV monitor abstraction. An RV monitor includes a reference model of the system, a set of instances of the monitor (per-cpu monitor, per-task monitor, and so on), and the helper functions that glue the monitor to the system via trace, as depicted below:

Linux   +---- RV Monitor ----------------------------------+ Formal
 Realm  |                                                  |  Realm
 +-------------------+     +----------------+     +-----------------+
 |   Linux kernel    |     |     Monitor    |     |     Reference   |
 |     Tracing       |  -> |   Instance(s)  | <-  |       Model     |
 | (instrumentation) |     | (verification) |     | (specification) |
 +-------------------+     +----------------+     +-----------------+
        |                          |                       |
        |                          V                       |
        |                     +----------+                 |
        |                     | Reaction |                 |
        |                     +--+--+--+-+                 |
        |                        |  |  |                   |
        |                        |  |  +-> trace output ?  |
        +------------------------|--|----------------------+
                                 |  +----> panic ?
                                 +-------> <user-specified>
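
The three boxes in the diagram can be sketched in user-space Python. This is only an illustration of the structure, not kernel code: the two-state model, the state and event names, and the classes MonitorInstance and PerCpuMonitor are assumptions made for this example, as is the choice to restart from the initial state after a violation.

```python
from dataclasses import dataclass, field

# Reference model (specification): a deterministic automaton.
# State and event names are illustrative assumptions.
INITIAL_STATE = "preemptive"
TRANSITIONS = {
    ("preemptive", "preempt_disable"): "non_preemptive",
    ("non_preemptive", "preempt_enable"): "preemptive",
    ("non_preemptive", "sched_waking"): "non_preemptive",
}


@dataclass
class MonitorInstance:
    """Verification state for one monitored entity (here, one CPU)."""

    state: str = INITIAL_STATE
    violations: list = field(default_factory=list)

    def handle_event(self, event):
        """Advance the automaton; return False if the model forbids the event."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            self.violations.append((self.state, event))
            self.state = INITIAL_STATE  # assumption: restart from a known state
            return False
        self.state = next_state
        return True


class PerCpuMonitor:
    """One reference model shared by a set of per-CPU instances."""

    def __init__(self, nr_cpus):
        self.instances = [MonitorInstance() for _ in range(nr_cpus)]

    def handle_event(self, cpu, event):
        # Glue: this is where an instrumentation hook would feed the instance.
        return self.instances[cpu].handle_event(event)


monitor = PerCpuMonitor(nr_cpus=2)
trace = [
    (0, "preempt_disable"),
    (0, "sched_waking"),
    (1, "sched_waking"),  # not allowed while CPU 1 is in the preemptive state
    (0, "preempt_enable"),
]
results = [monitor.handle_event(cpu, event) for cpu, event in trace]
print(results)  # [True, True, False, True]
```

Each CPU keeps its own automaton state, so the violation on CPU 1 does not disturb the instance that tracks CPU 0.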

In addition to the verification and monitoring of the system, a monitor can react to an unexpected event. The forms of reaction can vary from logging the event occurrence to the enforcement of the correct behavior to the extreme action of taking a system down to avoid the propagation of a failure.

In Linux terms, a reactor is a reaction method available for RV monitors. By default, all monitors should provide a trace output of their actions, which is already a reaction. In addition, other reactions will be available so the user can enable them as needed.
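
The relation between a monitor and its reactor can be sketched as a table of callbacks. The sketch below is a user-space illustration under stated assumptions: the Monitor class, make_reactors(), and the message format are hypothetical, the "panic" reactor is replaced by an exception, and the "printk" reactor writes to an in-memory log instead of the kernel log.

```python
import io


class ReactorPanic(Exception):
    """Stands in for taking the system down (illustration only)."""


def make_reactors(log):
    """Build a table of reaction callbacks keyed by reactor name."""

    def nop(message):
        pass  # no additional reaction

    def printk(message):
        log.write(message + "\n")  # stand-in for a kernel log message

    def panic(message):
        raise ReactorPanic(message)

    return {"nop": nop, "printk": printk, "panic": panic}


class Monitor:
    """Hypothetical monitor that always traces and then calls its reactor."""

    def __init__(self, name, reactors, trace):
        self.name = name
        self.reactors = reactors
        self.trace = trace
        self.reactor = "nop"  # default reactor

    def set_reactor(self, name):
        if name not in self.reactors:
            raise ValueError(f"unknown reactor: {name}")
        self.reactor = name

    def report_violation(self, state, event):
        message = f"{self.name}: event {event} not expected in state {state}"
        self.trace.append(message)  # trace output is always produced
        self.reactors[self.reactor](message)


log = io.StringIO()
trace = []
wip = Monitor("wip", make_reactors(log), trace)

wip.report_violation("preemptive", "sched_waking")  # nop: trace output only
wip.set_reactor("printk")
wip.report_violation("preemptive", "sched_waking")  # trace output plus log line
```

Keeping the trace output separate from the reactor callback mirrors the idea that tracing is always available, while stronger reactions are opt-in.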

For further information about the principles of runtime verification and RV applied to Linux:

Bartocci, Ezio, et al. Introduction to runtime verification. In: Lectures on Runtime Verification. Springer, Cham, 2018. p. 1-33.

Falcone, Ylies, et al. A taxonomy for classifying runtime verification tools. In: International Conference on Runtime Verification. Springer, Cham, 2018. p. 241-262.

De Oliveira, Daniel Bristot. Automata-based formal analysis and verification of the real-time Linux kernel. Ph.D. Thesis, 2020.

Online RV monitors

Monitors can be classified as offline and online monitors. Offline monitors process the traces generated by a system after the events, generally by reading the trace execution from a permanent storage system. Online monitors process the trace during the execution of the system. Online monitors are said to be synchronous if the processing of an event is attached to the system execution, blocking the system during the event monitoring. On the other hand, an asynchronous monitor has its execution detached from the system. Each type of monitor has a set of advantages. For example, offline monitors can be executed on different machines but require operations to save the log to a file. In contrast, synchronous online monitors can react at the exact moment a violation occurs.

Another important aspect regarding monitors is the overhead associated with the event analysis. If the system generates events at a frequency higher than the monitor’s ability to process them in the same system, only the offline methods are viable. On the other hand, if the tracing of the events incurs higher overhead than the simple handling of an event by a monitor, then a synchronous online monitor will incur lower overhead.
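
The difference between offline and synchronous online monitoring can be sketched with one checker used in two ways. Everything here is an assumption made for illustration: the "idle"/"running" model, the event names, and the helper names check_offline(), SynchronousMonitor, and run_system().

```python
import io

# Toy specification: allowed (state, event) -> next state.
ALLOWED = {
    ("idle", "start"): "running",
    ("running", "stop"): "idle",
}


def step(state, event):
    """Return the next state, or None if the event violates the model."""
    return ALLOWED.get((state, event))


def check_offline(trace_file):
    """Replay a stored trace; return the index of the first violation."""
    state = "idle"
    for index, line in enumerate(trace_file):
        state = step(state, line.strip())
        if state is None:
            return index
    return None


class SynchronousMonitor:
    """Checks each event in the context of the system that produced it."""

    def __init__(self):
        self.state = "idle"

    def on_event(self, event):
        next_state = step(self.state, event)
        if next_state is None:
            raise RuntimeError(f"violation: {event} in state {self.state}")
        self.state = next_state


def run_system(events, monitor, completed):
    for event in events:
        monitor.on_event(event)  # the system waits until the check returns
        completed.append(event)  # work that follows the event


events = ["start", "stop", "stop", "start"]

# Offline: the trace is saved first and analyzed afterwards.
saved_trace = io.StringIO("\n".join(events) + "\n")
first_violation = check_offline(saved_trace)

# Online, synchronous: the reaction happens when the violation occurs.
completed = []
try:
    run_system(events, SynchronousMonitor(), completed)
except RuntimeError:
    pass
```

In the offline case the whole trace already exists when the violation is found; in the synchronous case the system never proceeds past the offending event.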

Indeed, the research presented in:

De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo Silva. Efficient formal verification for the Linux kernel. In: International Conference on Software Engineering and Formal Methods. Springer, Cham, 2019. p. 315-332.

shows that, for deterministic automata models, the synchronous processing of events in-kernel causes lower overhead than saving the same events to the trace buffer, not even considering the cost of collecting the trace for user-space analysis. This motivated the development of an in-kernel interface for online monitors.

For further information about modeling of Linux kernel behavior using automata, see:

De Oliveira, Daniel B.; De Oliveira, Romulo S.; Cucinotta, Tommaso. A thread synchronization model for the PREEMPT_RT Linux kernel. Journal of Systems Architecture, 2020, 107: 101729.

The user interface

The user interface resembles the tracing interface (on purpose). It is currently at “/sys/kernel/tracing/rv/”.

The following files/folders are currently available:

available_monitors

  • Reading lists the available monitors, one per line.

For example:

# cat available_monitors
wip
wwnr

available_reactors

  • Reading shows the available reactors, one per line.

For example:

# cat available_reactors
nop
panic
printk
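
Since both files hold one name per line, a script can read them with a few lines of Python. This is a minimal sketch: read_names() is a hypothetical helper, and the default directory is the path given above, which is only readable on a kernel that provides the RV interface.

```python
from pathlib import Path

# Location of the RV interface as given in the text above.
RV_DIR = Path("/sys/kernel/tracing/rv")


def read_names(file_name, rv_dir=RV_DIR):
    """Return the non-empty lines of an RV list file, e.g. available_monitors."""
    text = (Path(rv_dir) / file_name).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]
```

On a system matching the examples above, read_names("available_reactors") would be expected to return the names nop, panic, and printk.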

enabled_monitors

  • Reading lists the enabled monitors, one per line

  • Writing to it enables a given monitor

  • Writing a monitor name with a ‘!’ prefix disables it

  • Truncating the file disables all enabled monitors

For example:

# cat enabled_monitors
# echo wip > enabled_monitors
# echo wwnr >> enabled_monitors
# cat enabled_monitors
wip
wwnr
# echo '!wip' >> enabled_monitors
# cat enabled_monitors
wwnr
# echo > enabled_monitors
# cat enabled_monitors
#

Note that it is possible to enable more than one monitor concurrently.
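
The write semantics listed above can be summarized with a small in-memory model. It is a toy model of the documented behavior, not the kernel implementation: the class EnabledMonitorsModel is hypothetical, and both the rejection of unknown names with ValueError and the listing order are assumptions of this sketch.

```python
class EnabledMonitorsModel:
    """Toy model of how writes to enabled_monitors change the enabled set."""

    def __init__(self, available):
        self.available = list(available)
        self.enabled = []

    def write(self, data, truncate=False):
        """Model one write; truncate=True models opening with '>'."""
        if truncate:
            self.enabled.clear()  # truncation disables all monitors
        name = data.strip()
        if not name:
            return
        disable = name.startswith("!")
        if disable:
            name = name[1:]
        if name not in self.available:
            raise ValueError(f"unknown monitor: {name}")  # assumption
        if disable:
            if name in self.enabled:
                self.enabled.remove(name)
        elif name not in self.enabled:
            self.enabled.append(name)

    def read(self):
        """Return the enabled monitors, one per line."""
        return "".join(name + "\n" for name in self.enabled)


# Replay of the shell session shown above.
model = EnabledMonitorsModel(["wip", "wwnr"])
model.write("wip\n", truncate=True)   # echo wip > enabled_monitors
model.write("wwnr\n")                 # echo wwnr >> enabled_monitors
both = model.read()
model.write("!wip\n")                 # echo '!wip' >> enabled_monitors
only_wwnr = model.read()
model.write("\n", truncate=True)      # echo > enabled_monitors
none_enabled = model.read()
```

The model makes the role of the shell redirection explicit: “>” truncates the file before writing, which disables every monitor first, while “>>” only adds to or removes from the current set.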

monitoring_on

This is a general on/off switch for monitoring. It resembles the “tracing_on” switch in the trace interface.

  • Writing “0” stops the monitoring

  • Writing “1” continues the monitoring

  • Reading returns the current status of the monitoring

Note that it does not disable the enabled monitors; it only stops the per-entity monitors from monitoring the events received from the system.

reacting_on

  • Writing “0” prevents reactions from happening

  • Writing “1” enables reactions

  • Reading returns the current status of the reaction
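
Both “monitoring_on” and “reacting_on” follow the same “0”/“1” convention, so one pair of helpers can drive either file. This is a minimal sketch: read_switch(), write_switch(), and monitoring_paused() are hypothetical helpers, and rv_dir is expected to point at the RV directory given earlier.

```python
from contextlib import contextmanager
from pathlib import Path


def read_switch(rv_dir, name):
    """Return True if the switch file currently reads as "1"."""
    return (Path(rv_dir) / name).read_text().strip() == "1"


def write_switch(rv_dir, name, on):
    """Write "1" or "0" to a switch file such as monitoring_on."""
    (Path(rv_dir) / name).write_text("1" if on else "0")


@contextmanager
def monitoring_paused(rv_dir):
    """Stop monitoring for the duration of a block, then restore the state."""
    was_on = read_switch(rv_dir, "monitoring_on")
    write_switch(rv_dir, "monitoring_on", False)
    try:
        yield
    finally:
        write_switch(rv_dir, "monitoring_on", was_on)
```

Restoring the previous value instead of always writing “1” avoids turning monitoring on when it was already off before the block.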

monitors/

Each monitor has its own directory inside “monitors/”, where the monitor-specific files are presented. The “monitors/” directory resembles the “events” directory on tracefs.

For example:

# cd monitors/wip/
# ls
desc  enable
# cat desc
wakeup in preemptive per-cpu testing monitor.
# cat enable
0

monitors/MONITOR/desc

  • Reading shows a description of the monitor MONITOR

monitors/MONITOR/enable

  • Writing “0” disables the MONITOR

  • Writing “1” enables the MONITOR

  • Reading returns the current status of the MONITOR

monitors/MONITOR/reactors

  • Reading lists the available reactors, with the reactor selected for the given MONITOR inside “[]”. The default one is the nop (no operation) reactor.

  • Writing the name of a reactor enables it for the given MONITOR.

For example:

# cat monitors/wip/reactors
[nop]
panic
printk
# echo panic > monitors/wip/reactors
# cat monitors/wip/reactors
nop
[panic]
printk
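
A script that needs to know which reactor is selected can parse the bracket notation shown above. This is a minimal sketch: parse_reactors() is a hypothetical helper that works on the text read from the file.

```python
def parse_reactors(text):
    """Split a reactors listing into (selected reactor, all reactor names)."""
    selected = None
    available = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the selected reactor is shown inside "[]"
            selected = name
        available.append(name)
    return selected, available


before = parse_reactors("[nop]\npanic\nprintk\n")
after = parse_reactors("nop\n[panic]\nprintk\n")
print(before)  # ('nop', ['nop', 'panic', 'printk'])
print(after)   # ('panic', ['nop', 'panic', 'printk'])
```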