Real-Time Systems

This paper is an edited version of a chapter on the subject that was contained in SEPA, 4/e. It is provided for background and historical information.

The design of real-time computing systems is the most challenging and complex task that can be undertaken by a software engineer. By its very nature, software for real-time systems makes demands on analysis, design and testing techniques that are unknown in other application areas.

Real-time software is highly coupled to the external world. That is, real-time software must respond to the problem domain (the real world) in a time frame dictated by the problem domain. Because real-time software must operate under rigorous performance constraints, software design is often driven by hardware as well as software architecture, operating system characteristics as well as application requirements, programming language vagaries as well as design issues.

In his book on real-time software, Robert Glass [GLA83] provides a useful introduction to the subject of real-time systems:

The digital computer is becoming ever more present in the daily lives of all of us. Computers allow our watches to play games as well as tell time, optimize the gas mileage of our latest generation cars, and sequence our appliances.... [In industry, computers control machines, coordinate processes, and increasingly, replace manual skills and human recognition with automated systems and "artificial intelligence."]

All these computing interactions – be they helpful or intrusive – are examples of real-time computing. The computer is controlling something that interacts with reality on a timely basis. In fact, timing is the essence of the interaction... An unresponsive real-time system may be worse than no system at all.

No more than a decade ago, real-time software development was considered a black art, applied by anointed wizards who guarded their closed world with jealousy. Today, there just are not enough wizards to go around! Yet, there is no question that the engineering of real-time software requires special skills. In this chapter we examine real-time software and discuss at least some of the skills that are required to build it.

1.1 System Considerations

Like any computer-based system, a real-time system must integrate hardware, software, human, and data base elements to properly achieve a set of functional and performance requirements. In Chapter 10 [SEPA, 5/e], we examined the allocation task for computer-based systems, indicating that the system engineer must allocate function and performance among the system elements. The problem for real-time systems is proper allocation. Real-time performance is often as important as function, yet allocation decisions that relate to performance are often difficult to make with assurance. Can a processing algorithm meet severe timing constraints, or should we build special hardware to do the job? Can an off-the-shelf operating system meet our need for efficient interrupt handling, multi-tasking and communication, or should we build a custom executive? Can specified hardware coupled with proposed software meet performance criteria? These, and many other questions, must be answered by the real-time system engineer.

A comprehensive discussion of all elements of real-time systems is beyond the scope of this book. Among a number of good sources of information are [SAV85], [ELL94], and [SEL94]. However, it is important that we understand each of the elements of a real-time system before focusing on software analysis and design issues.

Everett [EVE95] defines three characteristics that differentiate real-time software development from other software engineering efforts:

The design of a real-time system is resource constrained. The primary resource for a real-time system is time. It is essential to complete a defined task within a given number of CPU cycles. In addition, other system resources, such as memory size, may be traded against time to achieve system objectives.

Real-time systems are compact, yet complex. Although a sophisticated real-time system may contain well over 1 million lines of code, the time critical portion of the software represents a very small percentage of the total. It is this small percentage of code that is typically the most complex (from an algorithmic point of view).

Real-time systems often work without the presence of a human user. Therefore, real-time software must detect problems that lead to failure and automatically recover from these problems before damage to data and the controlled environment occurs.

In the section that follows, we examine some of the key attributes that differentiate real-time systems from other types of computer software.

1.2 Real-Time Systems

Real-time systems generate some action in response to external events. To accomplish this function, they perform high-speed data acquisition and control under severe time and reliability constraints. Because these constraints are so stringent, real-time systems are frequently dedicated to a single application.

Until recently, the major consumer of real-time systems was the military. Today, however, significant decreases in hardware costs make it possible for most companies to afford real-time systems (and products) for diverse applications that include process control, industrial automation, medical and scientific research, computer graphics, local and wide-area communications, aerospace systems, computer-aided testing, and a vast array of industrial instrumentation.

1.2.1 Integration and Performance Issues

Putting together a real-time system presents the system engineer with difficult hardware and software decisions. [The allocation issues associated with hardware for real-time systems are beyond the scope of this book (see [SAV85] for additional information).] Once the software element has been allocated, detailed software requirements are established and a fundamental software design must be developed. Among many real-time design concerns are: coordination between the real-time tasks; processing of system interrupts; I/O handling to ensure that no data is lost; specification of the system's internal and external timing constraints; and assurance of the accuracy of its data base.

Each real-time design concern for software must be applied in the context of system performance. In most cases, the performance of a real-time system is measured as one or more time-related characteristics, but other measures such as fault-tolerance may also be used.

Some real-time systems are designed for applications in which only the response time or the data transfer rate is critical. Other real-time applications require optimization of both parameters under peak loading conditions. What's more, real-time systems must handle their peak loads while performing a number of simultaneous tasks.

Since the performance of a real-time system is determined primarily by the system response time and its data transfer rate, it is important to understand these two parameters. System response time is the time within which a system must detect an internal or external event and respond with an action. Often, event detection and response generation are simple. It is the processing of the information about the event to determine the appropriate response that may involve complex, time-consuming algorithms.

Among the key parameters that affect the response time are context switching and interrupt latency. Context switching involves the time and overhead to switch among tasks, and interrupt latency is the time lag before the switch is actually possible. Other parameters that affect response time are the speed of computation and of access to mass storage.
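As a rough illustration of how these parameters combine, the sketch below adds interrupt latency, context-switch time, and processing time into one worst-case response estimate and compares it with a deadline. The additive model, the function name, and every number are assumptions made for illustration; none of them comes from a particular system.

```python
def worst_case_response_us(interrupt_latency_us, context_switch_us, processing_us):
    """Additive estimate (an assumption): the event waits out the interrupt
    latency, pays for one context switch, and then the handler runs."""
    return interrupt_latency_us + context_switch_us + processing_us


# Hypothetical figures, all in microseconds.
estimate_us = worst_case_response_us(
    interrupt_latency_us=15, context_switch_us=5, processing_us=120
)
deadline_us = 200
meets_deadline = estimate_us <= deadline_us
print(estimate_us, meets_deadline)
```

In a real design each term would be measured or bounded on the target hardware rather than assumed.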

The data transfer rate indicates how fast serial or parallel, as well as analog or digital, data must be moved into or out of the system. Hardware vendors often quote timing and capacity values for performance characteristics. However, hardware specifications for performance are usually measured in isolation and are often of little value in determining overall real-time system performance. Therefore, I/O device performance, bus latency, buffer size, disk performance, and a host of other factors, although important, are only part of the story of real-time system design.

Real-time systems are often required to process a continuous stream of incoming data. Design must assure that data are not missed. In addition, a real-time system must respond to events that are asynchronous. Therefore, the arrival sequence and data volume cannot be easily predicted in advance.

Although all software applications must be reliable, real-time systems make special demands on reliability, restart, and fault recovery. Because the real world is being monitored and controlled, loss of monitoring or control (or both) is intolerable in many circumstances (e.g., an air traffic control system). Consequently, real-time systems contain restart and fault-recovery mechanisms and frequently have built-in redundancy to ensure backup.

The need for reliability, however, has spurred an on-going debate about whether on-line systems, such as airline reservation systems and automatic bank tellers, also qualify as real-time. On one hand, such on-line systems must respond to external interrupts within prescribed response times on the order of one second. On the other hand, nothing catastrophic occurs if an on-line system fails to meet response requirements; instead, only system degradation results.

1.2.2 Interrupt Handling

One characteristic that serves to distinguish real-time systems from any other type is interrupt handling. A real-time system must respond to external stimuli (interrupts) in a time frame dictated by the external world. Because multiple stimuli (interrupts) are often present, priorities and priority interrupts must be established. In other words, the most important task must always be serviced within predefined time constraints regardless of other events.

Interrupt handling entails not only storing information so that the computer can correctly restart the interrupted task, but also avoiding deadlocks and endless loops. The overall approach to interrupt handling is illustrated in Figure 1.1. Normal processing flow is "interrupted" by an event that is detected by processor hardware. An event is any occurrence that requires immediate service and may be generated by either hardware or software. The state of the interrupted program is saved (i.e., all register contents, control blocks, etc. are saved) and control is passed to an interrupt service routine that branches to appropriate software for handling the interrupt. Upon completion of interrupt servicing, the state of the machine is restored and normal processing flow continues.

In many situations, interrupt servicing for one event may itself be interrupted by another, higher priority event. Interrupt priority levels (Figure 1.2) may be established. If a lower-priority process is accidentally allowed to interrupt a higher-priority one, it may be difficult to restart the processes in the right order and an endless loop may result.
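The save, service, and restore cycle, together with the rule that only a higher-priority event may preempt, can be sketched as a small simulation. This is a toy model, not a description of any real processor: the class, its method names, and the convention that a larger number means a higher priority are all assumptions.

```python
class InterruptController:
    """Toy model of prioritized interrupts (hypothetical names throughout).
    A larger level means a higher priority; level 0 is normal processing."""

    def __init__(self):
        self.current_level = 0
        self.saved_states = []   # stack of interrupted contexts
        self.pending = []        # requests that were not allowed to preempt
        self.log = []

    def raise_interrupt(self, level, name, nested=None):
        if level <= self.current_level:
            # A lower- or equal-priority request must wait its turn.
            self.pending.append((level, name))
            self.log.append(f"defer {name}")
            return
        self.saved_states.append(self.current_level)   # save interrupted state
        self.current_level = level
        self.log.append(f"enter {name}")
        if nested is not None:
            self.raise_interrupt(*nested)   # event arriving during servicing
        self.log.append(f"exit {name}")
        self.current_level = self.saved_states.pop()   # restore state
        self._run_pending()

    def _run_pending(self):
        # Service deferred requests that outrank the restored level,
        # highest priority first.
        self.pending.sort(reverse=True)
        while self.pending and self.pending[0][0] > self.current_level:
            level, name = self.pending.pop(0)
            self.raise_interrupt(level, name)


# A high-priority timer preempts the disk handler.
preempting = InterruptController()
preempting.raise_interrupt(2, "disk", nested=(5, "timer"))

# A low-priority disk request arriving during the timer handler is deferred.
deferring = InterruptController()
deferring.raise_interrupt(5, "timer", nested=(2, "disk"))
```

Deferring the lower-priority request until the interrupted level is restored is what keeps the handlers from restarting in the wrong order.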

Figure 1.1

To handle interrupts and still meet the system time constraints, many real-time operating systems make dynamic calculations to determine whether the system goals can be met. These dynamic calculations are based on the average frequency of occurrence of events, the amount of time it takes to service them (if they can be serviced), and the routines that can interrupt them and

Figure 1.2

If the dynamic calculations show that it is impossible to handle the events that can occur in the system and still meet the time constraints, the system must decide on a scheme of action. One possible scheme involves buffering the data so that it can be processed quickly when the system is ready.
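One simple form such a calculation could take is an average-load check: multiply each event's average rate by its service time and sum the results. The sketch below does this and falls back to buffering when the load exceeds one processor. The function names and the event figures are hypothetical, and a load below 1.0 is only a necessary condition; it does not by itself show that every deadline is met.

```python
from collections import deque


def utilization(events):
    """events: (average_rate_hz, service_time_us) pairs.
    Returns the average fraction of processor time they demand."""
    return sum(rate * service_us for rate, service_us in events) / 1_000_000


def handle(sample, overloaded, backlog, processed):
    """Buffer the sample while overloaded; otherwise drain the backlog
    in arrival order and then process the new sample."""
    if overloaded:
        backlog.append(sample)
        return
    while backlog:
        processed.append(backlog.popleft())
    processed.append(sample)


# Hypothetical event mix: (events per second, microseconds per event).
normal = [(100, 2000), (50, 4000), (10, 50000)]
burst = normal + [(20, 10000)]

normal_load = utilization(normal)   # 0.9: fits within one processor
burst_load = utilization(burst)     # 1.1: cannot be serviced as it arrives

backlog, processed = deque(), []
handle("s1", burst_load > 1.0, backlog, processed)    # buffered
handle("s2", burst_load > 1.0, backlog, processed)    # buffered
handle("s3", normal_load > 1.0, backlog, processed)   # drains s1, s2, then s3
```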

1.2.3 Real-Time Data Bases

Like many data-processing systems, real-time systems often are coupled with a data base management function. However, distributed data bases would seem to be a preferred approach in real-time systems because multi-tasking is commonplace and data are often processed in parallel. If the data base is distributed, individual tasks can access their data faster and more reliably, and with fewer bottlenecks than with a centralized data base. The use of a distributed data base for real-time applications divides input/output "traffic" and shortens queues of tasks waiting for access to a data base. Moreover, a failure of one data base will rarely cause the failure of the entire system, if redundancy is built in.

The performance efficiencies achieved through the use of a distributed data base must be weighed against potential problems associated with data partitioning and replication. Although data redundancy improves response time by providing multiple information sources, replication requirements for distributed files also produce logistical and overhead problems, since all the file copies must be updated. In addition, the use of distributed data bases introduces the problem of concurrency control. Concurrency control involves synchronizing the data bases so that all copies have the correct, identical information free for access.

The conventional approach to concurrency control is based on what are known as locking and time stamps. At regular intervals, the following tasks are initiated: (1) the data base is "locked" so that concurrency control is assured; no I/O is permitted; (2) updating occurs as required; (3) the data base is unlocked; (4) files are validated to assure that all updates have been correctly made; (5) the completed update is acknowledged. All locking tasks are monitored by a master clock (i.e., time stamps). The delays involved in these procedures, as well as the problems of avoiding inconsistent updates and deadlock, militate against the widespread use of distributed data bases.
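The five steps can be sketched for a set of in-memory copies. This is a minimal illustration under stated assumptions: the `ReplicatedStore` class is hypothetical, a single lock stands in for locking the whole data base, and a simple counter stands in for the master clock.

```python
import itertools
import threading


class ReplicatedStore:
    """Hypothetical in-memory stand-in for a replicated data base."""

    def __init__(self, n_copies):
        self.copies = [dict() for _ in range(n_copies)]
        self.lock = threading.Lock()
        self.clock = itertools.count(1)   # stand-in for the master clock

    def update(self, key, value):
        with self.lock:                    # (1) lock: no other update proceeds
            stamp = next(self.clock)       # time stamp for this update
            for copy in self.copies:       # (2) update every copy
                copy[key] = (value, stamp)
        # (3) the lock is released on leaving the with-block
        valid = all(copy.get(key) == (value, stamp)
                    for copy in self.copies)   # (4) validate
        return stamp if valid else None        # (5) acknowledge


store = ReplicatedStore(n_copies=3)
first_ack = store.update("altitude", 9000)
second_ack = store.update("altitude", 9500)
```

Because validation follows the unlock, as in the step order above, a concurrent writer could change a copy before step (4) runs; that window is one source of the delays and inconsistency risks the text mentions.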

Some techniques, however, have been developed to speed updating and to solve the concurrency problem. One of these, called the exclusive-writer protocol, maintains the consistency of replicated files by allowing only a single, exclusive writing task to update a file. It therefore eliminates the high overhead of locking or time stamp procedures.
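One reading of the exclusive-writer idea is sketched below: a single task owns the copies, and every other task submits update requests through a queue instead of writing directly. The function name, the queue-based hand-off, and the `None` shutdown sentinel are assumptions for illustration; the published protocol may differ in its details.

```python
import queue
import threading


def exclusive_writer(requests, copies):
    """The only task allowed to modify the replicated copies."""
    while True:
        item = requests.get()
        if item is None:          # sentinel: shut the writer down
            break
        key, value = item
        for copy in copies:       # apply the update to every copy
            copy[key] = value


copies = [{}, {}, {}]
requests = queue.Queue()
writer = threading.Thread(target=exclusive_writer, args=(requests, copies))
writer.start()

# Other tasks never touch the copies; they only submit requests.
requests.put(("heading", 270))
requests.put(("heading", 275))
requests.put(None)
writer.join()
```

Since updates are applied by one task in queue order, the copies cannot diverge, and no lock on the copies is needed.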

1.2.4 Real-Time Operating Systems

Choosing a real-time operating system (RTOS) for a specific application is no easy chore. Some operating system classifications are possible, but most do not fit into neat categories with clear-cut advantages and disadvantages. Instead, there is considerable overlap in capabilities, target systems, and other features.

Some real-time operating systems are applicable to a broad range of system configurations, while others are geared to a particular board or even microprocessor, regardless of the surrounding electronic environment. RTOS achieve their capabilities through a combination of software features and (increasingly) a variety of micro-coded capabilities implemented in hardware.

Today, two broad classes of operating systems are used for real-time work: (1) dedicated RTOS designed exclusively for real-time applications and (2) general-purpose operating systems that have been enhanced to provide real-time capability. The use of a real-time executive makes real-time performance feasible for a general-purpose operating system. Behaving like application software, the executive performs a number of operating system functions–particularly those that affect real-time performance–faster and more efficiently than the general purpose operating system.

All operating systems must have a priority scheduling mechanism, but RTOS must provide a priority mechanism that allows high-priority interrupts to take precedence over less important ones. Moreover, because interrupts occur in response to asynchronous, nonrecurring events, they must be serviced without first taking time to swap in a program from disk storage. Consequently, to guarantee the required response time, a real-time operating system must have a mechanism for memory locking–that is, locking at least some programs in main memory so that swapping overhead is avoided.

To determine which kind of real-time operating system best matches an application, measures of RTOS quality can be defined and evaluated. Context switching time and interrupt latency (discussed earlier) determine interrupt-handling capability, the most important aspect of a real-time system. Context switching time is the time the operating system takes to store the state of the computer and the contents of the registers so that it can return to a processing task after servicing the interrupt.

Interrupt latency, the maximum time lag before the system gets around to switching a task, occurs because in an operating system there are often non-re-entrant or critical processing paths that must be completed before an interrupt can be processed.

The length of these paths (the number of instructions) before the system can service an interrupt indicates the worst-case time lag. The worst case occurs if a high-priority interrupt is generated immediately after the system enters a critical path between an interrupt and interrupt service. If the time is too long, the system may miss an unrecoverable piece of data. It is important that the designer know the time lag so that the system can compensate for it.

Many operating systems perform multitasking [WOO90], or concurrent processing, another major requirement for real-time systems. But to be viable for real-time operation, the system overhead must be low in terms of switching time and memory space used.

1.2.5 Real-Time Languages

Because of the special requirements for performance and reliability demanded of real-time systems, the choice of a programming language is important. Many general purpose programming languages (e.g., C, FORTRAN, Modula-2) can be used effectively for real-time applications. However, a class of so-called "real-time languages" (e.g., Ada, Jovial, HAL/S, Chill, and others) is often used in specialized military and communications applications.

A combination of characteristics makes a real-time language different from a general-purpose language. These include the multitasking capability, constructs to directly implement real-time functions, and modern programming features that help ensure program correctness.

A programming language that directly supports multitasking is important because a real-time system must respond to asynchronous events occurring simultaneously. Although many RTOS provide multitasking capabilities, embedded real-time software often exists without an operating system. Instead, embedded applications are written in a language that provides sufficient run-time support for real-time program execution. Run-time support requires less memory than an operating system, and it can be tailored to an application, thus increasing performance.

A real-time system that has been designed to accommodate multiple tasks must also accommodate intertask synchronization [KAI83]. A programming language that directly supports synchronization primitives such as SCHEDULE, SIGNAL, and WAIT greatly simplifies the translation from design to code. The SCHEDULE command schedules a process based on time or an event; SIGNAL and WAIT commands manipulate a special flag, called a semaphore, that enables concurrent tasks to be synchronized.
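SIGNAL and WAIT can be illustrated with the counting semaphore in Python's standard `threading` module, where `release()` plays the role of SIGNAL and `acquire()` the role of WAIT. Python is not a real-time language, so this is only a sketch of the synchronization pattern; the task names and the shared value are hypothetical.

```python
import threading

data_ready = threading.Semaphore(0)   # starts in the "not signalled" state
shared = {}
results = []


def consumer():
    data_ready.acquire()              # WAIT: block until signalled
    results.append(shared["reading"] * 2)


def producer():
    shared["reading"] = 21
    data_ready.release()              # SIGNAL: wake one waiting task


consumer_task = threading.Thread(target=consumer)
producer_task = threading.Thread(target=producer)
consumer_task.start()
producer_task.start()
consumer_task.join()
producer_task.join()
```

The consumer cannot read the shared value before the producer has written it, whichever thread happens to be scheduled first.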

Finally, features that facilitate reliable programming are necessary because real-time programs are frequently large and complex. These features include modular programming, strongly enforced data typing, and a host of other control and data definition constructs.

1.2.6 Task Synchronization and Communication

A multi-tasking system must furnish a mechanism for the tasks to pass information to each other as well as to ensure their synchronization. For these functions, operating systems and languages with run-time support commonly use queuing semaphores, mailboxes, or message systems. Semaphores supply synchronization and signaling but contain no information. Messages are similar to semaphores except that they carry the associated information. Mailboxes, on the other hand, do not signal information but instead contain it.

Queuing semaphores are software primitives that help manage traffic. They provide a method of directing several queues–for example, queues of tasks waiting for resources, data-base access, and devices, as well as queues of the resources and devices. The semaphores coordinate (synchronize) the waiting tasks with whatever they are waiting for without letting tasks or resources interfere with each other.

In a real-time system, semaphores are commonly used to implement and manage mailboxes. Mailboxes are temporary storage places (also called a message pool or buffer) for messages sent from one process to another. One process produces a piece of information, puts it in the mailbox, and then signals a consuming process that there is a piece of information in the mailbox for it to use.
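A mailbox managed by semaphores can be sketched as a bounded buffer: one semaphore counts free slots, another counts waiting messages, and a lock protects the buffer itself. The `Mailbox` class and its `post` and `fetch` methods are hypothetical names chosen for this illustration.

```python
import threading
from collections import deque


class Mailbox:
    """Bounded mailbox guarded by two counting semaphores (a sketch)."""

    def __init__(self, capacity):
        self._slots = deque()
        self._free = threading.Semaphore(capacity)   # counts empty slots
        self._used = threading.Semaphore(0)          # counts waiting messages
        self._mutex = threading.Lock()

    def post(self, message):
        self._free.acquire()          # wait for room in the mailbox
        with self._mutex:
            self._slots.append(message)
        self._used.release()          # signal the consumer

    def fetch(self):
        self._used.acquire()          # wait for a message
        with self._mutex:
            message = self._slots.popleft()
        self._free.release()          # the slot is free again
        return message


box = Mailbox(capacity=2)
received = []


def producer():
    for n in range(5):
        box.post({"seq": n})


def consumer():
    for _ in range(5):
        received.append(box.fetch()["seq"])


tasks = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for task in tasks:
    task.start()
for task in tasks:
    task.join()
```

The mailbox stores a reference to each dictionary rather than a copy of it, so the message contents are not transferred when a message is posted or fetched.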

Some approaches to real-time operating systems or run-time support systems view mailboxes as the most efficient way to implement communications between processes. Some real-time operating systems furnish a place to send and receive pointers to mailbox data. This eliminates the need to transfer all of the data–thus saving time and overhead.

A third approach to communication and synchronization among processes is a message system. With a message system, one process sends a message to another. The latter is then automatically activated by the run-time support system or operating system to process the message. Such a system incurs overhead because it transfers the actual information, but it provides greater flexibility and ease of use.
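A message system can be sketched in a few lines: sending a message copies the information and activates the receiving task. The `MessageSystem` class, the task name, and the synchronous call used to stand in for task activation are all assumptions made for illustration.

```python
class MessageSystem:
    """Toy run-time support: delivers a message and activates the receiver."""

    def __init__(self):
        self._receivers = {}

    def register(self, task_name, handler):
        self._receivers[task_name] = handler

    def send(self, task_name, payload):
        message = dict(payload)               # the information itself is copied
        self._receivers[task_name](message)   # receiver is activated on delivery


alarms = []
system = MessageSystem()
system.register("alarm_task", lambda message: alarms.append(message))

reading = {"sensor": "boiler", "temp_c": 131}
system.send("alarm_task", reading)
reading["temp_c"] = 0   # a later change by the sender does not reach the receiver
```

The copy made in `send` is the overhead the text refers to; in exchange, sender and receiver share no data after delivery.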

1.3 Analysis and Simulation of Real-Time Systems

In the preceding section, we discussed a set of dynamic attributes that cannot be divorced from the functional requirements of a real-time system:

• interrupt handling and context switching

• response time

• data transfer rate and throughput

• resource allocation and priority handling

• task synchronization and intertask communication

Each of these performance attributes can be specified, but it is extremely difficult to verify whether system elements will achieve desired response, system resources will be sufficient to satisfy computational requirements, or processing algorithms will execute with sufficient speed.

The analysis of real-time systems requires modeling and simulation that enables the system engineer to assess "timing and sizing" issues. Although a number of analysis techniques have been proposed in the literature (e.g., [LIU90], [WIL90] and [ZUC89]), it is fair to state that analytical approaches for the analysis and design of real-time systems are still in their early stages of development.

1.3.1 Mathematical Tools for Real-Time System Analysis

Figure 1.3

A set of mathematical tools that enable the system engineer to model real-time system elements and assess timing and sizing issues has been proposed by Thomas McCabe [MCC85]. Based loosely on data flow analysis techniques (Chapter 12), McCabe's approach enables the analyst to model both hardware and software elements of a real-time system; represent control in a probabilistic manner; apply network analysis, queuing and graph theory and a Markovian mathematical model [GRO85] to derive system timing and resource sizing. Unfortunately, the mathematics involved is beyond the grasp of many readers of this book, making a detailed explication of McCabe's work difficult. However, an overview of the technique will provide a worthwhile view of an analytical approach to the engineering of real-time systems.

McCabe's real-time analysis technique is predicated on a data flow model of the real-time system. However, rather than using a DFD in the conventional manner, McCabe [MCC85] contends that the transforms (bubbles) of a DFD can be represented as process states of a Markov chain (a probabilistic queuing model) and the data flows themselves represent transitions between the process states. The analyst can assign transitional probabilities to each data flow path. Referring to Figure 1.3, a value,

$0 < p_{ij} \le 1.0$

may be specified for each flow path, where $p_{ij}$ represents the probability that flow will occur between process i and process j. The processes correspond to information transforms (bubbles) in the DFD.

Each process in the DFD-like model can be given a "unit cost" that represents the estimated (or actual) execution time required to perform its function and an "entrance value" that depicts the number of system interrupts corresponding to the process. The model is then analyzed using a set of mathematical tools that compute: (1) the expected number of visits to a process; (2) the time spent in the system when processing begins at a specific process; (3) the total time spent in the system.
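The three quantities can be illustrated with a standard Markov-chain calculation. If P holds the transition probabilities between processes, and any probability missing from a row is taken as leaving the system, then the matrix N = (I - P)^-1 gives the expected number of visits to each process. The sketch below is not McCabe's published procedure: the three-process model, the unit costs, and the use of entrance values as weights for the total are assumptions made for illustration.

```python
from fractions import Fraction


def expected_visits(P):
    """Return N = (I - P)^-1 by Gauss-Jordan elimination.
    N[i][j] is the expected number of visits to process j when
    processing begins at process i."""
    n = len(P)
    # Augmented matrix [I - P | I], kept exact with Fractions.
    A = [[Fraction(int(i == j)) - Fraction(P[i][j]) for j in range(n)]
         + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [x / pivot_value for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


half = Fraction(1, 2)
# Hypothetical model: process 0 always feeds process 1; process 1 passes
# data on to process 2 or sends it back to process 0 with equal probability;
# process 2 leaves the system.
P = [[0, 1, 0],
     [half, 0, half],
     [0, 0, 0]]
unit_cost_ms = [2, 3, 5]      # assumed execution time per visit
entrance_values = [10, 0, 2]  # assumed interrupts entering at each process

N = expected_visits(P)
time_from = [sum(v * c for v, c in zip(row, unit_cost_ms)) for row in N]
total_time_ms = sum(e * t for e, t in zip(entrance_values, time_from))
```

Here row 0 of N is (2, 2, 1): data entering at process 0 visits processes 0 and 1 twice on average and process 2 once, for an expected 15 ms in the system.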

1.3.2 Simulation and Modeling Techniques for Real-Time Systems

Mathematical analysis of a real-time system represents one approach that can be used to understand projected performance. However, a growing number of real-time software developers use simulation and modeling tools that not only analyze a system's performance, but also enable the software engineer to build a prototype, execute it, and thereby gain an understanding of a system's behavior.

The overall rationale behind simulation and modeling for real-time systems is discussed [ILO89] by i-Logix (a company that develops tools for systems engineers):

The understanding of a system's behavior in its environment over time is most often addressed in the design, implementation and testing phases of a project, through iterative trial and error. The Statemate [a system engineering tool for simulation and modeling] approach provides an alternative to this costly process. It allows you to build a comprehensive system model that is accurate enough to be relied on and clear enough to be useful. The model addresses the usual functional and flow issues, but also covers the dynamic, behavioral aspects of a system. This model can then be tested with the Statemate analysis and retrieval tools, which provide extensive mechanisms for inspecting and debugging the specification and retrieving information from it. By testing the implementation model, the system engineer can see how the system as specified would behave if implemented.

The i-Logix approach [HAR90] makes use of a notation that combines three different views of a system: the activity-chart, module-chart and statechart. In the paragraphs that follow, the i-Logix approach to real-time system simulation and modeling is described.


The Conceptual View

Figure 1.4

Functional issues are treated using activities that represent the processing capabilities of the system. Dealing with a customer's confirmation request in an airline reservation system is an example of an activity, as is updating the aircraft's position in an avionics system. Activities can be nested, forming a hierarchy that constitutes a functional decomposition of the system. Items of information, such as the distance to a target or a customer's name, will typically flow between activities, and might also be kept in data stores. This functional view of a system is captured with activity-charts, which are similar to conventional data flow diagrams.

Dynamic behavioral issues, commonly referred to as control aspects, are treated using statecharts, a notation developed by Harel and his colleagues [HAR88] [HAR92]. Here, states (or modes) can be nested and linked in a number of ways to represent sequential or concurrent behavior. An avionics mission computer, for example, could be in one of three states: air-to-air, air-to-ground, or navigation. At the same time it must be in the state of either automatic or manual flight control. Transitions between states are typically triggered by events, which may be qualified by conditions. Flipping a certain switch on the throttle, for example, is an event that will cause a transition from the navigate state to the air-to-ground state, but only on condition that the aircraft has air-to-ground ammunition available. As a simple example, consider the digital watch shown in Figure 1.4. The statechart for the watch is shown in Figure 1.5.
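The avionics example can be sketched as two concurrent state variables plus a guarded transition. This is a plain-code approximation, not statechart notation or Statemate output; the class, the event names, and the ammunition counter are hypothetical.

```python
class MissionComputer:
    """Two concurrent state components: mission mode and flight control."""

    def __init__(self):
        self.mode = "navigation"            # or "air-to-air", "air-to-ground"
        self.flight_control = "automatic"   # or "manual"
        self.air_to_ground_ammo = 0

    def on_event(self, event):
        if event == "throttle_switch" and self.mode == "navigation":
            # The event triggers the transition only if the condition holds.
            if self.air_to_ground_ammo > 0:
                self.mode = "air-to-ground"
        elif event == "manual_override":
            self.flight_control = "manual"


computer = MissionComputer()
computer.on_event("throttle_switch")
mode_without_ammo = computer.mode   # the guard blocks the transition

computer.air_to_ground_ammo = 2
computer.on_event("throttle_switch")
mode_with_ammo = computer.mode      # the guard now allows it
```

The flight-control component is untouched by the throttle switch, which reflects the concurrency between the two groups of states.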

These two views of a system are integrated in the following way. Associated with each level of an activity-chart, there will usually be a statechart, called a control activity, whose role is to control the activities and data flows of that level [this is similar in some ways to the relationship between flow models and CSPEC described in Chapter 12]. A statechart is able to exercise control over the activities. For example, it can instruct activities to start and stop and to suspend and resume their work. It is able to change the values of variables, and thus to influence the processing carried out by the activities. It is also able to send signals to other activities and thus cause them to change their own behavior. In addition to being able to generate actions, a controlling statechart is able to sense such actions being carried out by other statecharts. For example, if one statechart starts an activity or increments the value of a variable, another can sense that event and use it, say, to trigger a transition.

Figure 1.5

It is important to realize that activity-charts and statecharts are strongly linked, but they are not different representations of the same thing. Activity-charts on their own are incomplete as a model of the system, since they do not address behavior. Statecharts are also incomplete, since without activities they have nothing to control. Together, a detailed activity-chart and its controlling statecharts provide the conceptual model. The activity-chart is the backbone of the model; its decomposition of the capabilities of the system is the dominant hierarchy of the specification, while its controlling statecharts are the driving force behind the system's behavior.

The Physical View

A specification that uses activity-charts and statecharts in the form of a conceptual model is an excellent foundation, but it is not a real system. What is missing is a means for describing the system from a physical (implementation) perspective, and a means to be sure that the system is implemented in a way that is true to that specification. An important part of this is describing the physical decomposition of the system and its relationship to the conceptual model.

The physical aspects are treated in Statemate using the language of module-charts. The terms physical and module are used generically to denote components of a system, whether hardware, software, or hybrid. Like activities in an activity-chart, modules are arranged in a hierarchy to show the decomposition of a system into its components and subcomponents. Modules are connected by flow lines, which one can think of as being the carriers of information between modules.

Analysis and Simulation

Once we have constructed a conceptual model, consisting of an activity-chart and its controlling statecharts, it can be thoroughly analyzed and tested. The model might describe the entire system, down to the lowest level of detail, or it might be only a partial specification.

We must first be sure that the model is syntactically correct. This gives rise to many relatively straightforward tests: for example, that the various charts are not blatantly incomplete (e.g., missing labels or names, dangling arrows); that the definitions of non-graphical elements, such as events and conditions, employ legal operations only, and so on. Syntax checking also involves more subtle tests, such as the correctness of inputs and outputs. An example of this is a test for elements that are used in the statechart but are neither input nor affected internally, such as a power-on event that is meant to cause a transition in the statechart but is not defined in the activity-chart as an input. All of these are usually referred to as consistency and completeness tests, and most of them are analogous to the checking carried out by a compiler prior to the actual compilation of a programming language.
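The power-on example amounts to a set difference: events that trigger transitions, minus events declared as inputs, minus events generated internally. The sketch below shows that one check; the function name and the event names are hypothetical, and a real tool performs many more tests than this.

```python
def undefined_triggers(statechart_events, activity_inputs, internal_events):
    """Events the statechart reacts to that are neither declared as inputs
    in the activity-chart nor generated inside the model."""
    return sorted(set(statechart_events)
                  - set(activity_inputs)
                  - set(internal_events))


problems = undefined_triggers(
    statechart_events=["power_on", "card_inserted", "timeout"],
    activity_inputs=["card_inserted"],
    internal_events=["timeout"],
)
print(problems)
```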

Running Scenarios

A syntactically correct model accurately describes some system. However, it might not be the system we had in mind. In fact, the system described might be seriously flawed–syntactic correctness does not guarantee correctness of function or behavior. The real objective of analyzing the model is to find out whether it truly describes the system that we want. The analysis should enable us to learn more about the model that has been constructed, to examine how a system based on it would behave, and to verify that it indeed meets expectations. This requires a modeling language with more than a formal syntax. It requires that the system used to create the model recognize formal semantics as well.

If the model is based on a formal semantics, the system engineer can execute the model. The engineer can create and run a scenario that allows him to "press buttons" and observe the behavior of the model before the system is actually built. For example, to exercise a model of an automated teller machine (ATM) the following steps would occur: (1) a conceptual model is created; (2) the engineer plays the role of the customer and the bank computer, generating events such as insertion of a bank card, buttons being pressed, and new balance information arriving; (3) the reaction of the system to these events is monitored; (4) inconsistencies in behavior are noted; (5) the conceptual model is modified to reflect proper behavior; and (6) iteration occurs until the desired system evolves.
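Steps (2) through (4) can be sketched with a toy executable model. The transition table below is a hypothetical, heavily simplified ATM statechart (flat states, no hierarchy or concurrency), and the state and event names are assumptions made for illustration only.

```python
# Transition table for a toy ATM model: (state, event) -> next state.
# All state and event names are hypothetical.
TRANSITIONS = {
    ("idle", "card_inserted"): "awaiting_pin",
    ("awaiting_pin", "pin_ok"): "choosing",
    ("awaiting_pin", "pin_bad"): "idle",
    ("choosing", "withdraw"): "dispensing",
    ("dispensing", "balance_updated"): "idle",
}


def run_scenario(events, state="idle"):
    """Feed events to the model; return the state trace and any ignored events."""
    trace, ignored = [state], []
    for event in events:
        nxt = TRANSITIONS.get((state, event))
        if nxt is None:
            ignored.append((state, event))  # candidate inconsistency to review
        else:
            state = nxt
        trace.append(state)
    return trace, ignored


# The engineer plays both the customer and the bank computer.
trace, ignored = run_scenario(
    ["card_inserted", "pin_ok", "withdraw", "card_inserted", "balance_updated"]
)
print(trace)
print(ignored)
```

Here the second card insertion arrives while cash is being dispensed and is silently ignored. Whether that is acceptable behavior is exactly the kind of question a scenario is meant to raise before the system is built.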

The system engineer runs scenarios and views the system's response graphically. "Active" elements of the model (e.g., states that the system is in at the moment and activities that are active) are highlighted graphically, and the dynamic execution results in an animated representation of the model. The execution of a scenario simulates the system running in real-time, and keeps track of time-dependent information. At any point during the execution, the engineer can ask to see the status of any other, non-graphical, element, such as the value of a variable or a condition.

Programming Simulations

A scenario enables the system engineer to exercise the model interactively. At times, however, more extensive simulation may be desirable. Performance under random conditions in both typical and atypical situations may need to be assessed. For situations in which a more extensive simulation of a real-time model is desired, Simulation Control Language (SCL) enables the engineer to retain general control over how the executions proceed, but at the same time exploits the power of the tool to take over many of the details.

One of the simplest things that can be done with SCL is to read lists of events from a batch file. This means that lengthy scenarios or parts of them can be prepared in advance and executed automatically. These can be observed by the system engineer. Alternatively, the system engineer can program with SCL to set breakpoints and to monitor certain variables, states or conditions. For example, running a simulation of an avionics system, the engineer might ask the SCL program to stop whenever the radar locks on target and switch to interactive mode. Once "lock on" is recognized, the engineer takes over interactively, so that this state can be examined in more detail.
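The batch-then-interactive pattern described above can be sketched in ordinary code. This is not SCL syntax; the `replay` function, the radar transition table, and the event names are hypothetical, and an in-memory list stands in for the batch file.

```python
def replay(events, step, state, breakpoint):
    """Run events in batch until `breakpoint(state)` is true.

    Returns the state reached and the events not yet consumed, so the
    caller can continue interactively from that point.
    """
    for i, event in enumerate(events):
        state = step(state, event)
        if breakpoint(state):
            return state, events[i + 1:]
    return state, []


# Hypothetical radar model: (state, event) -> next state.
RADAR = {
    ("searching", "target_detected"): "tracking",
    ("tracking", "lock_acquired"): "locked",
    ("locked", "lock_lost"): "searching",
}


def radar_step(state, event):
    """Advance the radar model; events with no matching transition are ignored."""
    return RADAR.get((state, event), state)


batch = ["target_detected", "lock_acquired", "lock_lost", "target_detected"]
state, remaining = replay(batch, radar_step, "searching", lambda s: s == "locked")
print(state, remaining)
```

Execution stops as soon as the model enters the locked state, and the unconsumed events are handed back so that the engineer can inspect the model and then resume or change the scenario.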

The use of scenarios and simulations also enables the engineer to gather meaningful statistics about the operation of the system that is to be built. For example, we might want to know how many times, in a typical flight of the aircraft, the radar loses a locked-on target. Since it might be difficult for the engineer to put together a single, all-encompassing flight scenario, a programmed simulation can be developed using accumulated results from other scenarios to obtain average-case statistics. A simulation control program generates random events according to predefined probabilities. Thus, events that occur very rarely (say, seat ejection in a fighter aircraft) can be assigned very low probabilities, while others are assigned higher probabilities, and the random selection of events thus becomes realistic. In order to be able to gather the desired statistics, we insert appropriate breakpoints in the SCL program.
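A programmed simulation of this kind might look like the following sketch. The event names and probabilities are invented for illustration and are not drawn from any real avionics system; the counter inside the loop plays the role of the statistics-gathering breakpoint.

```python
import random

# Hypothetical per-step event probabilities; rare events get small weights.
EVENT_WEIGHTS = {
    "lock_acquired": 0.30,
    "lock_lost": 0.10,
    "no_event": 0.5999,
    "seat_ejection": 0.0001,
}


def simulate_flight(steps, rng):
    """Count how often the radar loses a lock during one simulated flight."""
    names = list(EVENT_WEIGHTS)
    weights = list(EVENT_WEIGHTS.values())
    locked, losses = False, 0
    for event in rng.choices(names, weights=weights, k=steps):
        if event == "seat_ejection":
            break  # the flight ends early
        if event == "lock_acquired":
            locked = True
        elif event == "lock_lost" and locked:
            locked, losses = False, losses + 1
    return losses


def average_losses(flights, steps, seed=0):
    """Average lock losses per flight over many simulated flights."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return sum(simulate_flight(steps, rng) for _ in range(flights)) / flights


print(average_losses(flights=1000, steps=500))
```

Seeding the generator is a design choice rather than a requirement: it makes a surprising statistic reproducible, so the scenario that produced it can be replayed and examined.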

Automatic Translation Into Code

Once the system model has been built, it can be translated in its entirety into executable code using a prototyping function. Activity-charts and their controlling statecharts can be translated into a high-level programming language, such as Ada or C. Today, the primary use of the resulting code is to observe a system perform under circumstances that are as close to the real world as possible. For example, the prototype code can be executed in a full-fledged simulator of the target environment or in the final environment itself. The code produced by such CASE tools should be considered to be "prototypical." It is not production or final code. Consequently, it might not always reflect accurate real-time performance of the intended system. Nevertheless, it is useful for testing the system's performance in close to real circumstances.

1.4 Real-Time Design

The design of real-time software must incorporate all of the fundamental concepts and principles (Chapter 13) associated with high quality software. In addition, real-time software poses a set of unique problems for the designer:

• representation of interrupts and context switching;

• concurrency as manifested by multi-tasking and multi-processing;

• intertask communication and synchronization;

• wide variations in data and communication rates;

• representation of timing constraints;

• asynchronous processing;

• necessary and unavoidable coupling with operating systems, hardware, and other external system elements.

Before considering some of these problems, it is worthwhile to address a set of specialized design principles that are particularly relevant during the design of real-time systems. Kurki-Suonio [KUR93] discusses the design model for real-time ("reactive") software:

All reasoning, whether formal or intuitive, is performed with some abstraction. Therefore, it is important to understand which kinds of properties are expressible in the abstraction in question. In connection with reactive systems, this is emphasized by the more stringent need for formal methods, and by the fact that no general consensus has been reached about the models that should be used. Rigorous formalisms for reactive systems range from process algebras and temporal logics to concrete state-based models and Petri nets, and different schools keep arguing about their relative merits.

He then defines a number of "modeling principles" that should be considered in the design of real-time software [KUR93]:

Explicit atomicity. It is necessary to define "atomic actions" explicitly as part of the real-time design model. An atomic action or event is a well constrained and limited function that can be executed by a single task or executed concurrently by several tasks. An atomic action is invoked only by those tasks ("participants") that require it and the results of its execution affect only those participants; no other parts of the system are affected.

Interleaving. Although processing can be concurrent, the history of some computation should be characterized in a way that can be obtained by a linear sequence of actions. Starting with an initial state, a first action is enabled and executed. As a result of this action, the state is modified and a second action occurs. Because several actions can occur in any given state, different results (histories) can be spawned from the same initial state. "This nondeterminism is essential in interleaved modeling of concurrency." [KUR93]
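The interleaving principle can be made concrete with a small sketch. The two tasks and their actions are hypothetical; the point is only that each history is a linear sequence of atomic actions, and that different histories starting from the same initial state can yield different results.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(history, x=0):
    """Apply a linear history of atomic actions to the shared state x."""
    for action in history:
        x = action(x)
    return x


# Two hypothetical tasks, each a sequence of atomic actions on one integer.
task_a = [lambda x: x + 1, lambda x: x * 2]
task_b = [lambda x: x + 10]

results = sorted({run(h) for h in interleavings(task_a, task_b)})
print(results)  # [12, 22]
```

Three histories are possible here, and they produce two distinct final states from the same initial state of zero. A design model must either accept both outcomes or constrain the order in which the actions are enabled.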

Nonterminating histories and fairness. The processing history of a reactive system is assumed to be infinite. By this we mean that processing continues indefinitely or "stutters" until some event causes it to continue processing. Fairness requirements prevent a system from stopping at some arbitrary point.

Closed system principle. A design model of a real-time system should encompass the software and the environment in which the software resides. "Actions can therefore be partitioned into those for which the system itself is responsible, and to those that are assumed to be executed by the environment." [KUR93]

Structuring of state. A real-time system can be modeled as a set of objects each of which has a state of its own.

The software engineer should consider each of the concepts noted above as the design of a real-time system evolves.

Over the past two decades, a number of real-time software design methods have been proposed to grapple with some or all of the problems noted above. Some approaches to real-time design extend the design methods discussed in Chapters 14 and 21 (e.g., data flow [WAR85] [HAT87], data structure [JAC83], or object-oriented [LEV90] methods). Others introduce an entirely separate approach, using finite state machine models or message passing systems [WIT85], Petri nets [VID83], or a specialized language [STE84] as a basis for design. A comprehensive discussion of software design for real-time systems is beyond the scope of this book. For further details, the reader should refer to [LEV90], [SHU92], [SEL94], and [GOM95].

1.6 Summary

The design of real-time software encompasses all aspects of conventional software design while at the same time introducing a new set of design criteria and concerns. Because real-time software must respond to real-world events in a time frame dictated by those events, all classes of design (architectural, procedural and data design) become more complex.

It is difficult, and often impractical, to divorce software design from larger system-oriented issues. Because real-time software is either clock or event driven, the designer must consider function and performance of hardware and software. Interrupt processing and data transfer rate, distributed data bases and operating systems, specialized programming languages and synchronization methods are just some of the concerns of the real-time system designer.

The analysis of real-time systems encompasses both mathematical modeling and simulation. Queuing and network models enable the system engineer to assess overall response time, processing rate and other timing and sizing issues. Formal analysis tools provide a mechanism for real-time system simulation.
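As one illustration of such a queuing model, the sketch below computes steady-state figures for a single-server M/M/1 queue. The choice of M/M/1 (Poisson arrivals, exponentially distributed service times, one server) is an assumption made here for simplicity; real systems often need richer models, and the function name is hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 metrics; both rates are in events per second."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    utilization = arrival_rate / service_rate
    mean_response = 1.0 / (service_rate - arrival_rate)  # mean time in system
    mean_in_system = utilization / (1.0 - utilization)   # equals arrival_rate * mean_response
    return utilization, mean_response, mean_in_system


# Example: 80 events/s arriving at a task that can service 100 events/s.
rho, w, n = mm1_metrics(80.0, 100.0)
print(rho, w, n)
```

With these figures the server is 80 percent utilized and the mean response time is about 0.05 seconds, five times the 0.01 second mean service time. This is the kind of sizing result that is difficult to guess without a model.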

Software design for real-time systems can be predicated on a conventional design methodology that extends data flow-oriented or object-oriented design by providing a notation and approach that addresses real-time system characteristics. Alternatively, design methods that make use of unique notation or specialized languages can also be applied.

Software design for real-time systems remains a challenge. Progress has been made; methods do exist, but a realistic assessment of the state-of-the-art suggests much remains to be done.



[ELL94] Ellison, K.S., Developing Real-Time Embedded Software in a Market Driven Company, Wiley, 1994.

[EVE95] Everett, W.W., "Reliability and Safety of Real-Time Systems," Computer, May, 1995, pp. 13-16.

[GLA83] Glass, R.L., Real-Time Software, Prentice-Hall, 1983.

[GOM95] Gomaa, H., Software Design Methods for Concurrent and Real-Time Systems, Addison-Wesley, 1995.

[GRO85] Gross, D. and C.M. Harris, Fundamentals of Queuing Theory, second edition, Wiley, 1985.

[HAN73] Brinch Hansen, P., "Concurrent Programming Concepts," Computing Surveys, vol. 5, no. 4, December, 1973, pp. 223-245.

[HAR88] Harel, D., "On Visual Formalisms," Communications of the ACM, vol. 31, no. 5, May, 1988, pp. 514-530.

[HAR90] Harel, D. et al, "STATEMATE: A Working Environment for the Development of Complex Reactive Systems," IEEE Trans. Software Engineering, vol. 16, no. 3, April, 1990, pp. 403-414.

[HAR92] Harel, D., "Biting the Bullet: Toward a Brighter Future for System Development," Computer, January 1992, pp. 8-24.

[HAT87] Hatley, D.J. and I.A. Pirbhai, Strategies for Real-Time System Specification, Dorset House, 1987.

[HIN83] Hinden, H.J. and W.B. Rauch-Hinden, "Real-Time Systems," Electronic Design, January 6, 1983, pp. 288-311.

[ILO89] "The Statemate Approach to Complex Systems," I-Logix Inc., Burlington, MA, 1989.

[JAC83] Jackson, M., System Development, Van Nostrand-Reinhold, 1983.

[KLI75] Kleinrock, L., Queueing Systems, Volume 1: Theory, Wiley, 1975.

[KUN85] Kung, A. and R. Kung, "GALAXY: A Distributed Real-time Operating System Supporting High Availability," Proc. Real-Time Systems Symposium, IEEE, December, 1985, pp. 79-87.

[KUR93] Kurki-Suonio, R., "Stepwise Design of Real-time Systems," IEEE Trans. Software Engineering, vol. 19, no. 1, January, 1993, pp. 56-69.

[LEV90] Levi, S.T. and A.K. Agrawala, Real-Time System Design, McGraw-Hill, 1990.

[LIU90] Liu, L.Y. and R.K. Shyamasundar, "Static Analysis of Real-Time Distributed Systems," IEEE Trans. Software Engineering, vol. 16, no. 3, April, 1990, pp. 373-388.

[MCC85] McCabe, T.J., et al, "Structured Real-Time Analysis and Design," COMPSAC-85, IEEE, October, 1985, pp. 40-51.

[SAV85] Savitsky, S., Real-Time Microprocessor Systems, Van Nostrand-Reinhold, 1985.

[SEL94] Selic, B., G. Gullekson, and P. Ward, Real-Time Object-Oriented Modeling, Wiley, 1994.

[SHU92] Shumate, K. and M. Keller, Software Specification and Design–A Disciplined Approach For Real-Time Systems, Wiley, 1992.

[STE84] Steusloff, H.U., "Advanced Real-Time Languages for Distributed Industrial Process Control," Computer, vol. 17, no. 2, February, 1984, pp. 37-46.

[WAR85] Ward, P.T. and S.J. Mellor, Structured Development for Real-Time Systems, 3 volumes, Yourdon Press, 1985, 1986.

[WIL90] Wilson, R.G. and B.H. Krogh, "Petri Net Tools for the Specification and Analysis of Discrete Controllers," IEEE Trans. Software Engineering, vol. 16, no. 1, January, 1990, pp. 39-50.

[WIT85] Witt, B.I., "Communicating Modules: A Software Design Model for Concurrent Distributed Systems," IEEE Computer, vol. 18, no. 1, January, 1985, pp. 67-77.

[WOO90] Wood, M. and T. Barrett, "A Real-Time Primer," Embedded Systems Programming, vol. 3, no. 2, February, 1990, pp. 20-28.

[ZUC89] Zucconi, L., "Techniques and Experiences in Capturing Requirements for Real-Time Systems," ACM Software Engineering Notes, vol. 14, no. 6, October, 1989, pp. 51-55.


Hatley and Pirbhai [HAT87] and Ward and Mellor (Structured Development for Real-Time Systems, Yourdon Press, 1986) remain the most widely used books for analysis and design of real-time systems. Mattai (Real Time Systems, Prentice-Hall, 1996) addresses program structures, timing analysis using scheduling theory, and the specification and verification of real-time systems. Cooling (Software Design for Real-Time Systems, Thomsen Publishing, 1996) considers the application of formal specification methods for time dependent applications. Ellison (Developing Real-Time Embedded Software in a Market Driven Company, Wiley, 1994) considers both management and technical aspects of real-time development.

Heath (Real-Time Software Techniques, Van Nostrand-Reinhold, 1991) focuses on implementation issues for the design and development of real-time machine control software. Books by Shumate and Keller (Software Specification and Design–A Disciplined Approach For Real-Time Systems, Wiley, 1992) and Braek and Oystein (Engineering Real Time Systems, Prentice Hall, 1993) provide a wealth of information on both analysis and design modeling for real-time software. Klein (A Practitioner's Handbook for Real-Time Analysis: Guide to Rate Monotonic Analysis for Real-Time Systems, Kluwer Academic Publishers, 1994) addresses the detailed mathematical analysis required to predict the timing behavior of many real-time systems. Mahar et al (Object-Oriented Technology for Real-Time Systems, Prentice-Hall, 1996) and Levi and Agrawala [LEV90] consider real-time systems from the object technologies perspective.

Van Tilborg and Koob (Foundations of Real-time Systems, Kluwer Academic Publishers, 1992) and Krishna and Lee (Readings in Real-time Systems, IEEE Computer Society Press, 1993) have each edited excellent tutorials on real-time systems. Schiebe (Real-Time Systems Engineering and Applications, Kluwer Academic Publishers, 1994) has edited an anthology that addresses the engineering methods required for real-time hardware and software.