The amount of work waiting to be completed but delayed due to unavailable resources is known as:

Performance

Jerome H. Saltzer, M. Frans Kaashoek, in Principles of Computer System Design, 2009

6.3.3.4 Priority Scheduling

Some jobs are more important than others. For example, a system thread that performs minor housekeeping chores such as garbage collecting unused temporary files might be given lower priority than a thread that runs a user program. In addition, if a thread has been blocked for a long time, it might be better to give it higher priority than threads that have run recently.

A scheduler can implement such policies using a priority scheduling policy, which assigns each job a priority number. The dispatcher selects the job with the highest priority number. The scheduler must have some rule to break ties, but it doesn't matter much what the rule is, as long as it doesn't consistently favor one job over another.

A scheduler can assign priority numbers in many different ways. The scheduler could use a predefined assignment (e.g., system jobs have priority 1 and user jobs have priority 0), or the priority could be computed using a policy function provided by the system designer. Or the scheduler could compute priorities dynamically. For example, if a thread has been waiting to run for a long time, the scheduler could temporarily boost the priority number of the thread's job. This approach can be used, for example, to avoid the starvation problem of the shortest-job-first policy.
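To make the mechanics concrete, here is a minimal sketch in C of a dispatcher that picks the highest-priority runnable job and applies a waiting-time boost. The job_t structure, the boost rate, and pick_next() are illustrative assumptions, not taken from the book.

/* Minimal sketch of a priority dispatcher with a dynamic boost.
 * The job_t structure and pick_next() helper are illustrative only. */
#include <stddef.h>

typedef struct {
    int priority;     /* base priority number; larger means more important */
    long wait_ticks;  /* how long the job has been waiting to run */
    int runnable;
} job_t;

/* Effective priority: boost jobs that have waited a long time so that a
 * fixed-priority or shortest-job-first policy cannot starve them. */
static int effective_priority(const job_t *j)
{
    return j->priority + (int)(j->wait_ticks / 100);  /* boost rate is arbitrary */
}

/* Select the runnable job with the highest effective priority. Ties are
 * broken by taking the first one found; any rule works as long as it does
 * not consistently favor one job over another. */
job_t *pick_next(job_t *jobs, size_t n)
{
    job_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!jobs[i].runnable)
            continue;
        if (best == NULL || effective_priority(&jobs[i]) > effective_priority(best))
            best = &jobs[i];
    }
    return best;
}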

A priority scheduler may be preemptive or non-preemptive. In the preemptive version, when a high-priority job enters while a low-priority job is running, the scheduler may preempt the low-priority job and start the high-priority job immediately. For example, an interrupt may notify a high-priority thread. When the interrupt handler calls notify, a preemptive thread manager may run the scheduler, which may interrupt some other processor that is running a low-priority job. The non-preemptive version would not do any rescheduling or preemption at interrupt time, so the low-priority job would run to completion; when it calls await, the scheduler will switch to the newly runnable high-priority job.

As we make schedulers more sophisticated, we have to be on the alert for surprising interactions among different schedulers. For example, if a thread manager that provides priorities isn't carefully designed, it is possible that the highest priority thread obtains the least amount of processor time. Sidebar 6.8, which explains priority inversion, describes this pitfall.

Sidebar 6.8

Priority Inversion

Priority inversion is a common pitfall in designing a scheduler with priorities. Consider a thread manager that implements a preemptive, priority scheduling policy. Let's assume we have three threads, T1, T2, and T3, and threads T1 and T3 share a lock l that serializes references to a shared resource. Thread T1 has a low priority (1), thread T2 has a medium priority (2), and thread T3 has a high priority (3).

The following timing diagram shows a sequence of events that causes the high-priority thread T3 to be delayed indefinitely while the medium priority thread T2 receives the processor continuously.

Let's assume that T2 and T3 are not runnable; for example, they are waiting for an I/O operation to complete. The scheduler will schedule T1, and T1 acquires lock l. Now the I/O operation completes, and the I/O interrupt handler notifies T2 and T3. The scheduler chooses T3 because it has the highest priority. T3 runs for a short time until it tries to acquire lock l, but because T1 already holds that lock, T3 must wait. Because T2 is runnable and has higher priority than T1, the thread scheduler will select T2. T2 can compute indefinitely; when T2's time quantum runs out, the scheduler will find two threads runnable: T1 and T2. It will select T2 because T2 has a higher priority than T1. As long as T2 doesn't call wait, T2 will keep the processor. As long as T2 is runnable, the scheduler won't run T1, and thus T1 will not be able to release the lock and T3, the high priority thread, will wait indefinitely. This undesirable phenomenon is known as priority inversion.

The solution to this specific example is simple. When T3 blocks on acquiring lock l, it should temporarily lend its priority to the holder of the lock (sometimes called priority inheritance)—in this case, T1. With this solution, T1 will run instead of T2, and as soon as T1 releases the lock its priority will return to its normal low value and T3 will run. In essence, this example is one of interacting schedulers. The thread manager schedules the processor and locks schedule references to shared resources. A challenge in designing computer systems is recognizing schedulers and understanding the interactions between them.
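On systems that expose it, the priority-inheritance fix described above can be requested when the lock is created. The following is a minimal sketch using the POSIX PTHREAD_PRIO_INHERIT protocol; it illustrates the idea, is not the book's code, and omits error handling.

/* Minimal sketch: a mutex whose holder inherits the priority of any
 * higher-priority thread that blocks on it. */
#include <pthread.h>

pthread_mutex_t l;

void init_shared_lock(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* When T3 blocks on l, the current holder (T1) temporarily runs at
     * T3's priority, so a medium-priority T2 can no longer starve it. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&l, &attr);
    pthread_mutexattr_destroy(&attr);
}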

The problem and solution have been “discovered” by researchers in the real-time systems, database, and operating system communities, and are well documented by now. Nevertheless, it is easy to fall into the priority inversion pitfall. For example, in July 1997 the Mars Pathfinder spacecraft experienced total system resets on Mars, which resulted in the loss of collected experimental data. The software engineers traced the cause of the resets to a priority inversion problem*.

URL: https://www.sciencedirect.com/science/article/pii/B9780123749574000153

Real-Time Operating Systems for DSP

Robert Oshana, in DSP Software Development Techniques for Embedded and Real-Time Systems, 2006

Example: priority inversion

An example of priority inversion is shown in Figure 8.27. Task_low begins executing and requires the use of a critical section. While it is in the critical section, a higher-priority task, Task_high, preempts it and begins its execution. During execution, Task_high requires the use of the same critical resource. Since the resource is already owned by the lower-priority task, Task_high must block, waiting for the lower-priority task to release the resource. Task_low resumes execution, only to be preempted by a medium-priority task, Task_med. Task_med does not require the use of the critical resource and executes to completion. Task_low then resumes execution, finishes its use of the critical resource, and is immediately (actually, on the next scheduling interval) preempted by the higher-priority task, which uses the critical resource, completes execution, and relinquishes control back to the lower-priority task to complete. Priority inversion occurs while the higher-priority task is waiting for the lower-priority task to release the critical resource.

Figure 8.27. Example of priority inversion

This type of priority inversion can be unbounded. The example showed a medium-priority task preempting a lower-priority task that was executing in a critical section and running to completion because it did not require the critical resource. If there are many medium-priority tasks that do not require the critical resource, they can all preempt the lower-priority task (while the high-priority task is still blocked) and execute to completion. The amount of time the high-priority task may have to wait in scenarios like this can become unbounded (Figure 8.28).

Figure 8.28. Unbounded priority inversion

URL: https://www.sciencedirect.com/science/article/pii/B9780750677592500107

Real-Time Operating Systems

Colin Walls, in Embedded Software (Second Edition), 2012

7.7.4 Implications for Operating Systems and Applications

The discussion of scheduling algorithms so far has centered upon a fairly high-level view of the application software, and no particular details of the runtime environment have been assumed. For example, we have not indicated whether an operating system supports the application.

But, just as the time behavior of the application must be understood to select the right scheduling algorithm, if we do employ an operating system, its own time-dependent behavior also enters into the equation. For example, if the underlying operations of the operating system are unpredictable because of some memory allocation/deallocation strategy, it may well be impossible to reliably schedule the application. In other words, the operating system behavior violates the predictability required by the scheduling algorithm we hoped to use.

It turns out that the problem of operating system interference is quite real, even with an operating system that claims to be real time. This interference can occur within the operating system itself, or be induced at the application level by an inadequate capability in the system call set of the operating system.

There is a phenomenon that can occur within an application, called priority inversion, which serves as a good example of how a lack of care in developing an application may compromise the use of the powerful scheduling algorithms we discussed previously.

The Priority Inversion Problem

The conditions under which priority inversion can occur are quite commonly found in both real-time applications and within operating systems. We will discuss here the occurrence of priority inversion at the application level. The conditions are:

Concurrent tasks in the system share a resource that is protected by a blocking synchronization primitive such as a semaphore or exchange.

At least one intermediate priority task exists between any two tasks that share a resource.

The following scenario illustrates how priority inversion occurs (see Figure 7.11):

Figure 7.11. Priority inversion

1.

A low-priority task makes a memory allocation call, which in turn uses a semaphore to protect a shared data structure (see part A in Figure 7.11). It takes the semaphore and enters the critical section (illustrated in part B of Figure 7.11).

2.

An interrupt occurs that enables a high-priority task (part C). The kernel switches control to the higher-priority task (part D).

3.

The high-priority task now makes a memory allocation call (part E) and attempts to enter the critical section. Since the low-priority task currently holds the semaphore, the higher-priority task is suspended (blocked) on the semaphore, and the low-priority task runs again (part F).

4.

An interrupt occurs (part G), and a medium-priority task becomes ready to run. The medium-priority task runs (part H) because it is higher in priority than the lower-priority task holding the semaphore and because the high-priority task is suspended on the semaphore.

5.

At this point, all tasks of a priority higher than the low-priority task that become ready to run will do so at the expense of the low-priority task (which is how it should be), but they also run at the expense of the high-priority task, which is held up as long as the low-priority task is held up.

In effect, the high-priority task’s priority has become lowered or “inverted” to match that of the low-priority task—hence the term priority inversion.

If the operating system does not offer any specific facilities to address this problem (priority inheritance), there are three possible solutions:

Avoid sharing resources between tasks of differing priorities.

If the operating system allows, turn off preemption of the low-priority task for the critical section during which it holds the semaphore. (This, of course, inhibits multi-threading, which is normally undesirable.)

The low-priority task could raise its own priority while in the critical section (priority inheritance done “by hand”); a minimal sketch of this approach follows.
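As a rough illustration of the third option, assuming a POSIX environment with a real-time scheduling policy; the helper name, the shared_sem lock, and the priority values are hypothetical.

/* Sketch of option 3: the low-priority task raises its own priority for
 * the duration of the critical section and restores it afterwards. */
#include <pthread.h>

extern pthread_mutex_t shared_sem;   /* protects the shared data structure */

void update_shared_data(int ceiling_priority, int normal_priority)
{
    /* Run at (at least) the priority of the highest task that can contend
     * for the resource, so no medium-priority task can preempt us here. */
    pthread_setschedprio(pthread_self(), ceiling_priority);

    pthread_mutex_lock(&shared_sem);
    /* ... critical section: touch the shared data structure ... */
    pthread_mutex_unlock(&shared_sem);

    pthread_setschedprio(pthread_self(), normal_priority);
}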

URL: https://www.sciencedirect.com/science/article/pii/B9780124158221000076

Distributed Information Resources

Douglas C. Schmidt, ... Chris Cleeland, in Advances in Computers, 1999

5.1.2 Non-multiplexed Connection Architectures

One technique for minimizing ORB Core priority inversion is to use a non-multiplexed connection architecture, such as the one shown in Fig. 17. In this connection architecture, each client thread maintains a table of preestablished connections to servers in thread-specific storage [66]. A separate connection is maintained in each thread for every priority level, e.g., P1, P2, P3, etc. As a result, when a two-way operation is invoked (1), it shares no socket endpoints with other threads. Therefore, the write (2), select (3), read (4), and return (5) operations can occur without contending for ORB resources with other threads in the process.

Fig. 17. Non-multiplexed connection architecture.

The primary benefit of a non-multiplexed connection architecture is that it preserves end-to-end priorities and minimizes priority inversion while sending requests through ORB endsystems. In addition, since connections are not shared, this design incurs low synchronization overhead because no additional locks are required in the ORB Core when sending/receiving two-way requests.

The drawback with a non-multiplexed connection architecture is that it can use a larger number of socket endpoints than the multiplexed connection model, which may increase the ORB endsystem memory footprint. Therefore, it is most effective when used for statically configured real-time applications, such as avionics mission computing systems [62], which possess a small, fixed number of connections.

URL: https://www.sciencedirect.com/science/article/pii/S0065245808600182

Integrating Schedulability Analysis and SDL in an Object-Oriented Methodology for Embedded Real-time Systems

J.M. Alvarez, ... J.M. Troya, in SDL '99, 1999

2.2.2 Sharing Resources

As analyzed in Section 2.1, resource sharing in SDL can lead to priority inversion. These situations can be avoided by using the execution model described above, but data and resource sharing can then be very inefficient and difficult to analyze, since it may involve several message exchanges and process context switches. In our model, shared data and resources are encapsulated into a special kind of process. These processes are, externally, normal SDL processes, but their behavior is limited as follows:

They act as passive server processes, i.e. they do not initiate any action by themselves.

They use only RPC as a communication mechanism, i.e., they are always waiting to receive RPCs from other processes.

Blocking during transition execution must be bounded.

Each of these processes is assigned a priority ceiling, that is, the maximum of the priorities among all the other processes' transitions where the resource is accessed. In this way we avoid possible priority inversion in data access, and the blocking time in shared data access is predictable.

Mutual exclusion is also guaranteed, since all the process transitions will be executed at the highest priority among all the processes that share the resource. Figure 2 shows an example of how a reader and a writer process access shared data using this schema. The priority ceiling of the SharedData process will be 3, since it is accessed from two different transitions with priorities 2 and 3, respectively.

Figure 2. Resource sharing example

In addition, using this schema for modeling data sharing has another important advantage: it can be implemented very efficiently. Although in the SDL model the data are encapsulated in a process, this is not really translated into a real process in the implementation. Each of these processes can be mapped into a set of procedures, one for each transition, inside a module in the target language. These procedures will be called by the processes that share the data after changing their priority to the priority ceiling of the resource.

URL: https://www.sciencedirect.com/science/article/pii/B9780444502285500175

Processes and Operating Systems

Marilyn Wolf, in Computers as Components (Fourth Edition), 2017

6.5.5 Priority inversion

Shared resources cause a new and subtle scheduling problem: a low-priority process blocks execution of a higher-priority process by keeping hold of its resource, a phenomenon known as priority inversion. Example 6.5 illustrates the problem.

Example 6.5 Priority Inversion

Consider a system with three processes: P1 has the highest priority, P3 the lowest, and P2 a priority between those of P1 and P3. P1 and P3 both use the same shared resource. Processes become ready in this order:

P3 becomes ready and enters its critical region, reserving the shared resource.

P2 becomes ready and preempts P3.

P1 becomes ready. It will preempt P2 and start to run but only until it reaches its critical section for the shared resource. At that point, it will stop executing.

For P1 to continue, P2 must completely finish, allowing P3 to resume and finish its critical section. Only when P3 is finished with its critical section can P1 resume.

Priority inheritance

The most common method for dealing with priority inversion is priority inheritance: promote the priority of any process when it requests a resource from the operating system. The priority of the process temporarily becomes higher than that of any other process that may use the resource. This ensures that the process will continue executing once it has the resource so that it can finish its work with the resource, return it to the operating system, and allow other processes to use it. Once the process is finished with the resource, its priority is demoted to its normal value.

URL: https://www.sciencedirect.com/science/article/pii/B9780128053874000066

Resource Kernels: A Resource-Centric Approach to Real-Time and Multimedia Systems

Raj Rajkumar, ... Shuichi Oikawa, in Readings in Multimedia Computing and Networking, 2002

2.5. Implicit Resource Parameter

If various reservations were strictly independent and have no interactions, then the explicit resource parameters would suffice. However, shared resources like buffers, critical sections, windowing systems, filesystems, protocol stacks, etc. are unavoidable in practical systems. When reservations interact, the possibility of “priority inversion” arises. A complete family of priority inheritance protocols [31] is known to address this problem. All these protocols share a common parameter B referred to as the blocking factor. It represents the maximum (desirably bounded) time that a reservation instance must wait for lower priority reservations while executing. If its B is unbounded, a reservation cannot meet its deadline. The resource kernel, therefore, implicitly derives, tracks and enforces the implicit B parameter for each reservation in the system. Priority (or reservation) inheritance is applied when a reservation blocks, waiting for a lower priority reservation to release (say) a lock. As we shall see in Section 4.5, this implicit parameter B can also be used to deliberately introduce priority inversion in a controlled fashion to achieve other optimizations.

URL: https://www.sciencedirect.com/science/article/pii/B9781558606517501273

Operating Systems Overview

Peter Barry, Patrick Crowley, in Modern Embedded Computing, 2012

Mutual Exclusion/Synchronization

In many cases it is critical to ensure serialized atomic access to resources such as data structures and/or physical device resources. There are many mechanisms to ensure mutually exclusive access to a resource. The first is to serialize access to the area where atomic updates must occur. An example of a critical section is when the execution context updates a counter in memory or performs pointer updates associated with insertion/removal of a node in a linked list.

When an execution context is performing the atomic update to a resource, it must prevent execution of any other context that might update the resource.

The simplest mechanism is to prevent the scheduling of any other execution context while the update is occurring. In many simple embedded systems, this was often carried out by disabling interrupts. This is not recommended, as it perturbs real-time behavior, interacts with device drivers, and may not work with multi-core systems. Disabling interrupts effectively prevents an operating system–triggered rescheduling of the task running in the critical section. The task must also avoid any system calls that would trigger execution of the OS scheduler. There may also be an operating system call to suspend scheduling of other tasks. Such mechanisms are blunt instruments to ensure mutual exclusion, as they may have a broad system wide impact. These techniques also do not work when being used by user- or de-privileged contexts such as a POSIX thread.

In many systems a hardware mechanism is provided to perform an atomic update in system memory. Processor architectures provide instructions to build mutual exclusion primitives, such as Compare and Swap (CAS) and Compare and Exchange (CMPXCHG) in Intel Architecture and load-link/store-conditional instructions on PowerPC, ARM, and MIPS architectures. These are sometimes referred to as atomic test-and-set operations. As an example, we describe how the CAS instruction can be used to perform an atomic update from two execution contexts. The CAS instruction compares the memory location with a given value. If the value at the memory location is the same as the given value, then the memory is updated with the new value given. Consider two execution contexts (A and B) attempting to update a counter: context A reads the current value, increments the current value by one to create the new value, then issues a CAS with the current and new values. If context B has not intervened, the update to the new value will occur, as the current value in memory is the same as the current value issued in the CAS. If, however, context B reads the same current value as context A and issues the CAS instruction before context A, then the update by context A will fail, as the value in memory is no longer the same as the current value issued in the CAS by context A. This collision can be detected because the CAS instruction returns the value of the memory; if the update fails, context A must retry the loop and will most likely succeed the second time. In the case described above, context A may be interrupted by an interrupt, and context B may run in the interrupt handler or from a different context scheduled by the operating system. Alternatively, the system may have multiple CPUs, with the other CPU performing an update at exactly the same time.
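The retry loop described above can be written portably with C11 atomics, where atomic_compare_exchange_weak plays the role of the CAS instruction. This is a minimal sketch, not the text's example.

/* Sketch of the CAS retry loop using C11 atomics. */
#include <stdatomic.h>

atomic_int counter;

void atomic_increment(void)
{
    int current = atomic_load(&counter);
    int desired;

    do {
        desired = current + 1;
        /* If another context (B) updated the counter since we read it,
         * the exchange fails, 'current' is reloaded with the new value,
         * and we retry; this usually succeeds on the second attempt. */
    } while (!atomic_compare_exchange_weak(&counter, &current, desired));
}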

Using a technique similar to the one described above, we can implement a construct known as a spin lock. A spin lock is one where a thread waits to get a lock by repeatedly trying to acquire it. The calling context busy-waits until it acquires the lock. This mechanism works well when the time for which a context owns a lock is very small. However, care must be taken to avoid deadlock. For example, if a kernel task has acquired the spin lock but an interrupt handler wishes to acquire the lock, you could have a scenario where the interrupt handler spins forever waiting for the lock. Spin locks in embedded systems can also create delays due to priority inversion, so it is good practice to use mutexes where possible.
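For completeness, here is a minimal spin-lock sketch built on the C11 atomic_flag test-and-set primitive; as noted above, it is only appropriate for very short critical sections and is not the text's code.

/* Minimal spin lock sketch using C11 atomic_flag as the test-and-set
 * primitive. */
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

void spin_lock(void)
{
    /* Busy-wait until the previous value was clear, i.e., we won the lock. */
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ;  /* spin */
}

void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);
}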

In reality, building robust multi-CPU mutual exclusion primitives is quite involved. Many papers have been written on efficient use of such resources to update data structures such as binary counters, counters, and linked lists. Techniques can be varied depending on the balance of consumer versus producer contexts (number of readers and writers).

The mechanism described above does not interact with the operating system scheduling behavior and the mechanism is not appropriate if the work to be performed in the critical section is long or if operating system calls are required. Operating systems provide specific calls to facilitate mutual exclusion between threads/tasks. Common capabilities provided include semaphores (invented by Edsger W. Dijkstra), mutexes, and message queues.

These functions often rely on underlying hardware capabilities as described above and interact with the operating system scheduler to manage the state transitions of the calling tasks.

Let’s describe a time sequence of two contexts acquiring a semaphore. Assuming context A has acquired a semaphore and context B makes a call to acquire the same semaphore, the operating system will place task B into a blocked state. Task B will transition to a ready state when task A releases the semaphore. In this case the system has provided the capabilities for the developer to provide mutual exclusion between task A and task B. Figure 7.12 shows the timeline associated with semaphore ownership.

FIGURE 7.12. Semaphore Timeline.

As you can see, the operating system scheduler ensures that only one task is executing in the critical section at any time. Although we show task A continuing to run after the semaphore is released until it is preempted, in many operating systems the call to release the semaphore also performs a scheduling evaluation and the transition to task B (or other task) could occur at that point.

Note

Priority inversion is a situation that can occur when a low-priority task is holding a resource such as a semaphore for which a higher-priority task is waiting. The high-priority task has effectively acquired the priority of the low-priority thread (thus the name priority inversion). Some operating systems automatically increase the priority of the lower-priority thread to that of the highest-priority waiting thread until the resource is no longer owned by the low-priority thread.

In the case of VxWorks, the mutual-exclusion semaphore has the option SEM_INVERSION_SAFE. This option enables a priority inheritance feature. The priority inheritance feature ensures that a task that holds a resource executes at the priority of the highest-priority task blocked on that resource.
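In VxWorks this would look roughly like the sketch below; SEM_Q_PRIORITY and SEM_INVERSION_SAFE are the documented semMCreate options, but the surrounding code is illustrative only and omits error handling.

/* Sketch: a VxWorks mutual-exclusion semaphore with priority inheritance
 * enabled via SEM_INVERSION_SAFE (which requires SEM_Q_PRIORITY queuing). */
#include <vxWorks.h>
#include <semLib.h>

SEM_ID resource_mutex;

void create_resource_mutex(void)
{
    resource_mutex = semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
}

void use_resource(void)
{
    semTake(resource_mutex, WAIT_FOREVER);  /* holder inherits waiter's priority */
    /* ... access the shared resource ... */
    semGive(resource_mutex);
}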

There are two different types of semaphore, namely binary and counting. A binary semaphore is restricted to two values (one or zero), i.e., acquired or available. A counting semaphore increments in value each time the semaphore is acquired; typically the count is incremented each time the semaphore is acquired or reacquired. Depending on the operating system, the semaphore may only be incremented by the task/thread that acquired it in the first place. Generally speaking, a mutex is the same as a binary semaphore. Although we have discussed semaphores in the context of providing mutual exclusion between two threads, in many real-time operating systems a semaphore can also be used to synchronize a task or thread, in essence indicating an event. Figure 7.13 shows a simple case where a task is notified of an event by an interrupt handler (using VxWorks calls).

FIGURE 7.13. Task Synchronization.

As an embedded programmer, you have to be very aware of the services you can use in an interrupt handler. The documentation associated with your RTOS/OS should be consulted; making an illegal OS call in an interrupt handler can result in very strange behavior at runtime.

A common alternative to using mutual exclusion primitives to serialize access to a data structure is to nominate a single entity in the system with responsibility for performing the updates. For example, a database structure can be abstracted by a messaging interface used to send updates and make inquiries. Since only one thread operates on the data at any time, there is no need for synchronization within this thread. This can be a beneficial approach, especially if the software was not originally designed to be multi-thread safe, but it comes at the higher cost of messaging to and from the owning thread. This technique can also be used to create a data structure that is optimized for a single writer and multiple readers. All writes to the data structure are carried out by a single thread that receives messages to perform the updates. Any thread reading the data structure uses API calls to read it directly. This single-writer/multi-reader pattern allows for high-performance locking approaches, and it is a common design pattern. Figure 7.14 shows a simple message queue–based interaction between two tasks.

FIGURE 7.14. Message Queues.

Figure 7.14 uses VxWorks API calls to demonstrate the example. Two noteworthy arguments appear in the calls (in addition to the message-related arguments). The first is the blocking behavior: the example in Figure 7.14 blocks the calling thread forever when the task is waiting for a message; that is, the task is placed in the blocked state when no messages are in the queue and transitions to ready when messages arrive. Similarly, on the msgQSend() call, the sending task may block if the queue is full and the receiving task does not drain the messages. Balancing task priorities and message queue sizes to ensure that the system does not block or exhibit unusual behavior is a task you will have to perform as an embedded software designer using an RTOS.
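A rough sketch of the single-writer, message-queue pattern using the VxWorks message queue API follows; the queue sizing, message layout, and function names are assumptions for illustration, not the code behind Figure 7.14, and error handling is omitted.

/* Sketch: one task owns the data structure and applies updates received
 * through a message queue; other tasks send update requests. */
#include <vxWorks.h>
#include <msgQLib.h>
#include <string.h>

#define MAX_MSGS     10
#define MAX_MSG_LEN  64

MSG_Q_ID updateQ;

void init_update_queue(void)
{
    updateQ = msgQCreate(MAX_MSGS, MAX_MSG_LEN, MSG_Q_FIFO);
}

void owner_task(void)                  /* single writer: owns the data */
{
    char msg[MAX_MSG_LEN];

    for (;;) {
        /* Block forever until an update request arrives. */
        msgQReceive(updateQ, msg, MAX_MSG_LEN, WAIT_FOREVER);
        /* ... apply the update to the data structure ... */
    }
}

void request_update(const char *update)   /* any task requesting an update */
{
    /* May block if the queue is full and the owner is not draining it. */
    msgQSend(updateQ, (char *)update, strlen(update) + 1,
             WAIT_FOREVER, MSG_PRI_NORMAL);
}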

URL: https://www.sciencedirect.com/science/article/pii/B9780123914903000072

Virtual Machines in Middleware

Tammy Noergaard, in Demystifying Embedded Systems Middleware, 2010

6.2.2.2 Embedded VMs and Scheduling

VM mechanisms, such as a scheduler within an embedded VM, are one of the main elements that give the illusion of a single processor simultaneously running multiple tasks or threads (see Figure 6.31). A scheduler is responsible for determining the order and the duration of tasks (or threads) to run on the CPU. The scheduler selects which tasks will be in what states (Ready, Running, or Blocked), as well as loading and saving the information for each task or thread.

Figure 6.31. Interleaving Threads in VMs

There are many scheduling algorithms implemented in embedded VMs, and every design has its strengths and tradeoffs. The key factors that impact the effectiveness and performance of a scheduling algorithm include its response time (the time for the scheduler to make the context switch to a ready task, including the task's waiting time in the ready queue), turnaround time (the time it takes for a process to complete running), overhead (the time and data needed to determine which task will run next), and fairness (the determining factors as to which processes get to run). A scheduler needs to balance utilizing the system's resources – keeping the CPU and I/O as busy as possible – with task throughput, processing as many tasks as possible in a given amount of time. Especially in the case of fairness, the scheduler has to ensure that task starvation, where a task never gets to run, doesn't occur when trying to achieve maximum task throughput.

One of the biggest differentiators between the scheduling algorithms implemented within embedded VMs is whether the algorithm guarantees its tasks will meet execution time deadlines. Thus, it is important to determine whether the embedded VM implements a scheduling algorithm that is non-preemptive or preemptive. In preemptive scheduling, the VM forces a context-switch on a task, whether or not a running task has completed executing or is cooperating with the context switch. Under non-preemptive scheduling, tasks (or threads) are given control of the master CPU until they have finished execution, regardless of the length of time or the importance of the other tasks that are waiting. Non-preemptive algorithms can be riskier to support since an assumption must be made that no one task will execute in an infinite loop, shutting out all other tasks from the master CPU. However, VMs that support non-preemptive algorithms don’t force a context-switch before a task is ready, and the overhead of saving and restoration of accurate task information when switching between tasks that have not finished execution is only an issue if the non-preemptive scheduler implements a cooperative scheduling mechanism.

As shown in Figure 6.32, Jbed contains an earliest-deadline-first (EDF)-based scheduler, where the EDF/Clock Driven algorithm assigns priorities to processes according to three parameters: frequency (the number of times the process is run), deadline (when the process's execution needs to be completed), and duration (the time it takes to execute the process). While the EDF algorithm allows timing constraints to be verified and enforced (essentially guaranteeing deadlines for all tasks), the difficulty lies in defining an exact duration for the various processes. Usually, an average estimate is the best that can be done for each process.
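The selection rule itself is simple to sketch: among the ready tasks, run the one whose absolute deadline is nearest. The C sketch below is illustrative only; the task_t fields are assumptions, not Jbed's API.

/* Illustrative EDF selection: pick the ready task with the earliest
 * absolute deadline. */
#include <stddef.h>

typedef struct {
    long deadline;   /* absolute time by which execution must complete */
    long duration;   /* estimated execution time */
    int  ready;
} task_t;

task_t *edf_pick(task_t *tasks, size_t n)
{
    task_t *next = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].ready && (next == NULL || tasks[i].deadline < next->deadline))
            next = &tasks[i];
    }
    return next;
}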

Figure 6.32. EDF Scheduling in Jbed

Under the Jbed RTOS, all six types of tasks are given the three variables ‘duration’, ‘allowance’, and ‘deadline’ when the task is created, which the EDF scheduler uses to schedule all tasks (see Figure 6.33 for the method call).

Figure 6.33. Jbed Method Call for Scheduling Task1

The Kaffe open source JVM implements a priority-preemptive-based scheme on top of OS native threads, meaning jthreads are scheduled based upon their relative importance to each other and to the system. Every jthread is assigned a priority, which acts as an indicator of its order of precedence within the system. The jthreads with the highest priority always preempt lower-priority jthreads when they want to run, meaning a running jthread can be forced to block by the scheduler if a higher-priority jthread becomes ready to run. Figure 6.34 shows three jthreads (1, 2, 3), where jthread 1 has the lowest priority and jthread 3 the highest; jthread 3 preempts jthread 2, and jthread 2 preempts jthread 1.

Figure 6.34. Kaffe's Priority-preemptive-based Scheduling

As with any VM with a priority-preemptive scheduling scheme, the challenges that need to be addressed by programmers include:

JThread starvation, where a continuous stream of high-priority jthreads keeps lower-priority jthreads from ever running. This is typically resolved by aging lower-priority jthreads (as these jthreads spend more time in the queue, their priority levels are increased).

Priority inversion, where a higher-priority jthread is blocked waiting for a lower-priority jthread to execute, while jthreads with priorities in between are given preference to run; as a result, neither the lower-priority nor the higher-priority jthread runs (see Figure 6.35).

Figure 6.35. Priority Inversion1

How to determine the priorities of various threads. Typically, the more important the thread, the higher the priority it should be assigned. For jthreads that are equally important, one technique that can be used to assign jthread priorities is the Rate Monotonic Scheduling (RMS) scheme, which is also commonly used in relative scheduling scenarios when using embedded OSs. Under RMS, jthreads are assigned a priority based upon how often they execute within the system. The premise behind this model is that, given a preemptive scheduler and a set of jthreads that are completely independent (no shared data or resources) and are run periodically (meaning they run at regular time intervals), the more often a jthread is executed within this set, the higher its priority should be. The RMS Theorem says that if the above assumptions are met for a scheduler and a set of ‘n’ jthreads, all timing deadlines will be met if the inequality Σ Ei/Ti ≤ n(2^(1/n) − 1) is verified, where

i = periodic jthread

n = number of periodic jthreads

Ti = the execution period of jthread i

Ei = the worst-case execution time of jthread i

Ei/Ti = the fraction of CPU time required to execute jthread i.

So, given two jthreads that have been prioritized according to their periods, where the shortest-period jthread has been assigned the highest priority, the ‘n(2^(1/n) − 1)’ portion of the inequality would equal approximately 0.828, meaning the CPU utilization of these jthreads should not exceed about 82.8% in order to meet all hard deadlines. For 100 jthreads that have been prioritized according to their periods, where the shorter-period jthreads have been assigned the higher priorities, the CPU utilization of these tasks should not exceed approximately 69.6% (100 × (2^(1/100) − 1)) in order to meet all deadlines. See Figure 6.36 for additional notes on this type of scheduling model.
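The utilization test is easy to check programmatically. The sketch below computes the bound n(2^(1/n) − 1) and compares it with the total utilization; the example execution times and periods are made up for illustration.

/* Sketch of the RMS schedulability test quoted above: the set is deemed
 * schedulable if sum(Ei/Ti) <= n * (2^(1/n) - 1).
 * For n = 2 the bound is about 0.828; for n = 100, about 0.696. */
#include <math.h>
#include <stdio.h>

int rms_schedulable(const double *E, const double *T, int n)
{
    double utilization = 0.0;
    for (int i = 0; i < n; i++)
        utilization += E[i] / T[i];

    double bound = n * (pow(2.0, 1.0 / n) - 1.0);
    return utilization <= bound;
}

int main(void)
{
    /* Two periodic jthreads: worst-case execution times and periods in ms
     * (example numbers, not from the text). */
    double E[] = { 10.0, 40.0 };
    double T[] = { 50.0, 200.0 };

    printf("schedulable: %s\n", rms_schedulable(E, T, 2) ? "yes" : "no");
    return 0;
}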

Figure 6.36. Note on Scheduling

URL: https://www.sciencedirect.com/science/article/pii/B9780750684552000066

Concurrency and Resource Architecture

Bruce Powel Douglass Ph.D., in Real-Time UML Workshop for Embedded Systems (Second Edition), 2014

9.1 What is the Concurrency and Resource Architecture?

The concurrency and resource architecture is one of the key architectural views identified in Chapter 5. Unlike some of the architectural views, it is almost entirely a software architectural concern. Concurrency is a key aspect of almost any real-time and embedded system because it so directly influences its performance.

Concurrency refers to the simultaneous execution of action sequences. Pseudo-concurrency, more common in software, refers to the simulated concurrency achieved when you execute action sequences on the same single-threaded execution environment by switching between tasks when it is appropriate. The task that is currently executing is said to have focus, and other tasks may be idle (not waiting to run), ready (waiting to run but not having focus), or blocked (waiting for a resource to be released before it can become ready).

A concurrency unit is a sequence of actions executed so that the sequence of actions within the concurrency unit is fully deterministic (although possibly complex due to branching) but the sequence of actions between concurrency units isn’t known except at explicit synchronization points. Concurrency units are often referred to as processes, tasks, or threads. We will treat all concurrency units the same here because the only difference is really one of scope.

Each concurrency unit typically has its own stack and may have its own heap. The really important thing is that the sequence of actions within a concurrency unit is fully known but the sequence of actions between concurrency units is “don’t know, don’t care.” This is the power of tasks – the ability to divide computations or functionality up into sequences in which the relative order of action execution between these sequences is unimportant. It’s like having minions (and who can’t use a few more of those?) that contribute to the overall plan without having to interact with the other minions. Of course, the overall plan has to be designed so that this can happen efficiently and without undue blocking or deadlock.

Tasks have all kinds of properties (collectively referred to as concurrency metadata). Two of the most fundamental are urgency and criticality. Urgency refers to the “time stress” of the task. If you model a task as having a deadline (a point in time at which the completion of the task action sequence becomes irrelevant or incorrect), then the urgency is the nearness of the deadline. The closer the deadline, the more urgent the task. Criticality refers to the importance of the task completion.

Figure 9.1 shows what I mean. The abscissa shows the time axis while the ordinate shows the criticality. The curve represents the utility function; this is the value to the system of the completion of the task’s actions as a function of time. Note that the utility function in this case is a step function. In real life, a task’s utility is usually not a step function but a more general curve; people model the utility as a step function because this makes the math easier.

Figure 9.1. Urgency versus criticality.

Since we typically have multiple tasks ready to run, some criterion must be applied to determine who actually gets to run, if you’re running the set of tasks on the same execution resource (CPU). Scheduling patterns include

Interrupt Driven (first-come, first-served)

A set of tasks is driven entirely by incoming events without oversight. The Good: Simple and highly responsive to incoming events. The Bad: Doesn’t scale to many tasks, event handlers must be short, very easy to lose incoming events by not completing within the time window, difficult to share resources and data.

Cyclic Executive

A set of tasks is executed cyclically, each task running to completion. The Good: Very simple and very predictable. The Bad: Suboptimal in terms of responsiveness to incoming events, difficult to coordinate tasks and share data, all tasks run at the same rate.

Timed Round Robin

A set of tasks is assigned an equal time slice and executes within that time slice. The Good: Simple, starvation-free, and everybody gets to play. The Bad: A single misbehaving task can prevent any other task from executing; not responsive to incoming events.

Priority-Based Preemption

A set of tasks is scheduled on the basis of priority. The Good: Responsive to incoming events, handles high-priority processing at the expense of low priority processing. The Bad: A bit more complex to implement, sharing resources leads to priority inversion, naïve implementation can lead to unbounded priority inversion, analysis is more difficult.

The key to using priority-based preemption is the selection of the priority of the tasks. A task’s priority is an assigned numeric value that is used to determine which, among a set of tasks currently ready to run, will actually execute. The task with the highest priority “wins” and the other tasks remain in the ready queue until the higher priority task completes or blocks.

Priorities can be assigned on the basis of urgency or criticality, or some combination of the two. The most common priority-based scheduling schema is Rate Monotonic Scheduling (RMS). RMS assigns the priority on the basis of the recurrence property metadata. RMS assumes that

Tasks are periodic (i.e., time-based event arrival initiating the task)1 and are characterized by

Frequency (or, reciprocally, a period)

Variation around that frequency (“jitter”)

The task deadline is at the end of its period.

The task is infinitely interruptible (i.e., able to be interrupted at any point in its execution by the start of a higher priority task).

RMS assigns a higher priority to a shorter period. RMS is both optimal and stable. By “optimal” we mean that if you can meet the deadlines for the set of tasks by any other scheduling algorithm, you can also meet them with RMS. By “stable,” we mean that in an overload situation in which some deadlines will be missed, you can predict which deadlines these will be – the lower-priority task deadlines. RMS is inherently urgency based. Priority-based scheduling methods are more difficult to analyze than cyclic executives. For this reason, most avionic systems use cyclic executives in spite of their demonstrably suboptimal performance because of the relative ease with which they can be flight certified. Most real-time operating systems (RTOSes) support all of these scheduling approaches.

Figure 9.2 graphically shows the important concurrency metadata concepts.

Figure 9.2. Concurrency metadata.

A resource is a typically passive element that must be shared among tasks but is constrained in some way. The most common constraint is that it can only be accessed (safely) by a single task at a time – the so-called mutual exclusion problem. Most of the resource sharing patterns2 serialize the access to a resource. The standard ways of doing this are

Critical regions

This approach disables task switching during the access to the resource. The Good: Simple approach solves the mutual exclusion problem. The Bad: Requires that higher-priority tasks cannot preempt the lower-priority task (creating priority inversion) even if the higher-priority task doesn’t need the resource, and breaks the “infinite interruptibility” rule.

Guarded call

This approach adds a lock, such as a mutex semaphore, to a resource. This means that a task accessing the resource must block if that resource is currently in use, and becomes unblocked when the use of that resource is removed. The Good: Approach solves mutual exclusion problem while permitting normal priority preemption for tasks that don’t need the resource. The Bad: Causes priority inversion because the higher priority task that wants to use the resource must block to allow the lower priority task to finish its use of the resource. The Worse: Naive use can lead to unbounded priority inversion. The Worst: Can lead to deadlock in some circumstances.

Queuing requests

This is an asynchronous approach – it serializes all requests for the resource into a queue and handles them when it gets around to it. The Good: Approach solves the mutual exclusion problem and permits tasks to execute in their normal priority scheme. The Bad: Delays access to the resource by queuing the requests, so may not result in a timely response to the request.

A bit of explanation of priority inversion is in order. If you look at the task diagram3 in Figure 9.3 you can see two «active» classes, PacingTheHeartTask and BuiltInTestTask. I’ve added a stereotype «ConcurrencyUnit» to add the concurrency metadata to those classes and used Rhapsody display options to show them on the diagram. The relevant metadata are

Figure 9.3. Priority inversion example 1.

Deadline – the time after the start of the task by which it must be complete or the system fails (in this example, the deadline is at the end of the period for all classes)

Execution time – the length of time the task requires focus to complete execution

Period – the length of time between task invocations

Priority – a scalar value used to select the ready task to execute (the lowest numeric value is the highest priority in this case)

In Figure 9.3, the highest priority task is that represented by PacingTheHeartTask. It runs every 50 ms and requires 10 ms to execute, 5 ms of which locks resource CurrentSensor. BuiltInTestTask is the lowest priority task; it runs every 1000 ms and requires 500 ms to execute, during which it locks the resource for 10 ms. Let’s now imagine that BuiltInTestTask is running and locks the resource CurrentSensor and right then PacingTheHeartTask becomes ready to run. What happens?

BuiltInTestTask stops running and the PacingTheHeartTask runs because it is of higher priority. However, when PacingTheHeartTask gets to the point where it needs to invoke services of the resource CurrentSensor, it must block (be put on the blocked task queue) and BuiltInTestTask must be allowed to run so that it can complete its work with the needed resource. Once BuiltInTestTask releases the CurrentSensor resource, PacingTheHeartTask is pulled off the blocked queue and can then lock the needed resource and complete its work. While the BuiltInTestTask is executing, even though PacingTheHeartTask is ready to run, the latter task is said to be blocked and the system is in a condition of priority inversion. Blocking occurs when a higher priority task wants to run but it cannot because a needed resource is unavailable.

Will the high priority task still meet its deadlines? The worst case is that PacingTheHeartTask becomes ready to run just as soon as BuiltInTestTask locks the resource. That means it will be blocked for the full 10 ms that BuiltInTestTask locks the resource. The total time for completion of PacingTheHeartTask is the execution time (10 ms) plus the blocking time (10 ms for a total of 20 ms), which is still less than the deadline (50 ms). So, yes, deadlines are met in this situation.
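The same check can be expressed in a few lines of C, using the numbers from the example above.

/* Sketch of the worst-case check worked through above: the high-priority
 * task meets its deadline if execution time plus maximum blocking time
 * fits within the deadline. */
#include <stdio.h>

int main(void)
{
    int execution_ms = 10;  /* PacingTheHeartTask execution time */
    int blocking_ms  = 10;  /* time BuiltInTestTask holds CurrentSensor */
    int deadline_ms  = 50;  /* deadline = period of PacingTheHeartTask */

    int response_ms = execution_ms + blocking_ms;   /* 20 ms */
    printf("worst case %d ms, deadline %d ms: %s\n",
           response_ms, deadline_ms,
           response_ms <= deadline_ms ? "met" : "missed");
    return 0;
}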

Let’s elaborate this system slightly. Figure 9.4 adds more tasks into the concurrency architecture. They have various periods and execution times but all are assigned priorities based on RMS. None of these tasks use the resource CurrentSensor. Can PacingTheHeartTask still meet its deadlines?

Figure 9.4. Priority inversion example 2.

Let’s consider the worst case: BuiltInTestTask locks the resource and then the next lowest priority task (CommTask) becomes ready to run. It doesn’t need the resource, so it isn’t blocked and it runs. Then the other tasks become ready to run in quick succession: QRSWaveAnalysisTask, HolterMonitoringTask, and finally DataSmoothingTask. BuiltInTestTask cannot run until all the higher-priority tasks have completed, which will take a combined time of 370 ms (80 + 100 + 100 + 90). At this point, PacingTheHeartTask, the highest-priority task in the system, runs. It preempts all the lower-priority tasks and runs until it reaches the point at which it needs the resource and blocks. But at this point, it will be 380 ms (370 + 10) before it can unblock. PacingTheHeartTask fails to meet its deadline, and the pacemaker patient now has a problem.

The fact that there are an unlimited number of intermediate-priority tasks that can prevent BuiltInTestTask from releasing its resource is known as unbounded priority inversion. This can be a subtle and elusive problem to detect via testing and can be a serious problem with naïve implementations of priority-based scheduling. There are a number of technical solutions (design patterns) that address this problem, including

Critical regions – don’t allow task switching during the time a resource is blocked.

Avoid the use of locking resources; for example, run the resource in a separate thread and use an asynchronous rendezvous to request services.

Use a priority inheritance-based solution.4

All these solutions have pros and cons, but they all (more or less) solve the problem of unbounded priority inversion. You should note, however, that if you use blocking semaphores on resources, you will always have priority inversion, but it is possible to limit it to a single level with diligent design. It is certainly easier when tasks are independent and don’t share resources, but that is fairly rare in real systems.

That having been said, in most systems, tasks do depend on the order of execution of other tasks but usually only in limited and well-defined ways. Consider Figure 9.5. In this figure we use an activity diagram to show the overall process for making coffee. Between the fork and the join, we show two task threads, which are independent. One of these is focused on boiling the water while the other is focused on preparing the coffee serving. However, both these tasks must be at the proper synchronization point (i.e., the water is hot and the coffee serving is prepared) before processing (i.e., mixing the coffee into the water) can proceed. Whether the water is heated before the coffee is removed from the freezer or not doesn’t matter – the order of execution of those actions is independent. However, there is an explicit synchronization point at which processing cannot proceed unless both tasks are at a certain point in their execution sequences.

Figure 9.5. Making coffee.

Let us now consider modeling the concurrency architecture of a simple example. Figure 9.6 is a simple control system. At the top left of the figure, the environmental data is acquired, filtered, and averaged from four different sensors. The data is acquired under the control of the SensorManager at a rate of 10 Hz (i.e., 100 ms rate) and this is used to update the AverageSensedData. The controller oversees the entire system and every 30 seconds it may choose to adjust the control points for the high-speed closed loop controller of the actuators based upon the averaged data. The high-speed closed loop control updates the output at 100 Hz (i.e., 10 ms rate) and does high-speed control to reduce the error between the monitored average values and the control set points. So how do we turn the analysis model in Figure 9.6 into a design model with concurrency?

Figure 9.6. Control system.

In this example, we’ll use the recurrence properties to determine the task set. Actions occur at three different periods: 10 updates/sec for data acquisition, 100 updates/sec for closed loop control, and one update every 30 seconds for the set points. Each of these processing sets will become an «active» object managing its own task. The AverageSensedData class is a resource shared among the task threads. For the resource sharing policy, we’ll use the Guarded Call pattern to enforce serialization of the access to the data.5

To make this into a task diagram, these classes must be made parts of the relevant «active» objects.6 Lastly, add ports to connect the objects across the thread boundaries.

A feature of UML is that you cannot connect ports of classes with associations; you can only connect ports on instances with links. Thus, in the resulting task diagram of Figure 9.7, we’ve instantiated the tasks and connected those instances with links to show the connections. We’ve also added the period and priority in constraints anchored to the appropriate «active» objects.7 To be complete, we also added the required interface (iAveragedData) to specify the ports.8

Figure 9.7. Concurrency design model.

Concurrency architecture can be highly complex, and the interested reader is referred elsewhere for more details on the ins and outs.9 For the purpose of this text, we will select the strategies for you to enter into your design. Nevertheless, it is useful to have a quick glossary of the terms important in the concurrency and resource model (Table 9.1).

Table 9.1. Some Concurrency Definitions.

Term – Definition
Arrival pattern – The recurrence property that specifies when events of a given type occur, e.g., periodic or aperiodic. Arrival patterns are detailed with quantitative information such as the period, jitter, minimum interarrival time, and/or average interarrival time.
Blocking time – The amount of time a high-priority task is prevented from executing because a lower-priority task owns a resource.
Criticality – The importance of the completion of an action.
Deadline – In hard real-time systems, the time after an initiating event by which the action must be completed to ensure correct system behavior.
Deadlock – A condition in which a system is waiting for a condition that can never, in principle, occur. It can only occur when four conditions are met: (1) tasks can be preempted, (2) resources can be locked while tasks are waiting for other resources, (3) tasks can suspend while owning resources, and (4) a circular waiting condition exists.
Execution time – The amount of time required to execute the response to an initiating event.
Hard real-time – A real-time system characterized by the absolute need to adhere to a set of deadlines.
Interarrival time – The time between the arrival of events of the same type; for periodic tasks this is a constant, but for aperiodic events it can vary widely.
Jitter – The variation in the actual arrival time of periodic events.
Period – The length of time between the arrivals of periodic events.
Priority – In a multitasking preemptive system, a numeric value used to select which task, from the set of tasks currently ready to run, will run preferentially.
Race condition – A condition in which the computational result depends on a sequence of actions whose order is inherently unknown or unknowable.
Real-time – A system in which the specification of correctness includes a timeliness measure.
Resource – An element that provides information or services of a quantifiably finite nature, e.g., a set of services that must be invoked atomically or a set of objects provided from a finite pool.
Schedulability – The mathematically demonstrated ability of a set of tasks in a hard real-time system to always meet their deadlines.
Synchronization pattern – The means by which tasks synchronize their execution at specific, defined points.
Timeliness – The ability of a task to always meet its deadlines.
Urgency – The immediacy of the need to handle an event or to complete an action.

URL: https://www.sciencedirect.com/science/article/pii/B978012407781200009X

What term is used for the amount of time an activity can be delayed from its early start?

Float, sometimes called slack, is the amount of time an activity, network path, or project can be delayed from the early start without changing the completion date of the project.

What is the greatest resource demand rule?

This decision rule starts by determining which projects in the company's portfolio pose the greatest demand on available resources. The projects that require the most resources are identified first, and their resources are set aside.

What information is usually provided on a resource usage calendar?

A resource calendar should include all team members, their availability, their leaves of absence, and the amount of time they can dedicate to the specific project. You can use spreadsheets, a resource management software system, or Google Calendar to create an effective resource calendar for your team.

What is defined as placing resources on a detailed schedule of tasks?

Resource allocation, also known as resource scheduling, involves identifying and assigning resources to various activities for a specific period. It also involves monitoring resources' workloads throughout the project life cycle and reassigning them if necessary.