Cyber Physical Computing
Department of Computer Science
Sponsor: NSF
The theoretical foundations of this work were funded by two NSF grants. EHS 0208769 (A Paradigm for Scalable Open Real-Time Computing under Uncertainty) funded early efforts in 2002-2006 that resulted in Feasible Region Calculus. CSR 0720513 (An Extended Theory for Temporal Composition of Distributed Real-Time Computing Systems) funded follow-up work in 2007-2012 that resulted in Delay Composition Algebra. Applications of this theory were funded by MURI, ARL, ONR, NSF, and DTRA grants.
Timing properties of software (relating
to its ability to meet timing constraints) have traditionally been analyzed
using real-time scheduling theory. This theory is perhaps one of the most
studied and well-understood aspects of real-time computing. However, the
current state of the art is far from complete.
Most prior research addressed small
systems such as single processors or multiprocessor machines. In contrast,
future computing systems will be marked by massive distribution. They will see
increased convergence of computation, communication, and sensing, and will
augment a widely-distributed physical environment, interacting with it under
constraints of real time and space. Despite the very large volume of real-time
computing literature that presents different analysis techniques for computing
the ability of systems to meet time constraints (called schedulability analysis in real-time computing terminology), there
is a lack of a fundamental theory that allows systematic analysis of distributed real-time systems by
appropriately composing properties of their components. Consider, by contrast,
the fundamental laws of circuit theory used to analyze linear electric
circuits. In that theory, a small number of fundamental rules (e.g., Kirchhoff
laws) allow a designer to analyze complex circuits of arbitrary interconnection
topology, reducing them to their effective transfer functions and deducing
their end-to-end characteristics, such as total impedance, current draw, and
voltage drop. Feasible region calculus aims to develop a similar theory for distributed real-time computing, based on results that suggest rules for composing the temporal behavior of real-time system components.
Observe that the capacity of a
distributed system is a complicated concept because the rate at which a distributed
system can execute an application is a function of not only the speed and
topology of the system but also the application flow graph. For example, an
application with more internal parallelism will run faster on a given system
than one that is primarily serialized. Furthermore, in a time-constrained
system, tasks that finish late do not contribute useful work. Hence, it is
important to quantify only the fraction of work that can be done on time. We
call it the real-time capacity of the
system. A mathematical foundation is developed for composing real-time capacity
expressions given application flow graphs, time constraints and resource
topology. This foundation utilizes a small set of composition rules to arrive
at the corresponding application-aware capacity expressions. We also consider limits
when system components become infinitesimal and the number of components needed
for an application grows to infinity. In such cases, it will be more
appropriate to consider resource density and liquid task flows rather than
individual instances of resources and tasks. Capacity expressions are rewritten in terms of density and flow distributions. Such models will be
suitable for future computing artifacts such as smart paint and amorphous
computing systems where individual computing components become insignificant
and computations may involve very large numbers of components.
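To make the definition concrete: real-time capacity counts only the work that completes by its deadline. The sketch below illustrates the definition on a hypothetical completed schedule (the job set is invented for illustration and is not from the project):

```python
def realtime_capacity(jobs):
    """Fraction of total work that completed by its deadline.

    jobs: list of (computation_time, finish_time, deadline) tuples
    describing one completed schedule. Tasks that finish late do not
    contribute useful work, so only on-time computation is counted.
    """
    total = sum(c for c, _, _ in jobs)
    on_time = sum(c for c, f, d in jobs if f <= d)
    return on_time / total if total else 0.0

# Hypothetical schedule: the 3-unit job misses its deadline (9 > 8),
# so only 3 of the 6 units of work count as useful.
jobs = [(2, 5, 6), (3, 9, 8), (1, 3, 4)]
print(realtime_capacity(jobs))  # -> 0.5
```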
2. Utilization-Based Feasible Regions
One of the big gaps in the theoretical underpinnings of real-time computing has traditionally lain in the inability to relate aggregate metrics (such as utilization) to per-task temporal performance metrics (such as the response times of individual tasks), except in some restricted special cases (such as that of scheduling periodic tasks). The first
major result of feasible region calculus has been the derivation of regions,
defined in the state space of resource utilization, that imply satisfaction of
all individual temporal performance constraints. Utilization bounds for
schedulability existed previously only for periodic tasks. However, the
periodicity requirement was a limit on applicability. Moreover, no framework
existed for composing the bounds of individual components to compute those of
aggregate systems. Composition is complicated by the fact that in a multistage
system a task can still meet its end-to-end deadline despite overload on some
resources as long as others are underloaded. Hence, looking at resources in
isolation may be misleading. Instead, a surface exists in a space whose
dimensions are the utilizations of individual resources such that all
utilization vectors below that surface result in schedulable systems. The shape
of that surface is a key research problem addressed by feasible region
calculus.
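For a single resource, the classical Liu and Layland bound for periodic rate-monotonic tasks is the simplest instance of such a region: an interval on one utilization axis. A minimal check of that standard textbook result (not the multi-resource surfaces developed in this project):

```python
def rm_utilization_bound(n):
    """Liu & Layland (1973): n periodic tasks under rate-monotonic
    scheduling are schedulable if total utilization <= n*(2^(1/n)-1)."""
    return n * (2.0 ** (1.0 / n) - 1.0)

def inside_feasible_region(utilizations):
    """Sufficient (not necessary) schedulability test: is the
    utilization vector inside the classical one-dimensional region?"""
    n = len(utilizations)
    return sum(utilizations) <= rm_utilization_bound(n)

print(round(rm_utilization_bound(2), 3))     # -> 0.828
print(inside_feasible_region([0.3, 0.4]))    # -> True
print(inside_feasible_region([0.55, 0.55]))  # -> False
```

Feasible region calculus generalizes this picture from a single interval to a surface over the utilizations of many interconnected resources.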
Observe that most scheduling problems are NP-complete, which typically calls for heuristic solutions. We demonstrate that computationally tractable sufficient solutions to the general problem of temporal analysis of distributed, resource-constrained, communicating tasks are also possible if the feasible region is appropriately defined. The following references constitute a brief tutorial on feasible region calculus.
· The first basic result: a utilization bound for aperiodic tasks under fixed-priority scheduling. This result drops the periodicity requirement that prevented prior work from being applicable to the large category of systems with no regularity in task arrival patterns.
· Extensions of the basic results to multiprocessors (derived mainly for the sake of completeness and intellectual curiosity).
· Extensions that account for mutual exclusion (blocking): these allow modeling systems of non-independent tasks, in which semaphores or other synchronization constraints are used to regulate access to passive resources.
· Extensions of feasible regions to resource pipelines: these basic results allow composing feasible regions of individual components to compute those of distributed systems.
· An application of feasible region calculus to compute the real-time capacity of a sensor network.
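The aperiodic result above replaces periodic utilization with a "synthetic utilization" (the sum of computation time over relative deadline across the tasks currently in the system). A hedged sketch of such a check, using the 1/(1 + sqrt(1/2)) ≈ 0.586 bound reported for fixed-priority (deadline-monotonic) scheduling of aperiodic tasks; the task set is hypothetical:

```python
import math

# Reported aperiodic bound: 1/(1 + sqrt(1/2)), about 0.586.
APERIODIC_BOUND = 1.0 / (1.0 + math.sqrt(0.5))

def synthetic_utilization(tasks):
    """tasks: list of (computation_time, relative_deadline) pairs for
    the aperiodic tasks currently present in the system."""
    return sum(c / d for c, d in tasks)

def meets_deadlines(tasks):
    """Sufficient condition: deadlines are met whenever synthetic
    utilization stays at or below the aperiodic bound."""
    return synthetic_utilization(tasks) <= APERIODIC_BOUND

print(round(APERIODIC_BOUND, 3))          # -> 0.586
print(meets_deadlines([(1, 4), (2, 8)]))  # -> True
```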
3. Non-Utilization-Based Feasible Regions
The scheduling policy has a dramatic
effect on the ability of a system to meet deadlines. Hence, feasible region
calculus must allow composition under different scheduling policies. An
interesting and important consideration is that often the scheduling policy is
imposed on the application developer by legacy factors such as operating system
support. Hence, the scheduling policy is often a constraint and not a design
parameter. This is very different from the typical direction in real-time
computing that focuses on analyzing a handful of ``good'' policies with no
general framework for incorporating arbitrary (often suboptimal) scheduling
policies into the analysis as design constraints. A significant contribution of
our work therefore lies in its ability to analyze arbitrary scheduling policies
within the same general framework for quantifying real-time capacity. To the
best of our knowledge, ours is the first such general framework in real-time
computing literature.
The incorporation of arbitrary
scheduling policies into the capacity analysis has very important ramifications
on the analytic engine used. Previous efforts have attempted to relate latency
(or ability to meet deadlines) to system state (typically, resource
utilization) in distributed systems. The implicit assumption has been that
utilization is a good predictor of latency or the ability of a system to meet
deadlines. Unfortunately, we have shown that, depending on the resource
scheduling policy, utilization may or may not be a good predictor of ability to
meet latency constraints. Instead, we developed an automated method for
choosing, for any scheduling policy, a family of load metrics (not necessarily
utilization) to represent system state such that it becomes possible to predict
(based on any metric in that family) whether or not deadlines are met.
Specifically, if the load, as measured by the new metrics, is within a certain
range, it is assured that all distributed tasks finish on time. As before, composition rules are defined to determine feasible regions over non-utilization-based load metrics.
·
Xue Liu and Tarek Abdelzaher, "Nonutilization bounds and feasible regions for arbitrary fixed-priority policies," ACM Transactions on Embedded Computing Systems, Vol. 10, Issue 3, April 2011. (An earlier version appeared in the Proceedings of the Twelfth IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2006).)
The
above line of research focused on extensions to the derivation of utilization
bounds or utilization-like bounds for schedulability. These extensions
addressed aperiodic tasks, new load metrics, and pipelines. The idea was to
quantify feasible regions defined in the space of resource utilization such
that end-to-end deadlines are met as long as the system operates within the
feasible regions derived. Feasible regions, like utilization bounds, however,
are high-level derived constructs. A more basic and fundamental question is to
ask: how does delay compose in distributed systems? Answering this question led
to the development of a new approach for schedulability analysis of distributed
systems, based on reducing them to equivalent centralized uniprocessors. This
approach, called delay composition algebra, is described below.
4. Sub-additive Delay Composition
Uniprocessor
schedulability results are “easy” because delays on a single
processor compose by addition. If task A is preempted by task B and task C,
then it is preempted by the sum of B’s and C’s (worst case)
computation times. This additive property is the foundation for schedulability
analysis expressions for uniprocessors. The problem with distributed systems is
that delays do not necessarily compose additively. For example, if task A is
preempted by B and C on processors X and Y, where X and Y form a pipeline, then
task A is not necessarily preempted by the sum of B’s and C’s
(worst case) computation times on X and Y. This is due to the inherent
concurrency in distributed systems. If A waits for B and C on processor X, then
while A finally executes on X, B and C might execute on Y. Hence, A does not
wait for all of B and C on Y. The delay composition is therefore sub-additive.
The interesting question is, can we describe how delays compose in distributed
systems? The following papers derived the basic delay composition theorems for
resource pipelines and directed acyclic graphs:
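The worked example above (task A preempted by B and C on a two-stage pipeline) can be checked with a small time-stepped simulation. This is an illustrative sketch under simple assumptions (unit-time quanta, all tasks arriving at time 0), not an analysis tool:

```python
def simulate_pipeline(tasks, n_stages=2):
    """Preemptive fixed-priority scheduling on each pipeline stage.

    tasks: list of (name, priority, [exec_time_per_stage]); a lower
    priority number means higher priority. All tasks arrive at time 0
    and must finish a stage before entering the next. Returns a dict
    mapping task name to finish time.
    """
    remaining = {name: list(execs) for name, _, execs in tasks}
    stage_of = {name: 0 for name, _, _ in tasks}  # current stage index
    prio = {name: p for name, p, _ in tasks}
    finish = {}
    t = 0
    while len(finish) < len(tasks):
        ran = set()  # a task runs on at most one stage per time unit
        for s in range(n_stages):
            ready = [n for n in remaining if n not in finish
                     and stage_of[n] == s and n not in ran]
            if not ready:
                continue
            n = min(ready, key=lambda x: prio[x])  # highest priority
            remaining[n][s] -= 1
            ran.add(n)
            if remaining[n][s] == 0:
                stage_of[n] += 1
                if stage_of[n] == n_stages:
                    finish[n] = t + 1
        t += 1
    return finish

# B and C (higher priority) each need 1 unit per stage, as does A.
finish = simulate_pipeline([("B", 0, [1, 1]),
                            ("C", 1, [1, 1]),
                            ("A", 2, [1, 1])])
# Additive reasoning predicts A needs (1+1+1) + (1+1+1) = 6 units,
# but pipelining overlaps B's and C's stage-Y work with stage X:
print(finish["A"])  # -> 4
```

A finishes at time 4, not 6, because while A finally runs on stage X, B and C are already executing on stage Y: delay composition is sub-additive.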
The
above papers led to the development of a delay composition algebra. It enables
a new approach to schedulability analysis in distributed systems: namely, a
reduction-based approach. The
algebra reduces distributed system workload to an equivalent uniprocessor
workload that can be analyzed using uniprocessor schedulability analysis
techniques to infer end-to-end delay and schedulability properties of each of
the original distributed jobs.
5. The Basic Algebra
The
algebraic operators in the delay composition algebra perform reductions on a
resource graph (representing a distributed system) into a smaller graph by
merging nodes. Each node represents a resource in the system and is
characterized by a load matrix that summarizes its workload. When nodes are merged
in the original system, their workloads are merged as well into an equivalent
workload for the resulting node. The main function of the algebraic operators
of delay composition algebra is to perform workload reductions as nodes merge.
The reductions stop when only one node remains, meaning that the distributed
system workload is fully reduced to that of an equivalent uniprocessor.
Analyzing the workload on the resulting uniprocessor using any of the standard
uniprocessor schedulability analysis techniques determines the schedulability
of the original distributed task set.
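Structurally, the reduction is a fold over the resource graph. The sketch below shows the shape of such a reduction for a pipeline; the merge rule is a deliberately simplified placeholder (elementwise maximum of per-task load entries), not the published operator definitions, which combine maximization and addition over load-matrix entries:

```python
def merge_nodes(load_a, load_b):
    """Placeholder merge of two nodes' per-task load summaries.
    Closure property: operands and result share the same structure,
    so the result can be merged again. (Stand-in rule: elementwise
    max; the real algebra's operators also involve addition.)"""
    return {task: max(load_a.get(task, 0.0), load_b.get(task, 0.0))
            for task in set(load_a) | set(load_b)}

def reduce_pipeline(stage_loads):
    """Fold per-stage load summaries into one equivalent node, which
    a standard uniprocessor schedulability test could then analyze."""
    merged = stage_loads[0]
    for load in stage_loads[1:]:
        merged = merge_nodes(merged, load)
    return merged

# Two pipeline stages with (hypothetical) per-task loads:
stages = [{"T1": 2.0, "T2": 1.0}, {"T1": 1.0, "T3": 4.0}]
print(sorted(reduce_pipeline(stages).items()))
# -> [('T1', 2.0), ('T2', 1.0), ('T3', 4.0)]
```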
A key
consideration in developing operators of the new algebra is that the operand
set must be closed under composition. In other words, both the operands and the
result of each operator must have the same structure. As mentioned above, the
operands refer to the load matrices of the nodes being merged. The result is
the load matrix of the resulting merged node. Since merged nodes that result
from composition represent more than one physical node, the workload
representation for a node should be the same whether this node represents a
single resource or the result of merging an entire subsystem. A key
contribution of this project has been the attainment of such a general
representation (the load matrix of a node) and the derivation of simple
algebraic operators that manipulate this representation to reduce distributed
systems workload into a single-node workload that is equivalent from a
schedulability perspective.
Observe
that if the operand matrices include variables (e.g., task computation times
that are yet unspecified), applying our algebraic operators to reduce the
distributed system to a single node yields equivalent uniprocessor load
expressions that are a function of those variables. Furthermore, applying
schedulability tests (e.g., uniprocessor utilization bounds) to the equivalent
load yields schedulability expressions that are a direct function of the
original load variables of the distributed system. This makes it possible to quantify the
distributed system parameter space for which schedulability is attained.
In
summary, existing techniques for analyzing delay and schedulability of jobs in
distributed systems can be broadly classified into two categories: (i)
decomposition-based, and (ii) extension-based. The decomposition-based
techniques break the system into multiple subsystems, analyze each subsystem
independently using current uniprocessor analysis techniques, then combine the
results. The extension-based techniques explore ways to extend current
uniprocessor analyses to accommodate distributed tasks and resources. In
contrast, we developed a third category of techniques for analyzing distributed
systems based on reduction (as opposed to decomposition or extension). Rather
than breaking up the problem into subproblems, or extending uniprocessor
analyses to more complex systems, we systematically reduce the distributed
system schedulability problem to a single simple problem on a uniprocessor. The
uniprocessor problem is then solved thereby determining the schedulability of
the original distributed system. The following paper describes the basic delay
composition algebra:
6. Optimizations and Extensions
The original algebraic framework,
published in 2008, was extended and various optimizations were developed. The
first extension was to the case where the task flows in the system can form
dependency cycles:
Reduction of distributed systems to equivalent uniprocessors led to a new 'uniprocessor' task model that does not naturally arise in uniprocessor systems and hence was not previously explored.
New schedulability results had to be developed for this task model to produce a
better schedulability test. Namely, in distributed systems of periodic tasks, a
task that traverses a given set of processing stages may end up co-located with
a different subset of other tasks on each stage of its distributed execution.
When the entire task system is reduced to a single equivalent uniprocessor, an
effect similar to mode changes is observed: the task set that competes with a
given task, say task T, changes over time as task T moves from one stage to
another in the original distributed system. The exact times at which the (mode)
changes occur are not known in advance, since they depend on task latency on
each stage, which is not yet computed. What is known, however, is the set of competing
tasks at each stage. Hence, a new problem arises on the reduced equivalent
uniprocessor: Namely, given task T and the competing task sets in each mode, it
is desired to find the worst-case possible time instants for mode changes,
taking into account that such changes can occur only when task T moves from one
stage to the next in the original distributed system. A polynomial-time
algorithm was developed to solve the above problem. Note that this algorithm
is orthogonal to the reduction developed in previous years. The reduction
merely defines the parameters of the equivalent task set in each mode on the
equivalent uniprocessor (given the original execution parameters of tasks in
the distributed system). Those parameters are computed via a set of maximization
and addition operators. Schedulability analysis of the uniprocessor with mode
changes was published in RTSS 2009:
A comprehensive summary of the algebra was then invited for publication in the Real-Time Systems journal:
In an extension of the main algebra
line of research, the PIs used the developed delay composition algebra to understand
robustness of end-to-end timing behavior of distributed systems to changes in
task execution parameters at individual stages. The algebra offers a simple,
compact way to relate parameters of individual stages to properties of
end-to-end delay. This, in turn, led to the idea of optimizing robustness. In
other words, a research question posed was how to allocate finite resources in
a distributed system in such a way that robustness is maximized while ensuring
that the system remains schedulable. A metric to measure robustness was
developed, and an algorithm to maximize robustness (subject to schedulability
constraints) was derived and published in RTSS 2010:
A second advance was to develop a
closed form real-time capacity result for (a class of) distributed real-time
systems. Specifically, we developed a closed form real-time capacity expression
for data fusion systems characterized by tree topologies where the workflows
originate at the leaves and propagate towards the root. The closed form
expression describes the conditions under which end-to-end deadlines of all
workflows are met. The result was derived by reducing such a system generically
to an equivalent uniprocessor using the reduction techniques developed earlier
in the project, then using uniprocessor utilization bounds to derive an
expression of the schedulability boundary. A paper on the topic appeared in
Fusion 2011:
Finally, the authors extended the above
paper to consider the important case where synchronization primitives are used
such that a module must wait until all of its inputs arrive before executing.
This case was new in that all previous work on the algebra under this project
assumed that any input arrival always triggered some work immediately. A good
example of systems where such synchronization is used is data fusion systems
where different data pipelines merge and fusion results can be computed only
when all data inputs have been received. The previously derived delay
composition theorem was augmented to account for additional latency imposed due
to synchronization. A reduction was obtained from (a category of) systems that
feature such synchronization to an equivalent uniprocessor. These results were
published in Fusion 2012.
7. Example Applications Related to this Project
The framework found application in many domains (addressed, in part, under separate funding). The PI and his
students demonstrated that their work on schedulability in pipelines can be
integrated with recent network optimization techniques to produce wireless
networks that dynamically maximize the total bandwidth of schedulable flows:
The algebra developed in this project
further allowed latency analysis of new classes of networks, namely
disruption-tolerant networks, a result that was published in RTSS 2010: