Title: Testing Timing
1Testing Timing
- W. T. Tsai
- Department of Computer Science and Engineering
- Arizona State University
- Tempe, AZ 85281
2Issues in Testing Time-Critical Systems
- Testing is intrusive
- Operating conditions impact timing behavior
- Characteristics of Timing Constraints require
different strategies
- Non-availability of sufficient hardware resources
poses problems
- Timing constraints impact other testing
- Short Time Intervals pose problems
- Long Time Intervals pose problems
- Requirements refinement is done as part of
testing timing constraints
- Performance Optimization can be done while
testing timing constraints
3Timing Bug Classification
4Timing Bug Classification
- Rate bugs
- Normal period
- Period too short
- Period too long
- Irregular period
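The rate bug categories above can be told apart mechanically from a list of event timestamps. The following is only an illustrative sketch: the function name `classify_period`, the 10% tolerance, and the rule that mixed deviations count as irregular are assumptions, not something the slides define.

```python
def classify_period(timestamps, expected_period, tolerance=0.1):
    """Classify the observed period of a series of event timestamps.

    tolerance is the allowed relative deviation from expected_period
    (an assumed default of 10%).
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    low = expected_period * (1 - tolerance)
    high = expected_period * (1 + tolerance)
    too_short = [g for g in gaps if g < low]
    too_long = [g for g in gaps if g > high]
    if not too_short and not too_long:
        return "normal period"
    if len(too_short) == len(gaps):
        return "period too short"
    if len(too_long) == len(gaps):
        return "period too long"
    return "irregular period"


print(classify_period([0, 10, 20, 30], expected_period=10))
print(classify_period([0, 5, 10, 15], expected_period=10))
print(classify_period([0, 10, 35, 40], expected_period=10))
```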
5Timing Bug Classification
- Premature events (Early events)
- Late events (Missed deadlines)
6Strategies for Testing Timing Constraints
- Time Period Contraction
- Time Period Expansion
- Load/Stress Testing
- Configuration Testing
- Phase Testing
- Interval Synchronization Testing
7Time Period Contraction
- Use a shorter time period than that specified in
a requirement (in other words, use time period
contraction)
- Useful for testing requirements with long time
periods
- Based on the assumption that the system would
satisfy the requirement when the longer time
period is used in the field, given that the
system satisfies the requirement when the shorter
time period is used
- Example requirement: If the GPS oscillator is
flywheeling for more than 24 hours, then a GPS
oscillator critical alarm must be raised
8Time Period Contraction Uses
- Helps in limiting the amount of time and
resources required for testing a requirement that
involves long time durations
- Force a timeout to test certain requirements or
force certain actions while testing. Example: If
no activity is detected for more than 15 minutes
in a call, the dormancy event must be triggered.
Use Time Period Contraction to trigger the dormancy
event quickly.
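One way to make contraction possible is to keep the timeout configurable rather than hard-coded. The sketch below assumes a hypothetical `DormancyMonitor` class with a `time_scale` parameter; neither name comes from the slides, and the timestamps are passed in explicitly so the example does not depend on a real clock.

```python
class DormancyMonitor:
    """Reports dormancy after a period with no call activity."""

    def __init__(self, timeout_s=15 * 60, time_scale=1.0):
        # time_scale < 1 contracts the period; 1.0 is the field value.
        self.timeout_s = timeout_s * time_scale
        self.last_activity_s = 0.0

    def activity(self, now_s):
        self.last_activity_s = now_s

    def is_dormant(self, now_s):
        return now_s - self.last_activity_s > self.timeout_s


# Field setting: 15 minutes. Test setting: contracted to 9 seconds.
monitor = DormancyMonitor(time_scale=0.01)
monitor.activity(now_s=0.0)
print(monitor.is_dormant(now_s=5.0))    # still inside the contracted period
print(monitor.is_dormant(now_s=10.0))   # past the contracted period
```

The contracted test is only meaningful under the assumption stated on the previous slide: behavior at the shorter period is taken as evidence for behavior at the longer one.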
9Time Period Expansion
- Use a longer time period than that specified in a
requirement (in other words, use time period
expansion)
- Useful for testing requirements with very short
time periods
- Example requirement: If a ConnectAck event is
received within 50 milliseconds after a Connect
event, then the signaling channel must be set up
- Use when it is difficult to generate event(s)
within a very short time period in lab
environments.
- Use to eliminate or prevent timeouts in some
parts of the system while other parts of the
system are being tested
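As a rough illustration of expansion, the 50 ms window from the ConnectAck example can be multiplied by a test-only factor so that a slowly generated acknowledgement still falls inside it. The function name `ack_in_window` and the `expansion` parameter are assumptions made for this sketch.

```python
def ack_in_window(connect_ms, ack_ms, window_ms=50, expansion=1):
    """True if the ConnectAck arrived within the (possibly expanded) window."""
    return 0 <= ack_ms - connect_ms <= window_ms * expansion


# With the specified 50 ms window, an acknowledgement generated in the lab
# 400 ms after the Connect event arrives too late.
print(ack_in_window(connect_ms=0, ack_ms=400))
# With the window expanded 20 times (to 1000 ms) the same event is accepted.
print(ack_in_window(connect_ms=0, ack_ms=400, expansion=20))
```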
10Load/Stress Testing
- As the load on a system increases, the
performance of the system degrades, especially with
regard to timing constraints
- Useful in determining if the system will satisfy
its timing constraints under the various stress
conditions that can be applied to the system in
the field
- Identification of Bottlenecks in the system
- Helpful in performance optimization and
requirements refinement
11Bottleneck types
12Load/Stress Testing Steps
- Identify the factors that impact the stress
conditions on the system. Analysis of the domain
as well as usage patterns of the system can help
in identifying such factors.
- After such factors are determined, identify means
to place the system under those stress
conditions.
- Develop tests for the different stress
conditions. Well-known test case generation
techniques such as equivalence testing and
boundary value testing can be used to develop
test cases exercising varying amounts of stress
on the system.
- Analyze the data collected to identify
bottlenecks and the conditions under which they
occur
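The last two steps can be sketched as follows. This is a minimal illustration, assuming load is expressed as a single number (for example calls per second) and that one response time per load level has been measured; the helper names and all measurement values are made up.

```python
def boundary_loads(rated_load):
    """Boundary-value load levels around the rated (maximum specified) load."""
    return [rated_load - 1, rated_load, rated_load + 1]


def first_violation(response_ms_by_load, deadline_ms):
    """Lowest load level whose measured response time misses the deadline."""
    for load in sorted(response_ms_by_load):
        if response_ms_by_load[load] > deadline_ms:
            return load
    return None


# Made-up measurements: load in calls per second -> response time in ms.
measured = {50: 120, 99: 310, 100: 480, 101: 900}
print(boundary_loads(100))
print(first_violation(measured, deadline_ms=500))
```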
13Configuration Testing
- Testing a system by placing it in its different
possible configurations
- Critical for testing timing constraints since
timing behavior can vary across configurations.
- Both hardware and software configurations
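When the configuration options are independent, the set of configurations to test can be enumerated mechanically. The sketch below uses `itertools.product`; the option names and values are hypothetical, and in practice the full cross product is often pruned to the combinations that can occur in the field.

```python
from itertools import product


def enumerate_configurations(options):
    """Return every combination of the given configuration options."""
    names = sorted(options)
    return [dict(zip(names, values))
            for values in product(*(options[name] for name in names))]


# Hypothetical hardware and software options.
options = {
    "processor_cards": [1, 2],
    "link": ["T1", "Ethernet"],
    "software_load": ["release", "debug"],
}
for config in enumerate_configurations(options):
    print(config)
```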
14Different Possible Configurations
15Phase Testing
- An operation in a system can be composed of
several sub-steps or phases
- Requirements can specify timing constraints on an
entire operation as well as its individual phases
- Phase testing enables testers to analyze where
most time is being spent for an operation
- Helpful for performance optimization as well as
requirements refinement
16Phase Partitioning
- Partition an operation into several phases based
on different criteria
- Request-Response pairs
- Feature Based
- Component Based
- Process/Server Based
17Feature Based Partitioning
18Component Based Partitioning
19Server/Process Based Partitioning
20Phase Testing Steps
- Identify phases in the system based on different
partitioning criteria
- For each operation that needs to be tested,
define entry and exit point(s) for each phase
within it as well as for the entire operation
- For each entry and its corresponding exit point,
collect timing data.
- Analyze the data collected for each phase as well
as the entire operation to determine if any
constraints have been violated
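The steps above can be sketched with a small recorder that stores one entry and one exit timestamp per phase. The class `PhaseRecorder`, the phase names, the budgets, and the timestamps are all assumptions chosen for illustration.

```python
class PhaseRecorder:
    """Collects entry/exit timestamps (in ms) for the phases of an operation."""

    def __init__(self):
        self._entered = {}
        self.durations = {}

    def enter(self, phase, now_ms):
        self._entered[phase] = now_ms

    def exit(self, phase, now_ms):
        self.durations[phase] = now_ms - self._entered.pop(phase)


def violated(durations, budgets_ms):
    """Phases (or the whole operation) that took longer than their budget."""
    return sorted(p for p, d in durations.items()
                  if p in budgets_ms and d > budgets_ms[p])


# Hypothetical operation with two phases and made-up timestamps.
rec = PhaseRecorder()
rec.enter("operation", 0)
rec.enter("request", 0)
rec.exit("request", 40)
rec.enter("response", 40)
rec.exit("response", 130)
rec.exit("operation", 130)
print(rec.durations)
print(violated(rec.durations, {"request": 50, "response": 60, "operation": 150}))
```

In this made-up run the whole operation meets its budget while one phase does not, which is the kind of finding phase testing is meant to expose.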
21Phase Testing Uses
- To isolate where (in which phase) most time is
being spent for an operation
- To uncover some hidden bugs in the system that
impact timing performance
- To uncover certain types of requirement bugs
- Used to revise the timing constraints
22Interval Synchronization Testing
- To verify that both sides of a data exchange
(senders and receivers) satisfy the interval
constraints set up for that exchange, i.e., they
start at their appropriate start times and stop
at the right times
23Problems with Interval Synchronization Testing
- No common clock between senders and receivers
- Steady stream of data needed
- Transport link or Intermediate Node problems
- Identification Problem
- Multiple Bug Cases
24Interval Synchronization Testing Steps
- Set up a data application that generates a steady
stream of data at the server side. Start the
application before the presumed start time and
let it run beyond the presumed stop time on the
sender side.
- If possible, set up error-free transport between
the sender and the receiver and also ensure that
none of the intermediate nodes will have any
bottlenecks or overloading conditions during the
test.
- Set up sender and receiver with computed start and
stop times as per the test case being executed.
25Interval Synchronization Testing Steps (continued)
- On the sender side, check if the computed start
time matches the time when the actual data
starts being transmitted. Also check if the
computed stop time matches the time when the
actual data transmission is stopped. If either
of these does not match, a bug can be reported.
- On the receiver side, record the computed start
and stop times as well as when data actually
starts and stops being received from the sender.
- Analyze the data recorded to figure out if there
is any interval synchronization bug.
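The comparison of computed and actual times on either side can be sketched as below. The function name `interval_bugs` and the `tolerance_s` parameter are assumptions; since sender and receiver share no common clock, each side is compared only against times measured on its own clock.

```python
def interval_bugs(side, computed, actual, tolerance_s=0.0):
    """Compare computed (start, stop) times with the actual ones for one side."""
    bugs = []
    for label, want, got in zip(("start", "stop"), computed, actual):
        if abs(got - want) > tolerance_s:
            bugs.append(f"{side}: {label} expected {want}, observed {got}")
    return bugs


# Made-up times in seconds.
print(interval_bugs("sender", computed=(100, 160), actual=(100, 160)))
print(interval_bugs("receiver", computed=(100, 160), actual=(100, 172),
                    tolerance_s=1))
```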
26Patterns for Testing Timing Constraints
- Data Overhead Reduction Patterns: Patterns that
describe techniques for reducing the
overhead involved in collecting data necessary
for testing.
- Event Generation Patterns: Patterns that describe
techniques that generate events with certain
characteristics useful for testing.
- Event Monitoring and Analysis Patterns: Patterns
that cover techniques useful in collecting events
necessary for analyzing test results.
27Data Overhead Reduction Patterns
- One of the main problems with testing timing
constraints is the impact of collecting data.
- In other words, the intrusive nature of the data
collection part of the testing can adversely
impact the test results.
28Data Overhead Reduction Patterns
- Peek and Poke Pattern
- Data Buffering Pattern
- Multiple Levels of Tracing
- Partial Buffer Recording Pattern
- Event Throttle Pattern
29Peek and Poke Pattern
- The ability to view (peek) or modify (poke)
certain registers within the system is important.
- By storing timing-related data in registers, the
data about various operations can be collected
and stored with minimal intrusiveness.
- Peek and poke pattern refers to modifying certain
areas of the memory or registers and reading them
subsequently to gather data.
- Unfortunately, this feature may be available only
on certain platforms.
30Data Buffering Pattern
- Writing the information to I/O or storage devices
is costly and can cause excessive delays in the
system.
- Instead of sending data whenever it is ready,
buffer it and send it at certain pre-defined
instants or when sufficient data has been collected.
- However, in embedded devices the limited
availability of memory means that designers must
choose the data to be buffered after careful
analysis because not all data can be buffered.
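A minimal sketch of the buffering idea follows, assuming a hypothetical `BufferedRecorder` class whose `write_batch` callback stands in for the costly I/O operation. The capacity would be chosen according to the memory available on the target.

```python
class BufferedRecorder:
    """Holds timing records in memory and writes them out in batches."""

    def __init__(self, write_batch, capacity):
        self.write_batch = write_batch   # the costly I/O operation
        self.capacity = capacity         # limited by available memory
        self._buffer = []

    def record(self, item):
        self._buffer.append(item)
        if len(self._buffer) >= self.capacity:
            self.flush()

    def flush(self):
        """Write everything buffered so far; call at pre-defined instants."""
        if self._buffer:
            self.write_batch(list(self._buffer))
            self._buffer.clear()


batches = []
recorder = BufferedRecorder(batches.append, capacity=2)
for sample in range(5):
    recorder.record(sample)
recorder.flush()
print(batches)
```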
31Multiple Levels of Tracing
- When a system executes, much information may be
available and can be collected: state changes,
alarms, database updates, incoming events and
outgoing events.
- However, while testing a particular requirement,
only a few of these may be needed to verify if
the requirement is satisfied.
- To make the testing minimally intrusive, it is
necessary to collect only the relevant
information.
32Multiple Levels of Tracing (Contd)
- Multiple levels of tracing can be used to record
different kinds of data with different tracing
levels.
    void handlePSMM(PSMMMsg *psmm)
    {
        // Received PSMM Message
        TRACE(MESSAGE_RCVD, "Received PSMM Message");
        // Processing of PSMM Message
        if (psmm->pilotPN != refPN) {
            // ERROR Case
            TRACE(DATA_ERROR, "Invalid pilotPN value received in PSMM");
            return;
        }
        // Sending HODir Message
        sduIwm->sendMessage(hoDir);
        TRACE(MESSAGE_SENT, "Sent HODir Message");
        return;
    }
33- The general steps involved in using the multiple
tracing levels pattern:
- Define the different tracing levels required for
testing.
- Define a TRACE utility that has the following
features:
  - Provides a software utility (the TRACE macro in the
    code fragment shown) that takes as its input a trace
    level and additional output information.
  - Maintains a TRACE MASK that represents the
    current trace levels that are enabled for
    recording data. The TRACE MASK must be
    dynamically modifiable by the tester.
  - If the input trace level of a TRACE statement is
    enabled in the current TRACE MASK, the TRACE
    utility will record the output information
    associated with the TRACE statement.
  - The information recorded should include at least
    a time-stamp and information about what event has
    occurred.
- Instrument the software being tested with
TRACE statements at appropriate locations. The
locations and what TRACE levels to use must be
determined based on an up-front analysis during
the development phase of the lifecycle.
- Run the tests after setting the appropriate
TRACE levels in the application.
- Analyze the data recorded for the test results.
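The TRACE utility described above can be sketched in a few lines. This is an illustration in Python rather than the macro used in the code fragment; the `Tracer` class, the bit values assigned to the trace levels, and the in-memory record list are assumptions.

```python
import time

# Trace levels as bit flags (assumed values).
MESSAGE_RCVD = 0x1
MESSAGE_SENT = 0x2
DATA_ERROR = 0x4


class Tracer:
    def __init__(self, mask=0, clock=time.monotonic):
        self.mask = mask          # TRACE MASK: currently enabled levels
        self.clock = clock        # source of time-stamps
        self.records = []

    def set_mask(self, mask):
        """Let the tester change the enabled levels while the system runs."""
        self.mask = mask

    def trace(self, level, text):
        """Record (time-stamp, level, text) only if the level is enabled."""
        if level & self.mask:
            self.records.append((self.clock(), level, text))


tracer = Tracer(mask=DATA_ERROR)
tracer.trace(MESSAGE_RCVD, "Received PSMM Message")   # filtered out
tracer.trace(DATA_ERROR, "Invalid pilotPN value")     # recorded
tracer.set_mask(MESSAGE_RCVD | DATA_ERROR)
tracer.trace(MESSAGE_RCVD, "Received PSMM Message")   # recorded now
print([text for _, _, text in tracer.records])
```

Checking the mask before building the record keeps disabled TRACE statements cheap, which is the point of the pattern.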
34Partial Buffer Recording Pattern
- Sometimes systems receive packets or events that
contain a large amount of data.
- Recording all of this data is expensive and may
not be feasible.
- The partial buffer recording strategy focuses on
recording only part of the buffer/data received
for testing.
- The type of information recorded depends on the
application and needs to be determined as part of
the test case development.
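As a simple illustration, a recorder might keep only the length of a packet and its first few bytes. Which part is worth keeping is application specific; the choice of the leading 8 bytes and the name `record_partial` are assumptions.

```python
def record_partial(packet, keep_bytes=8):
    """Keep the total length and only the first keep_bytes of the payload."""
    return {"length": len(packet), "head": bytes(packet[:keep_bytes])}


packet = bytes(range(256)) * 4          # a 1024-byte packet
print(record_partial(packet))
```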
35Event Throttle Pattern
- Sending information about all events being
generated can be costly and take up valuable
bandwidth.
- Event throttling refers to reducing the number
of events being reported by implementing some
throttling mechanism.
- The mechanism should have some intelligence as to
which events are required to be reported and which
events can be dropped without losing relevant
information.
- It is application specific and can be designed
with the semantics of the events being reported
in mind.
36Event Throttle Pattern (Contd)
- The Event Throttling Layer receives events from
the WCCU, CBR and TFU, which are three different
cards.
- It filters out the events that need not be reported
to the Event Server.
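One possible throttling rule is to suppress repeats of the same event from the same card within a minimum interval. This rule, the `EventThrottle` class, and the event name `link_alarm` are assumptions for illustration; the slides do not say which filtering rule the Event Throttling Layer applies.

```python
class EventThrottle:
    """Drops repeats of the same event that arrive too soon after a report."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self._last_reported = {}

    def should_report(self, source, event, now_s):
        key = (source, event)
        last = self._last_reported.get(key)
        if last is not None and now_s - last < self.min_interval_s:
            return False            # throttled: reported too recently
        self._last_reported[key] = now_s
        return True


throttle = EventThrottle(min_interval_s=5)
arrivals = [("WCCU", "link_alarm", 0), ("WCCU", "link_alarm", 1),
            ("CBR", "link_alarm", 2), ("WCCU", "link_alarm", 6)]
print([a for a in arrivals if throttle.should_report(*a)])
```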
37Event Generation Patterns
- Event Tracker Pattern
- Event Stream Generator
38Event Generation Patterns
- Event Tracker Pattern
- Give each event a unique identifier so
that it can be traced from the sender to the
receiver.
- A monotonically increasing
sequence number is one example.
- The unique identifier needs to be
accessible from the event on all elements and can
then be used to record information about the
event.
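A sequence number makes it possible to pair the sender's log with the receiver's log. The sketch below assumes both logs map sequence numbers to timestamps taken from comparable clocks, which may not hold in a real test bed; the function names and values are illustrative.

```python
from itertools import count

_sequence = count(1)


def make_event(payload):
    """Attach a monotonically increasing sequence number to an event."""
    return {"seq": next(_sequence), "payload": payload}


def transit_times(sent_at, received_at):
    """Per-event transit time for every sequence number seen on both sides."""
    return {seq: received_at[seq] - t
            for seq, t in sent_at.items() if seq in received_at}


first, second = make_event("a"), make_event("b")
sent_at = {first["seq"]: 0.0, second["seq"]: 0.5}      # sender log
received_at = {first["seq"]: 0.25}                     # receiver log
print(transit_times(sent_at, received_at))
print(sorted(set(sent_at) - set(received_at)))         # events never received
```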
39Event Stream Generator
- To test certain requirements, it is necessary to
be able to control the stream of
events being generated, e.g., to generate them at
a certain frequency.
- An Event Stream Generator is a component with the
following abilities:
  - Ability to generate events of a particular type
    starting at a certain time and ending at a
    certain time at a specified rate.
  - Ability to specify the number of times an event
    must be generated.
  - Ability to change certain parameters within the
    events being generated.
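The three abilities listed above map naturally onto the parameters of a generator function. The following is a sketch under assumptions: the name `event_stream`, the event dictionary layout, and the `cell` parameter are invented for illustration, and the generator only computes the schedule rather than sending anything.

```python
def event_stream(event_type, start_s, stop_s, rate_hz, max_events=None, **params):
    """Yield events of one type from start_s to stop_s at rate_hz.

    max_events caps the number of events; params are copied into each event.
    """
    period_s = 1.0 / rate_hz
    n = 0
    while max_events is None or n < max_events:
        time_s = start_s + n * period_s
        if time_s > stop_s:
            break
        yield {"type": event_type, "time_s": time_s, "seq": n, **params}
        n += 1


for event in event_stream("Connect", start_s=0.0, stop_s=1.0, rate_hz=4, cell=7):
    print(event)
```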
40Event Monitoring and Analysis Patterns
- Active Monitoring Patterns: the test suites query
the systems for required data during test
execution and process the returned data.
  - Continuous Polling
  - Discrete Polling
  - Periodic Polling
41Continuous Polling
- Consider a requirement which specifies that a
certain event or state change must occur within a
certain time after some other state change or
event has occurred
- E.g., If magnet abort/inhibit therapy control
is enabled and if magnet is applied for at least
1 second, then the magnet task shall place the
device if it is in therapy or sense into
temporary detect mode within 3 seconds of the
start of the magnet application.
42
    teststart("Test Temporary Detect Mode");
    /* Set device to the default state */
    nominals();
    report("\n"
           "Requirement being tested: If Magnet abort/inhibit therapy control is enabled and if magnet\n"
           "is applied for at least 1 second, then the magnet task shall place the device if it is in Therapy\n"
           "or Sense into Temporary Detect Mode within 3 seconds of the start of the magnet application\n"
           "\n");
    /* Enable magnet abort/inhibit therapy control */
    SetMagTherapyControl(ENABLED);
    /* Place the Device in Therapy Mode */
    prog_perm_device_mode(THERAPY);
    status = FAILURE;
    currtime = time();
    /* exptime is time before which mode must become Temporary Detect Mode */
    exptime = currtime + 3000;
    /* Apply Magnet */
    ApplyMagnet(1.5);   // apply magnet for 1.5 seconds
    /* Check if Device is in Temporary Detect Mode through continuous polling. */
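The continuous polling step that the script above leads up to can be sketched as a tight loop that reads the device mode until the expected mode appears or the expiry time passes. The sketch is in Python and uses a simulated device; `poll_continuously`, `SimulatedDevice`, and the cost of 1 ms per poll are assumptions, not part of the original test script.

```python
def poll_continuously(read_mode, expected_mode, now_ms, deadline_ms):
    """Poll back-to-back until the mode is seen or the deadline has passed."""
    while True:
        mode = read_mode()
        if now_ms() > deadline_ms:
            return "FAILURE"
        if mode == expected_mode:
            return "SUCCESS"


class SimulatedDevice:
    """Stand-in for the device: each poll costs 1 ms of simulated time."""

    def __init__(self, switches_at_ms):
        self.clock_ms = 0
        self.switches_at_ms = switches_at_ms

    def now_ms(self):
        return self.clock_ms

    def read_mode(self):
        self.clock_ms += 1
        return "DETECT" if self.clock_ms >= self.switches_at_ms else "THERAPY"


for switches_at in (1200, 5000):
    dev = SimulatedDevice(switches_at)
    print(switches_at,
          poll_continuously(dev.read_mode, "DETECT", dev.now_ms, 3000))
```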
43Discrete Polling
- The polling is done only at discrete instants.
This scheme reduces the overhead of polling, by
determining certain time points at which the
requirement being tested can be conclusively
found to have been satisfied or violated.
- E.g., If magnet abort/inhibit therapy control
is enabled and if magnet is applied for at least
1 second, then the magnet task shall place the
device if it is in therapy or sense into
temporary detect mode within 3 seconds of the
start of the magnet application.
44
    teststart("Test Temporary Detect Mode");
    nominals();   /* Set device to the default state */
    report("\n"
           "Requirement being tested: If Magnet abort/inhibit therapy control is enabled and if magnet\n"
           "is applied for at least 1 second, then the magnet task shall place the device if it is in Therapy\n"
           "or Sense into Temporary Detect Mode within 3 seconds of the start of the magnet application\n");
    SetMagTherapyControl(ENABLED);    /* Enable magnet abort/inhibit therapy control */
    prog_perm_device_mode(THERAPY);   /* Place the Device in Therapy Mode */
    status = FAILURE;
    ApplyMagnet(1.5);   // apply magnet for 1.5 seconds
    /* Use Discrete Polling at 1 second intervals for 3 seconds to see if required change has happened. */
    while (TRUE)
    {
        wait(1);
        if (read_curr_device_mode() == DETECT)
        {
            status = SUCCESS;
            break;
        }
        wait(1);
        if (read_curr_device_mode() == DETECT)
45Periodic Polling
- Periodic polling is just a variant of the discrete
polling scheme. In this scheme, instead of
polling at certain discrete instants as in
discrete polling, polling is done at periodic
intervals.
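A periodic poller differs from the continuous one only in that it waits a fixed period between reads. The sketch below takes the `wait` function as a parameter so that it can be replaced by a no-op in a dry run; the function name, the return value, and the simulated mode sequence are assumptions.

```python
def poll_periodically(read_mode, expected_mode, wait, period_s, limit_s):
    """Poll once per period; give up once limit_s seconds have elapsed."""
    elapsed_s = 0
    while elapsed_s < limit_s:
        wait(period_s)
        elapsed_s += period_s
        if read_mode() == expected_mode:
            return "SUCCESS", elapsed_s
    return "FAILURE", elapsed_s


# Dry run: the simulated device reaches the expected mode on the second poll.
modes = iter(["THERAPY", "DETECT", "DETECT"])
waits = []
print(poll_periodically(lambda: next(modes), "DETECT", waits.append,
                        period_s=1, limit_s=3))
print(waits)
```

With a period of 1 second the mode change is detected up to 1 second late, which is the price paid for the lower polling overhead.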
46Consequences of Active Monitoring Schemes
- All polling schemes impose an overhead on both
the poller and the pollee. Therefore they must be
used only if they do not cause any significant
intrusive impact on the system behavior.
- Since the system is continuously polled, the
detection of any significant state change is done
with minimal delay in a continuous polling
scheme. The test case execution is therefore
faster, whereas in discrete and periodic polling
schemes, the detection can take longer. On the
other hand, the overhead of the continuous polling
scheme is the highest among all three
schemes.
- They are relatively simple to implement.
47Passive Monitoring Patterns
- In passive patterns, test suites wait for the
system being tested to report events and use such
data for analysis.
- Requirement: The Activate Traffic Confirm message
must be received within 500 ms after sending an
Activate Traffic Channel message.
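With passive monitoring the check is done on the events the system reports, not on data the test suite fetches. A minimal sketch, assuming the reported events arrive as time-ordered `(timestamp_ms, message_name)` pairs; the function name, message identifiers, and log values are made up.

```python
def confirm_arrived_in_time(reported, request, confirm, limit_ms):
    """reported: (timestamp_ms, message_name) pairs in the order received."""
    sent_at = None
    for timestamp_ms, name in reported:
        if name == request and sent_at is None:
            sent_at = timestamp_ms
        elif name == confirm and sent_at is not None:
            return timestamp_ms - sent_at <= limit_ms
    return False        # request or confirmation never reported


# Made-up event log as reported by the system under test.
log = [(1000, "ActivateTrafficChannel"), (1350, "ActivateTrafficConfirm")]
print(confirm_arrived_in_time(log, "ActivateTrafficChannel",
                              "ActivateTrafficConfirm", limit_ms=500))
```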
48Applications of Timing Constraint Testing
- Software architecture enhancements: some examples
of software architecture changes that can be
considered to improve timing include
  - Threading model changes (single-threaded vs.
    multithreaded)
  - Use of different process models (process per
    request vs. process pool vs. single process)
  - Relieving data contention problems by careful use
    of synchronization mechanisms.
  - Use of different communication mechanisms to
    exchange data (CORBA vs. RPC vs. TCP/IP) based on
    their performance numbers.
49- Hardware/Platform changes: some of the
hardware/platform changes that can be considered
to enhance performance include
  - Adding more processing power (better
    processors, more processors and more memory).
    However, this is not easy to do in some cases
    because it is a major architectural change, and for
    other reasons such as space constraints
    on boards and availability of appropriate
    processors with the right operating systems.
  - Adding faster devices.
50- Transport changes
  - If transport links between elements are
    considered to be bottlenecks for performance,
    then more bandwidth can be added to the links.
- Algorithm improvements
  - Algorithms used in systems play an important
    role in how a system performs timing-wise. For
    instance, in wireless systems, radio link
    protocol (RLP) algorithms determine how fast a
    system can recover from lost packets transmitted
    to or from a mobile system. Algorithm
    enhancements therefore are a means to improve
    system behavior.
51- Field usage guidelines
  - Sometimes, lessons are learned in the labs with
    regard to the timing constraints that need to be
    used in the field to maximize system throughput.
  - For instance, use of a particular algorithm
    under certain loading conditions can improve the
    timing behavior compared to an alternative. If
    the algorithms are dynamically configurable, then
    the field usage guidelines can specify the
    different cases when the different algorithms
    need to be used.
  - The operators in the field can thus program the
    system dynamically to use the specific algorithms
    at appropriate times.
52- Link Budget Refinement
  - Testing in a lab is a means to gather data
    about how much time is spent on different phases
    of an operation. Such data can be used to refine
    link budget estimations, i.e., determining how
    much time is reasonable for a certain phase.
  - Based on this, certain timeout values can be
    adjusted to more closely reflect the system's
    actual behavior. This can be used to avoid
    spurious timeouts that may otherwise happen if
    unrealistic numbers are used. Link budget
    refinement (the process of adjusting the time
    allotted to different phases or links of an
    operation) is an important by-product of testing
    timing constraints.
53- Requirement Changes
  - In many cases, feedback from testing results
    forms a useful input to writing the requirements
    for future products.
  - For a new product line with no existing
    architecture it is difficult to estimate how much
    time a certain operation or any of
    its phases would take.
  - Although requirements may specify certain
    numbers and the system architecture tries to meet
    those requirements, the numbers may turn out
    to be unrealistic or unachievable in some cases
    and significantly more than what
    is really needed in others.
54Practical approach to solve timing problems - Scheduling
- In practice, we generally have 2 major kinds of
tasks
  - Periodic tasks
  - Aperiodic tasks
- Implanting an embedded system
- Make demonstration
- Emergency state
- Re-programming/Re-loading
- Sets parameters
- Low battery
- ...
55Practical approach to solve timing problems - Scheduling (Contd)
- How do we schedule tasks in practice?
- Divide each task (almost all the tasks) into
slices; the CPU then schedules all the tasks by
time slices
56Practical approach to solve timing problems - Scheduling (Contd)
- Consequence for scheduling
  - Each task needs a priority
  - The CPU will schedule the task with the highest
    priority
  - The timing information of each task is estimated
  - The CPU is usually powerful enough that we do not
    have too many timing problems
- Concurrent behavior / tasks
  - Different completion / arrival order: A → B / B → A
  - If order is important, say A must precede B,
    you must give A a higher priority than B
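The effect of priorities on ordering can be seen in a small simulation of time-sliced, strict-priority scheduling. This is a simplified sketch: it assumes all tasks are ready at time zero, that a lower number means a higher priority, and that run times are known exactly; the function name and task values are made up.

```python
import heapq


def run_by_priority(tasks, slice_ms):
    """tasks: (priority, name, run_time_ms); a lower number is a higher priority.

    Returns the order in which time slices are granted.
    """
    ready = list(tasks)
    heapq.heapify(ready)
    granted = []
    while ready:
        priority, name, remaining_ms = heapq.heappop(ready)
        granted.append(name)
        remaining_ms -= slice_ms
        if remaining_ms > 0:
            heapq.heappush(ready, (priority, name, remaining_ms))
    return granted


# A has the higher priority, so it completes before B gets its first slice.
print(run_by_priority([(2, "B", 10), (1, "A", 20)], slice_ms=10))
```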