1
Watchdog Timers
  • Jeffrey Schwentner
  • EEL6897, Fall 2007

2
Software Reliability
  • Embedded systems must be able to cope with both
    hardware and software anomalies to be truly
    robust.
  • In many cases, embedded devices operate in total
    isolation and are not accessible to an operator.
  • Manually resetting a device in this scenario when
    its software hangs is not possible.
  • In extreme cases, this can result in damaged
    hardware or loss of life, and can incur
    significant cost.

3
The Clementine
  • In 1994, a deep space probe, the Clementine, was
    launched to make observations of the moon and a
    large asteroid (1620 Geographos).
  • After months of operation, a software exception
    caused a control thruster to fire for 11 minutes,
    which depleted most of the remaining fuel and
    caused the probe to rotate at 80 RPM.
  • Control was eventually regained, but it was too
    late to successfully complete the mission.

4
Watchdog Timers
  • While it is not possible to cope with all
    hardware and software anomalies, the developer
    can use watchdog timers to help mitigate the
    risks.
  • A watchdog timer is a hardware timing device that
    triggers a system reset, or similar operation,
    after a designated amount of time has elapsed.
  • A watchdog timer can be either a stand-alone
    hardware component or built into the processor
    itself.
  • To avoid a reset, an application must
    periodically reset the watchdog timer before this
    interval elapses. This is also known as
    "kicking the dog."
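
The kick-or-reset behavior described above can be modeled on a host machine. The following is a minimal simulation sketch, not driver code: `wdt_t`, `wdt_init`, `wdt_kick`, and `wdt_tick` are hypothetical names, time advances in abstract ticks, and the timeout is assumed to be greater than zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watchdog timer (not a real driver). */
typedef struct {
    uint32_t timeout_ticks;  /* interval allowed between kicks (> 0) */
    uint32_t remaining;      /* ticks left before the reset fires */
    bool     reset_fired;    /* latched once the interval elapses */
} wdt_t;

void wdt_init(wdt_t *w, uint32_t timeout_ticks)
{
    w->timeout_ticks = timeout_ticks;
    w->remaining = timeout_ticks;
    w->reset_fired = false;
}

/* "Kicking the dog": reload the countdown before it reaches zero. */
void wdt_kick(wdt_t *w)
{
    if (!w->reset_fired)
        w->remaining = w->timeout_ticks;
}

/* Advance time by one tick; returns true once the watchdog has fired. */
bool wdt_tick(wdt_t *w)
{
    if (!w->reset_fired && --w->remaining == 0)
        w->reset_fired = true;
    return w->reset_fired;
}
```

In this model a kick that arrives after the countdown has expired has no effect, mirroring the fact that real hardware would already have reset the system.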

5
External Watchdogs
  • External watchdog timers are integrated circuits
    that physically assert the reset pin of the
    processor.
  • The processor must assert an output pin in some
    fashion to reset the timing mechanism of the
    watchdog.
  • This type of watchdog is generally considered the
    most appropriate because of the complete
    independence of the watchdog from the processor.
  • Some external watchdogs feature a windowed
    reset.
  • Enforces timing constraints for a proper watchdog
    reset.
  • Minimizes likelihood of errant software resetting
    the watchdog.
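
A windowed reset can be illustrated with a small host-side model. This sketch assumes a window defined by a minimum and a maximum number of ticks since the last valid kick; the type and function names are hypothetical, and the behavior is a generic illustration rather than a model of any particular part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a windowed watchdog: a kick is valid only if it
 * arrives no earlier than min_ticks and no later than max_ticks after
 * the previous valid kick. */
typedef struct {
    uint32_t min_ticks;
    uint32_t max_ticks;
    uint32_t elapsed;      /* ticks since the last valid kick */
    bool     reset_fired;
} wwdt_t;

void wwdt_init(wwdt_t *w, uint32_t min_ticks, uint32_t max_ticks)
{
    w->min_ticks = min_ticks;
    w->max_ticks = max_ticks;
    w->elapsed = 0;
    w->reset_fired = false;
}

/* Advance time by one tick; missing the end of the window trips the reset. */
void wwdt_tick(wwdt_t *w)
{
    if (w->reset_fired)
        return;
    if (++w->elapsed > w->max_ticks)
        w->reset_fired = true;
}

/* A kick that arrives before the window opens also trips the reset. */
void wwdt_kick(wwdt_t *w)
{
    if (w->reset_fired)
        return;
    if (w->elapsed < w->min_ticks)
        w->reset_fired = true;
    else
        w->elapsed = 0;
}
```

Runaway code that kicks the watchdog in a tight loop fails the early-kick check, which is the case a simple timeout cannot catch.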

6
External Watchdog Schematic

7
Windowed Watchdog Operation
  • Maxim MAX6323

8
Internal Watchdogs
  • Many processors and microcontrollers have
    built-in watchdog circuitry available to the
    programmer.
  • This typically consists of a memory-mapped
    counter that triggers a non-maskable interrupt
    (NMI), or reset, when the counter reaches a
    predefined value.
  • Instead of kicking the watchdog via an I/O pin
    assertion, software resets an internal counter to
    an initial value.
  • Watchdog configuration is controlled by user
    software.
  • The watchdog may even be used as a
    general-purpose timer in some cases.

9
Internal Watchdog Considerations
  • Internal watchdogs are not as safe as watchdog
    circuits external to the processor.
  • Watchdogs that issue an NMI instead of a reset
    may not properly reinitialize the system.
  • Watchdog control registers may be inadvertently
    overwritten by runaway code, disabling the
    watchdog altogether.
  • Reset is limited to the processor itself (no
    outside peripherals).
  • To circumvent these issues, most built-in
    watchdogs have extra safety-steps designed to
    prohibit errant code from interfering with the
    operation of the watchdog timer.
  • On-chip solutions have a significant cost and
    space advantage over their external counterparts.

10
MSP430 Watchdog
  • The Texas Instruments MSP430 family of
    microcontrollers has a built-in 16-bit watchdog
    timer featuring:
  • Configurable clock source and prescaler
  • Two interrupt options (Reset or NMI)
  • Isolated watchdog counter
  • Access to the watchdog control register requires
    a unique binary code, or password.
  • The password must be written in the upper byte of
    the control register, together with the control
    bits (including the bit that clears the counter).
  • A write with an invalid password causes a
    security key violation, which resets the device.
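
The password mechanism can be sketched as a host-side model of a write to the watchdog control register. The bit values `WDTPW`, `WDTHOLD`, and `WDTCNTCL` match the definitions in TI's MSP430 headers; the `wdt_model_t` structure and `wdtctl_write` function are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in TI's MSP430 headers. */
#define WDTPW    0x5A00u  /* password, upper byte of every write */
#define WDTHOLD  0x0080u  /* stop the watchdog counter */
#define WDTCNTCL 0x0008u  /* clear (restart) the watchdog counter */

/* Hypothetical host-side model of the watchdog control logic. */
typedef struct {
    uint8_t  control;        /* lower byte: stored configuration bits */
    uint16_t counter;        /* watchdog counter */
    bool     key_violation;  /* set when a write used a wrong password */
} wdt_model_t;

/* Model of a 16-bit write to the watchdog control register. */
void wdtctl_write(wdt_model_t *m, uint16_t value)
{
    if ((value & 0xFF00u) != WDTPW) {
        m->key_violation = true;   /* real hardware would reset here */
        return;
    }
    if (value & WDTCNTCL)
        m->counter = 0;            /* kick: counter restarts from zero */
    /* WDTCNTCL is self-clearing, so it is not stored. */
    m->control = (uint8_t)(value & 0x00FFu & ~WDTCNTCL);
}
```

On a real device the equivalent kick is typically written as `WDTCTL = WDTPW | WDTCNTCL;`, with any configuration bits that should remain set included in the same write.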

11
MSP430 Watchdog Timer Block Diagram

12
Design Considerations
  • The effectiveness of the watchdog is a function
    of how it is used within the application
    software.
  • Simply issuing a watchdog reset in every
    iteration of the program loop may be
    insufficient.
  • Take a more proactive approach.
  • Periodically assess the state and health of the
    system. Only issue a reset if all processes are
    deemed normal.
  • Employ a state-based approach when resetting the
    watchdog timer.
  • Should a watchdog failure occur, provide an
    indication and/or capture debugging information.

13
System Health Assessment
  • As the size and complexity of software increases,
    so does the likelihood of introducing code that
    may be detrimental to the system.
  • Software may not be the only cause of a
    corrupted system state. A spike in the power
    supply, for example, may corrupt data in memory,
    or even system registers (program counter, stack
    pointer, etc.).
  • Check for things like stack overflows and
    validate memory wherever possible.
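
One common way to check for stack overflows is a guard region filled with a known pattern. The sketch below is a minimal illustration under stated assumptions: the pattern value and the function names are hypothetical, and on a real target the guard would be placed at the far end of the stack, where an overflow lands first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xA5A5A5A5u   /* hypothetical fill value */

/* Fill a guard region with a known pattern during initialization. */
void guard_init(uint32_t *guard, size_t words)
{
    for (size_t i = 0; i < words; i++)
        guard[i] = GUARD_PATTERN;
}

/* True if no word in the guard region has been overwritten. */
bool guard_intact(const uint32_t *guard, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        if (guard[i] != GUARD_PATTERN)
            return false;
    }
    return true;
}
```

A health check could call `guard_intact` before each kick and withhold the kick if the pattern has been disturbed.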

14
System Health Assessment
  • If the state of the system is compromised, let
    the watchdog timer perform the reset. This is a
    better approach than an application
    pseudo-reset.
  • Watchdog timers, themselves, can also adversely
    affect the system.
  • Setting a watchdog interval too short will
    generate a premature reset.
  • If a critical section of code takes 80
    milliseconds to complete, do not set the watchdog
    interval for 60 milliseconds.
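
The 80 ms versus 60 ms case can be expressed as a simple check. This is a sketch only: the percentage margin is a hypothetical design parameter, and both function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smallest acceptable watchdog interval: the measured worst-case
 * execution time plus a safety margin, rounded up to a whole
 * millisecond. The margin is a hypothetical design parameter. */
uint32_t wdt_min_interval_ms(uint32_t worst_case_ms, uint32_t margin_percent)
{
    return worst_case_ms + (worst_case_ms * margin_percent + 99u) / 100u;
}

/* True if the configured interval leaves the required headroom. */
bool wdt_interval_ok(uint32_t interval_ms, uint32_t worst_case_ms,
                     uint32_t margin_percent)
{
    return interval_ms >= wdt_min_interval_ms(worst_case_ms, margin_percent);
}
```

With a worst case of 80 ms and an assumed 25% margin, the minimum interval works out to 100 ms, so a 60 ms setting is rejected.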

15
State-based Watchdog
  • To guarantee that the software executes as
    intended, incorporate a simple state machine.
  • This involves adjusting a state variable at the
    beginning of a program iteration.
  • Prior to resetting the watchdog timer at the end
    of the program iteration, verify that the state
    is correct.
  • Prevents random code from wandering into the main
    loop and kicking the dog.
  • Enforces a constraint on program sequence.

16
State-based Watchdog Example
    // Advance the state; call once at the start of each iteration.
    void watchdog_state_advance(void)
    {
        g_usWatchdogState += 0x1111;
    }

    // Kick the dog only if the state advanced exactly once.
    void watchdog_state_validate(void)
    {
        g_usWatchdogStatePrev += 0x1111;
        if (g_usWatchdogState != g_usWatchdogStatePrev) {
            // State is invalid, allow watchdog to reset.
            SLEEP();
        } else {
            // Reset the watchdog timer.
            WDT_RESET();
        }
    }

Note: Repeated calls to the validate function
will cause a watchdog reset.

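
The state check can be exercised on a host with stand-ins for the hardware macros. This sketch assumes both state variables advance by a fixed increment of 0x1111, which is consistent with the note about repeated validate calls; `g_kicks` and `g_sleeps` are hypothetical counters that replace the real `WDT_RESET()` and `SLEEP()` behavior.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the hardware macros used on the slide. */
int g_kicks;    /* number of times the watchdog was reset */
int g_sleeps;   /* number of times the code gave up and slept */
#define WDT_RESET() (g_kicks++)
#define SLEEP()     (g_sleeps++)

uint16_t g_usWatchdogState;
uint16_t g_usWatchdogStatePrev;

/* Called once at the start of each loop iteration. */
void watchdog_state_advance(void)
{
    g_usWatchdogState += 0x1111;
}

/* Called once at the end of each loop iteration. */
void watchdog_state_validate(void)
{
    g_usWatchdogStatePrev += 0x1111;
    if (g_usWatchdogState != g_usWatchdogStatePrev)
        SLEEP();       /* state is invalid: let the watchdog expire */
    else
        WDT_RESET();   /* state is valid: kick the dog */
}
```

Once the two variables fall out of step they never realign, because every later iteration advances both by the same amount, so a single out-of-sequence call is enough to stop the kicks.
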
17
Debugging Information
  • If software detects a fault condition, log the
    error information prior to allowing the watchdog
    to reset the system.
  • Allows the cause of the failure to be addressed.
  • A report of the error should be attempted when
    the system resets (part of initialization
    perhaps).
  • In addition to reporting errors after reset, it
    is a good idea to indicate that the device has
    been reset.
  • Even if the software was unable to catch the
    error, the system can still report that a reset
    occurred.
  • Systems that appear sluggish may actually be
    experiencing frequent watchdog resets.
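
Logging a fault before the reset and reporting it afterwards can be sketched as follows. This is a host-side illustration under stated assumptions: the record is an ordinary global here, whereas on real hardware it would have to live in memory that survives a reset (for example no-init RAM or EEPROM). All names and the magic value are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAULT_MAGIC 0xDEADC0DEu   /* hypothetical marker for a valid record */

/* Hypothetical record kept across resets. */
typedef struct {
    uint32_t magic;        /* FAULT_MAGIC when fault_code is valid */
    uint32_t fault_code;   /* last fault stored before a reset */
    uint32_t reset_count;  /* number of boots seen so far */
} fault_record_t;

fault_record_t g_fault_record;

/* Call when a fault is detected, before letting the watchdog expire. */
void fault_log_store(uint32_t fault_code)
{
    g_fault_record.fault_code = fault_code;
    g_fault_record.magic = FAULT_MAGIC;
}

/* Call during initialization. Counts the reset and returns true if a
 * fault code was stored before it. */
bool fault_log_on_boot(uint32_t *fault_code_out)
{
    bool had_fault = (g_fault_record.magic == FAULT_MAGIC);
    if (had_fault)
        *fault_code_out = g_fault_record.fault_code;
    g_fault_record.magic = 0;      /* consume the record */
    g_fault_record.reset_count++;
    return had_fault;
}
```

A boot that finds no stored fault still increments the reset count, which covers the case where the software could not catch the error before the watchdog fired.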

18
Single-threaded Implementation
  • Single-threaded implementations should reset the
    watchdog timer in the main software loop.
  • To determine the proper watchdog timeout
    duration, the programmer must determine the
    amount of time it takes to execute the code,
    using worst case scenarios.
  • Many systems do not require tight timing.
  • In these cases, setting the timeout to a very
    large safe value may be acceptable, just to
    provide a protection against deadlocks.
  • Prior to resetting the watchdog, verify that the
    state of the system is valid, and system health
    is normal.

19
Single-threaded Example
    int main(void)
    {
        hwinit();
        for (;;) {
            watchdog_state_advance();
            read_sensors();
            control_motor();
            display_status();
            if (system_check() == S_OK) {
                // Kick the dog.
                watchdog_state_validate();
            } else {
                flash_led();
            }
        }
    }

20
Multi-threaded Implementation
  • The same concepts used in a single-threaded
    design are also applicable for multi-threaded
    implementations.
  • Avoid creating a thread that simply resets the
    watchdog timer at regular intervals.
  • Other threads could fail, and the watchdog thread
    would keep kicking the dog.
  • Generate a set of flags or data from each thread
    that can be validated in a monitoring thread.
  • The monitoring thread should reset the watchdog
    at regular intervals only if the data produced by
    the other threads is acceptable.
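
The flag-based approach can be sketched with one bit per worker thread. This is a minimal illustration using C11 atomics; the task names, `task_checkin`, `monitor_poll`, and the `g_kicks` counter (a stand-in for the real watchdog reset) are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One bit per worker thread (hypothetical task set). */
enum {
    TASK_SENSOR  = 1u << 0,
    TASK_MOTOR   = 1u << 1,
    TASK_DISPLAY = 1u << 2,
    TASK_ALL     = TASK_SENSOR | TASK_MOTOR | TASK_DISPLAY
};

atomic_uint g_alive_flags;   /* set by workers, cleared by the monitor */
unsigned g_kicks;            /* stand-in for the real watchdog reset */

/* Each worker thread calls this once per loop iteration. */
void task_checkin(unsigned task_flag)
{
    atomic_fetch_or(&g_alive_flags, task_flag);
}

/* The monitoring thread calls this once per period. */
bool monitor_poll(void)
{
    /* Read and clear the flags in a single atomic step. */
    unsigned seen = atomic_exchange(&g_alive_flags, 0u);
    if ((seen & TASK_ALL) != TASK_ALL)
        return false;        /* a task is missing: withhold the kick */
    g_kicks++;               /* all tasks reported: kick the dog */
    return true;
}
```

Because the flags are cleared on every poll, each worker has to report again within the next period before another kick is issued.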

21
Multi-threaded Monitoring
  • Diagram: Monitoring Task, System Tasks

22
Multi-threaded Frequency
  • An important criterion that can be used to
    validate the health of the system is the
    execution frequency of the worker threads.
  • This can be accomplished by incorporating a
    simple counter that is incremented on each
    iteration of a worker thread.
  • The monitoring thread then compares these
    counters with threshold values.
  • If the period of the monitoring task is
    significantly longer than that of the worker
    threads, the monitoring task can perform a
    threshold check on each counter.
  • This allows the software to validate timing
    constraints.

23
Multi-threaded Example System Threads
    void thread_read_sensor(void)
    {
        for (;;) {
            read_sensors();
            thread_sensor_cnt++;
            sleep(50);
        }
    }

    void thread_control_motor(void)
    {
        for (;;) {
            control_motor();
            thread_motor_cnt++;
            sleep(100);
        }
    }

    void thread_display_status(void)
    {
        for (;;) {
            display_status();
            thread_display_cnt++;
            sleep(125);
        }
    }

Note: Each thread maintains a unique execution
counter.

24
Multi-threaded Example Monitoring Thread
    int main(void)
    {
        hwinit();
        launch_threads();
        for (;;) {
            watchdog_state_advance();
            if (system_check() == S_OK &&
                thread_sensor_cnt > 18 &&
                thread_sensor_cnt < ... &&
                thread_motor_cnt > 8 &&
                thread_motor_cnt < ... &&
                thread_display_cnt > 6 &&
                thread_display_cnt < ...) {
                // Kick the dog.
                watchdog_state_validate();
            } else {
                flash_led();
                report_error(E_FAIL);
            }
            // Sleep monitoring task for 1 sec.
            sleep(1000);
        }
    }

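
The counter check in the monitoring loop can be sketched as a standalone function. Several details here are assumptions rather than part of the original slides: the lower bounds 18, 8, and 6 are treated as inclusive, the upper bounds (22, 12, 10) are hypothetical, and each counter is cleared after it is read so that it measures a single 1 s monitoring period. With sleeps of 50, 100, and 125 ms, the workers are expected to run about 20, 10, and 8 times per period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counters incremented by the worker threads (names from the slides). */
volatile uint32_t thread_sensor_cnt;
volatile uint32_t thread_motor_cnt;
volatile uint32_t thread_display_cnt;

/* Check one counter against an inclusive window, then clear it so the
 * next monitoring period starts from zero. */
bool counter_in_window(volatile uint32_t *cnt, uint32_t min, uint32_t max)
{
    uint32_t seen = *cnt;
    *cnt = 0;
    return seen >= min && seen <= max;
}

/* True only if every worker ran at roughly its expected rate during the
 * last monitoring period. Upper bounds are hypothetical. */
bool threads_healthy(void)
{
    /* Evaluate all three so every counter is cleared each period. */
    bool sensor  = counter_in_window(&thread_sensor_cnt, 18, 22);
    bool motor   = counter_in_window(&thread_motor_cnt, 8, 12);
    bool display = counter_in_window(&thread_display_cnt, 6, 10);
    return sensor && motor && display;
}
```

The read-then-clear step is not atomic in this sketch; a real multi-threaded system would need an atomic exchange or a critical section around it. An upper bound catches a thread that runs too often, which a minimum threshold alone would miss.
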
25
Mars Pathfinder
  • In July of 1997, a priority inversion occurred on
    the Mars Pathfinder mission, after the craft had
    landed on the Martian surface.
  • A high-priority bus management task was forced to
    wait on a mutex held by a low-priority
    meteorological data task, which was in turn
    preempted by a medium-priority communications
    task.
  • The timing of the software was compromised, and a
    system reset issued by its watchdog timer brought
    the system back to normal operating conditions.
  • On Earth, scientists were able to identify the
    problem and upload new code to fix it.
  • Thus, the rest of the 265 million dollar mission
    could be completed successfully.

26
Conclusion
  • Watchdog timers can add a great deal of
    reliability to embedded systems if used properly.
  • To do so requires a good overall approach.
    Resetting the watchdog timer must be part of the
    overall design.
  • Verify the operational integrity of the system,
    and use this as a criterion for resetting the
    watchdog timer.
  • In addition to validating that the software does
    the right thing, verify that it does so in the
    time expected.
  • Assume the software will experience a hardware
    malfunction or software fault. Add enough
    debugging information to help diagnose the
    situation.

27
Questions?

28
References
  • [1] Barr, M. (2001). Introduction to Watchdog
    Timers.
    http://www.netrino.com/Publications/Glossary/WatchdogTimer.php
  • [2] Barr, M. (2002). Introduction to Priority
    Inversion.
    http://www.netrino.com/Publications/Glossary/PriorityInversion.php
  • [3] Ganssle, J. (2004, January). Great Watchdogs.
    http://darwin.bio.uci.edu/sustain/bio65/Titlpage.htm
  • [4] Murphy, N. Watchdog Timers. Embedded Systems
    Programming.
    http://www.embedded.com/2000/0011/0011feat4.htm
  • [5] Maxim Integrated Products, Inc. (2005,
    December). Supervisory Circuits with Windowed
    (Min/Max) Watchdog and Manual Reset.
    http://datasheets.maxim-ic.com/en/ds/MAX6323-MAX6324.pdf
  • [6] Texas Instruments, Inc. (2006). MSP430x1xx
    Family User's Guide.
    http://focus.ti.com/lit/ug/slau049f/slau049f.pdf