Debugging with the TotalView Source Code Debugger

1
Debugging with the TotalView Source Code
Debugger
MIT March 6, 2008
Ed Hinkel Sales Engineer TotalView Technologies
2
Agenda
  • TotalView Technologies Intro
  • Source Code Debugging
  • - Setup
  • - Navigation
  • - Data View and Analysis
  • Memory Debugging
  • Parallel Debugging
  • Debugging Large Apps
  • Questions / Comments

3
TotalView Technologies Corporate Overview
  • The Most Experienced Technologists in Parallel
    Debugging
  • Technology originally developed at BBN in late
    80s
  • Developed from scratch specifically for debugging
    parallel applications
  • TotalView is recognized worldwide as the gold
    standard for debugging in multi-core, data
    intensive, high-performance, distributed, and
    clustered computing environments
  • The debugging leader in the HPC, EDU, and
    Commercial sectors
  • Founded as Etnus, Inc. in 1999, Renamed TotalView
    Technologies in 2007
  • 50 employees (heavily engineering influenced)
  • Over 1,400 customers in 55 countries
  • Over 10K developers with over 2 million cores
    under license
  • Award winning product line (Supercomputing
    Online's Product of the Year)

4
What Is TotalView?
5
What is TotalView?
  • A comprehensive debugging solution for demanding
    multi-core applications
  • C, C++, Fortran 77 and 90, UPC
  • Wide compiler platform support
  • Multi-threaded Debugging
  • Parallel Debugging
  • MPI, PVM, Others
  • Remote Debugging
  • Memory Debugging Capabilities
  • Integrated into the Debugger
  • Powerful and Easy GUI
  • Visualization
  • CLI for Scripting

6
Supported Compilers and Architectures
  • Platform Support
  • Linux x86, x86-64, ia64, Power
  • Mac Power and Intel
  • Solaris Sparc and AMD64
  • AIX, Tru64, IRIX, HP-UX ia64
  • Cray X1, XT3, XT4, IBM BGL, BGP, SiCortex
  • Languages / Compilers
  • C/C++, Fortran, UPC, Assembly
  • Many Commercial and Open Source Compilers
  • Parallel Environments
  • MPI (MPICH1 and MPICH2, LAM, Open MPI, poe, MPT,
    Quadrics, MVAPICH, and many others)
  • UPC

7
Architecture for Cluster Debugging
  • Single Front End (TotalView)
  • GUI
  • debug engine
  • Debugger Agents (tvdsvr)
  • Low overhead, 1 per node
  • Traces multiple rank processes
  • TotalView communicates directly with tvdsvrs
  • Not using MPI
  • Protocol optimization

Compute Nodes
Provides Robust, Scalable and efficient operation
with Minimal Program Impact
8
TotalView Basics: Startup, Process Control, and
Navigation
9
Starting TotalView
  • Normal
  • totalview [tv_args] prog_name [-a prog_args]
  • Attach to running program
  • totalview [tv_args] prog_name -pid PID [-a
    prog_args]
  • Attach to remote process
  • totalview [tv_args] prog_name -remote name [-a
    prog_args]
  • Attach to a core file
  • totalview [tv_args] prog_name corefile_name
    [-a prog_args]

Command Line
GUI
10
Interface Concepts
  • Root Window
  • State of all processes being debugged
  • Process Window
  • Detailed state of a single process
  • Thread within a process
  • Point of control
  • Control the process and possibly other related
    processes

11
TotalView Root Window
Host name
Hierarchical/ Linear Toggle
Rank (if MPI program)
TotalView Thread ID
Expand - Collapse Toggle
Action Point ID number
Process Status
12
Process Window Overview
Toolbar
Stack Trace Pane
Stack Frame Pane
Source Pane
Tabbed Area
13
Stack Trace and Stack Frame Panes
Language
Name
Function Pointer
14
Source Code Pane
15
Viewing Source Code
  • TV always tries to display source code
  • If it cannot you will see assembly
  • -g puts symbol table and source code line
    number info into your application
  • These are references, usually by relative path
    from the object file to source file
  • TV takes the basename and the path
  • TotalView will first try to use this info to find
    the source file
  • Then it will search a TV search path for the
    basename
  • Paths can be set via tree function
  • CLI variables provides for setting source search
    paths - see documentation for details
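
The lookup order described on this slide can be sketched as a small function. This is only a toy model of the idea, not TotalView's implementation: the function name `find_source`, its parameters, and the injectable `exists` check are all hypothetical.

```python
import posixpath


def find_source(recorded_path, object_dir, search_path, exists):
    """Return the first existing candidate for a source file, or None.

    recorded_path: path stored in the debug info (often relative).
    object_dir:    directory of the object file (hypothetical parameter).
    search_path:   list of directories to try for the basename.
    exists:        callable that says whether a path exists.
    """
    # Step 1: try the recorded path, resolved against the object directory.
    # posixpath.join keeps recorded_path unchanged when it is absolute.
    candidate = posixpath.normpath(posixpath.join(object_dir, recorded_path))
    if exists(candidate):
        return candidate

    # Step 2: fall back to the basename in each search directory, in order.
    basename = posixpath.basename(recorded_path)
    for directory in search_path:
        candidate = posixpath.join(directory, basename)
        if exists(candidate):
            return candidate

    # Nothing found: a debugger would show assembly at this point.
    return None
```

Passing `exists` as a parameter keeps the sketch testable without touching the real file system; in practice it could be `os.path.isfile`.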

16
Debugging Assembly Code
Display/Debug Source, Assembly or Both
17
Process Status
18
Stepping Commands
Based on PC location
19
Basic Process Control
Automatic Grouping
  • Control Group
  • All the processes created or attached together
  • Share Group
  • All the processes that share the same image
  • Workers Group
  • All the processes and threads that are not
    recognized as manager or service processes or
    threads
  • Lockstep Group
  • All threads at the same PC
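
As an illustration of the lockstep idea (threads that share a program counter form one group), here is a minimal sketch. The thread ID strings and the function name are assumptions made for the example; the debugger's real bookkeeping is not shown in the slides.

```python
from collections import defaultdict


def lockstep_groups(thread_pcs):
    """Group thread IDs by program counter.

    thread_pcs maps a thread ID (hypothetical "process.thread" strings)
    to the PC value at which that thread is currently stopped.
    """
    groups = defaultdict(list)
    for thread_id, pc in thread_pcs.items():
        groups[pc].append(thread_id)
    # Sort each group so the result does not depend on input order.
    return {pc: sorted(ids) for pc, ids in groups.items()}
```

Stepping a lockstep group then means applying the same step command to every thread in one of these lists.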

20
Finding Functions, Variables, and Source Files
Menu: View > Lookup
Accelerator keys: f, v
Closest-match search results
21
Action Points
  • Breakpoints
  • Barrier Points
  • Conditional Breakpoints
  • Evaluation Points
  • Watchpoints

22
Setting Breakpoints
  • Setting action points: single left-click on
    outlined source code line numbers
  • Action Points Tab: lists all action points; Dive
    on an action point to focus it in the source pane
  • Action point properties: context menu when
    right-clicking the action point
  • Deleting action points: left-click in the Source
    Pane, or context menu in the Source Pane /
    Action Points Tab
  • Disabling action points: context menu, or
    left-click in the Action Points Tab
  • Saving all action points: Action Point > Save All
23
Setting Breakpoints
24
Conditional Breakpoint
25
Evaluation Points
  • Generalization of Conditional Breakpoints
  • C/C++ or Fortran
  • Call functions
  • Set variables
  • Test conditions
  • Test small source code patches
  • Help set up program circumstances

26
Test Fixes on the Fly
27
Watchpoints
  • Use Tools > Watchpoint from a Variable Window.
  • Watchpoints are set on a fixed memory region.
  • When the contents of watched memory change, the
    watchpoint is triggered and TotalView stops the
    program.
  • Watchpoints are not set on a variable, so you
    need to be aware of the variable's scope.
  • Watchpoints can be conditional or unconditional.
  • Use the intrinsic variables $newval and $oldval
    in the conditional expression.
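
The point that a watchpoint follows a memory region rather than a variable can be shown with a toy model. Everything here is an assumption for illustration: "memory" is a plain `bytearray`, and the class and method names are made up; the debugger itself relies on its own mechanisms.

```python
class Watchpoint:
    """Toy watchpoint over a fixed byte range of a bytearray."""

    def __init__(self, memory, start, length, condition=None):
        self.memory = memory
        self.start = start
        self.length = length
        # Optional callable(oldval, newval) -> bool.
        self.condition = condition
        # Remember the current contents of the watched region.
        self.oldval = bytes(memory[start:start + length])

    def check(self):
        """Return True if the program should stop."""
        newval = bytes(self.memory[self.start:self.start + self.length])
        if newval == self.oldval:
            return False  # watched bytes are unchanged
        oldval, self.oldval = self.oldval, newval
        if self.condition is None:
            return True  # unconditional: any change stops the program
        return bool(self.condition(oldval, newval))
```

Because the region is fixed, writes outside it never trigger the watchpoint, and a different variable that later occupies the same addresses would trigger it, which is why scope matters.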
28
Using Set PC to Replay Code
29
Help System
  • Context sensitive buttons on many dialog windows
  • Help menu in the main windows
  • Launches an html browser
  • Navigate or search the full content
  • Also available in pdf and hard copy
  • Check out the tip of the week archive

30
TotalView Documentation
31
TotalView Basics: Viewing and Editing Data
32
Diving on Variables
  • You can use Diving to
  • get more information
  • to open a variable in a Variable Window.
  • to chase pointers in complex data structures.
  • You can Dive on
  • variable names to open a variable window
  • function names to open the source in the
    Process Window.
  • processes and threads in the Root Window.
  • How do I Dive?
  • Double-click the left mouse button on selection
  • Single-click the middle mouse button on
    selection.
  • Select Dive from context menu opened with the
    right mouse button

33
Diving on Variables
34
Undiving
  • In a Process Window: retrace the path that has
    been explored with multiple dives.
  • In a Variable Window: replace contents with the
    previous contents.
  • You can also remove changes in the Variable
    Window with Edit > Reset Default.
35
Dive in All
Dive in All displays an element in an array of
structures as if it were a simple array.
36
The Variable Window
  • Editing Variables
  • Click once on the value
  • Cursor switches into edit mode
  • Esc key cancels editing
  • Enter key commits a change
  • Editing values changes the memory of the program
  • Window contents are updated automatically
  • Changed values are highlighted
  • Last Value column is available

37
Expression List Window
Add to the expression list using contextual menu
with right-click on a variable, or by typing an
expression directly in the window
38
Expression List Window
  • Reorder, delete, add
  • Sort the expressions
  • Edit expressions in place
  • Dive to get more info
  • Updated automatically
  • Expression-based
  • Simple values/expressions
  • View just the values you want to monitor

39
Four Ways to Look at Variables
  • Glance
  • Stack frame
  • Hover
  • Source pane
  • Dive to data window
  • Source, Stack or Variable Window
  • Arrays, structures, explore
  • Monitor via expression list
  • Source, Stack or Variable window
  • Keep an eye on scalars and expressions

40
Viewing Arrays
41
Slicing Arrays
  • Slice notation is [start:end:stride]
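
To make the start, end and stride triple concrete, the sketch below lists which indices a slice would select. It assumes the end index is inclusive (as in Fortran array sections) and handles positive strides only; both points are assumptions to check against the TotalView documentation, and the function name is hypothetical.

```python
def slice_indices(start, end, stride=1):
    """Indices selected by a start/end/stride slice, end inclusive.

    Assumption: the upper bound is part of the slice. Only positive
    strides are modelled in this sketch.
    """
    if stride < 1:
        raise ValueError("this sketch only handles positive strides")
    return list(range(start, end + 1, stride))


def take_slice(values, start, end, stride=1):
    """Apply the slice to a Python list of array elements."""
    return [values[i] for i in slice_indices(start, end, stride)]
```

Slicing a large array down this way before visualizing it keeps the amount of data the debugger has to fetch small.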

42
Filtering Arrays
43
Visualizing Arrays
  • Visualize array data using Tools > Visualize from
    the Variable Window
  • Large arrays can be sliced down to a reasonable
    size first
  • Visualize is a standalone program
  • Data can be piped out to other visualization tools
  • Visualize lets you spin, zoom, etc.
  • Data is not updated with the Variable Window; you
    must re-visualize
  • $visualize() is a directive in the expression
    system, and can be used in evaluation point
    expressions.

44
Typecasting Variables
  • Edit the type of a variable
  • Changes the way TotalView interprets the data in
    your program
  • Does not change the data in your program
  • Often used with pointers
  • Type cast to a <void> or <code> type to snoop
    around in memory

Give TotalView a starting memory address and
TotalView will interpret and display your memory
from that location.
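
The idea that a cast changes only the interpretation of memory, not the memory itself, can be demonstrated outside the debugger with Python's standard `struct` module. The helper name `view_as` is made up for this sketch.

```python
import struct


def view_as(raw, fmt):
    """Interpret the same bytes under a different struct format."""
    return struct.unpack(fmt, raw)


# Four bytes holding the little-endian IEEE 754 float 1.0.
raw = struct.pack("<f", 1.0)

as_float = view_as(raw, "<f")    # read as one 32-bit float
as_uint = view_as(raw, "<I")     # read as one 32-bit unsigned integer
as_bytes = view_as(raw, "<4B")   # read as four individual bytes
```

All three views read the identical four bytes; `raw` is never modified, which mirrors how editing a type in the debugger leaves the program's data untouched.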
45
Type Casts Read from Right to Left
  • Examples
  • int[10]* Pointer to an array of 10 int
  • int*[10] Array of 10 pointers to int
  • int*[10]* Pointer to an array of 10 pointers to
    int
  • int[5]*[10] Array of 10 pointers to arrays of 5
    int
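
The right-to-left reading rule can be sketched as a tiny describer. It assumes cast strings consist of a base type name followed by `*` (pointer) and `[N]` (array) suffixes; that notation is an assumption here and should be checked against the TotalView documentation. The function name is hypothetical.

```python
import re


def describe_cast(type_text):
    """Describe a cast string by reading its suffixes right to left."""
    match = re.fullmatch(r"\s*([A-Za-z_]\w*)((?:\s|\*|\[\d+\])*)", type_text)
    if match is None:
        raise ValueError("unsupported type text: %r" % type_text)
    base, suffix = match.groups()
    tokens = re.findall(r"\*|\[\d+\]", suffix)

    words = []
    for token in reversed(tokens):  # rightmost suffix is read first
        if token == "*":
            words.append("pointer to")
        else:
            words.append("array %s of" % token[1:-1])
    words.append(base)
    return " ".join(words)
```

For example, under this assumed notation `describe_cast("int[5]*[10]")` reads `[10]`, then `*`, then `[5]`, then the base type.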

46
Typecasting Examples
  • Cast float * to float [100] * to see a dynamic
    array's values
  • Cast to built-in types like <string> to view a
    variable as a null-terminated string (automatic
    cast for char *)
  • Cast to <void> for no type interpretation or for
    displaying regions of memory
  • Cast to <code>[100] to see 100 instructions
    of disassembly
  • Cast to your own structs, objects, Fortran user
    defined types, common block definitions, etc.

47
STLView
  • STLView transforms templates into readable and
    understandable information
  • STLView supports std::vector, std::list,
    std::map, std::string
  • See doc for which STL implementations are
    supported

48
C++ Templates
TotalView understands your C++ templates and
gives you a choice ... Boxes with solid
lines around line numbers indicate locations with
replicated code
49
Managing Signals: File > Signals
  • Error: Stop the process and flag as error
  • Stop: Stop the process
  • Resend: Pass the signal to the target and do
    nothing; use with signal handlers
  • Ignore: Discard the signal
50
TotalView Basics: Memory Debugging
51
What is a Memory Bug?
  • A Memory Bug is a mistake in the management of
    heap memory
  • Failure to check for error conditions
  • Leaking: Failure to free memory
  • Dangling references: Failure to clear pointers
  • Memory Corruption
  • Writing to memory not allocated
  • Over running array bounds

52
The Agent and Interposition
Process
TotalView
User Code and Libraries
Heap Interposition Agent (HIA)
Allocation Table
Deallocation Table
Malloc API
53
TotalView HIA Technology
  • Advantages of TotalView HIA Technology
  • Use it with your existing builds
  • No Source Code or Binary Instrumentation
  • Programs run nearly full speed
  • Low performance overhead
  • Efficient memory usage
  • Low memory overhead
  • Support wide range of platforms and compilers
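
The interposition idea (an agent sits between user code and the malloc API and keeps allocation and deallocation tables) can be modelled in a few lines. This is a toy sketch under stated assumptions: `BumpAllocator` stands in for the real allocator, and every class, method and event name is hypothetical rather than part of the HIA.

```python
class BumpAllocator:
    """Stand-in for the real malloc API: hands out increasing addresses."""

    def __init__(self, base=0x1000):
        self.next_address = base

    def allocate(self, size):
        address = self.next_address
        self.next_address += max(size, 1)
        return address

    def release(self, address):
        pass  # a real allocator would recycle the block


class HeapAgent:
    """Toy interposition agent that records every allocation and free."""

    def __init__(self, allocate, release):
        self._allocate = allocate
        self._release = release
        self.allocations = {}    # address -> size of live blocks
        self.deallocations = {}  # address -> size of freed blocks
        self.events = []         # problems noticed along the way

    def malloc(self, size):
        address = self._allocate(size)
        self.allocations[address] = size
        self.deallocations.pop(address, None)  # address is live again
        return address

    def free(self, address):
        if address not in self.allocations:
            if address in self.deallocations:
                self.events.append(("double free", address))
            else:
                self.events.append(("free of unallocated address", address))
            return  # do not forward a bad free to the allocator
        self.deallocations[address] = self.allocations.pop(address)
        self._release(address)
```

Since the program calls the agent through the same interface it would use for the allocator, nothing in the program needs to be recompiled or instrumented, which is the advantage the slide lists.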

54
Memory Debugger Features
  • Automatically detect allocation problems
  • View the heap
  • Leak detection
  • Block painting
  • Dangling pointers
  • Deallocation/reallocation notification
  • Guard Blocks
  • Memory Comparisons between processes
  • Collaboration features

55
Heap Graphical View
56
Heap Graphical View
57
Leak Detection
  • Leak Detection
  • Based on Conservative Garbage Collection
  • Can be performed at any point in runtime
  • Helps localize leaks in time
  • Multiple Reports
  • Backtrace Report
  • Source Code Structure
  • Graphically Memory Location
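
Leak detection in the style of conservative garbage collection treats any word that looks like an address inside a heap block as a pointer, then reports blocks that nothing reaches. The sketch below is a simplified model with assumed data structures (dictionaries for blocks and memory words); it is not the product's algorithm.

```python
def find_leaks(blocks, roots, memory):
    """Return start addresses of blocks that are not reachable.

    blocks: {start_address: size} for every live heap block.
    roots:  word values found in registers, stacks and globals.
    memory: {address: word value} for words stored inside heap blocks.
    """

    def block_containing(value):
        # Conservative: interior pointers count, not just block starts.
        for start, size in blocks.items():
            if start <= value < start + size:
                return start
        return None

    reachable = set()
    pending = list(roots)
    while pending:
        start = block_containing(pending.pop())
        if start is None or start in reachable:
            continue
        reachable.add(start)
        # Scan the block's own words for further possible pointers.
        for address in range(start, start + blocks[start]):
            if address in memory:
                pending.append(memory[address])

    return sorted(set(blocks) - reachable)
```

Because the scan only needs a snapshot of the blocks, roots and memory, it can be run at any point while the program is stopped, which is how repeated reports help localize a leak in time.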

58
Leak Detection
59
Guard Blocks Memory Corruption
60
Guard Blocks Memory Corruption
61
Memory Comparisons
  • Diff live processes
  • Compare processes across cluster
  • Compare with baseline
  • See changes between point A and point B
  • Compare with saved session
  • Provides memory usage change from last run

62
Memory Status
63
MemoryScape
  • What is MemoryScape?
  • Streamlined
  • Lightweight
  • Intuitive
  • Collaborative
  • Memory Debugging
  • Features
  • Shows
  • Memory Errors
  • Memory Status
  • Memory Leaks
  • Bounds Violations
  • MPI Memory Debugging
  • Remote Memory Debugging
  • Tech
  • Low Overhead
  • No Instrumentation
  • Interface
  • Inductive
  • Collaboration
  • Multi-process

64
Script Mode
  • Automation Support
  • MemoryScape lets users run tests and check
    programs for memory leaks without having to be in
    front of the program
  • Simple command line program called memscript
  • Doesn't start up the GUI
  • Can be run from within a script or test harness
  • The user defines
  • What configuration options are active
  • What things MemoryScape is looking for
  • What actions MemoryScape should take for each
    type of event that may occur

65
TotalView Basics: Parallel Application Debugging
66
Challenges of Debugging in a Multi-Core Age
  • Concurrency
  • Stochastic errors are often many times harder to
    solve than others
  • Achieving reproducibility of race conditions,
    deadlocks, live-locks, and other concurrent bugs
    is the key
  • Precise thread-level control of all the processes
    in the distributed application
  • If you can't control the threads then you are
    simply hoping that the problem happens
  • If you can't reproduce the problem you can't
    easily pose questions about why
  • Constructs to enable problem reproduction
  • Scripting
  • Expression Evaluation
  • Visibility into all the relevant data
  • Thread specific stacks and thread-private
    variables
  • Easy ways to view complex data

67
Debugging Multithreaded Programs
  • When debugging multithreaded programs, you want
    to
  • Know where to look to get thread status.
  • Be able to switch the focus from one thread to
    another quickly and easily.
  • Understand how asynchronous thread control
    commands (step, go, halt) and breakpoints are
    used.
  • A parallel program has a lot more states than
    running or stopped. There are more degrees of
    freedom for program control. TotalView gives you
    a full set of features to manage this complexity.
    It is important to understand how the different
    classes of commands and features work so as to
    avoid confusion.

68
TotalView Provides
  • MPI-Aware Easy launch mechanisms
  • Seamless Parallel and Remote Debugging
  • Powerful Process Control Features
  • MPI Message Queue Display
  • High Degree of Scalability
  • Scriptability for unattended operation

69
Preparing for debugging
  • Compiling Your Application
  • Provide TotalView with debug symbols
  • Add '-g' to your compile line
  • Turn off optimizations
  • Remove any optimization flags (-O)

70
Starting an MPI Job Within TotalView
  • Indirect launch
  • Choose MPI implementation
  • Set parameters
  • Enable Memory Debugging
  • Indicate your MPI
  • Start from command line
  • mpirun -tv -np 4 my_program (mpich)
  • totalview poe -a -np 4 my_program

71
Running TotalView with SiCortex Applications
  • The TotalView Debugger runs as a cross-debugger
    within the SiCortex-MIPS Linux environment.
  • The SiCortex version is a 64-bit application that
    runs on an x86-64 system running a 64-bit kernel.
  • Debugging on SiCortex uses the remote features of
    TotalView

72
Running TotalView with SiCortex Applications
  • TVD needs to execute a command on the target
    system from the development host.
  • By default, this version uses the ssh -x command.
  • It is suggested that ssh be set up to allow
    password-less commands.
  • The program's executable file must be visible
    from both the development host and the target
    system.
  • Place the executables in a directory that is
    visible on both machines through the same path.
  • Having the executable visible in separate
    directories that are accessed through the same
    path on both machines will also work.

73
Running TotalView with SiCortex Applications
  • The SiCortex version of TotalView uses a
    different set of naming conventions, using an
    sc prefix
  • sctv8 instead of tv8
  • sctotalview instead of totalview
  • sctv8cli vs. tv8cli for the Command Line I/F

74
Running TotalView with SiCortex Applications
  • TVD must debug the MIPS version of srun, not the
    x86-64 version of srun.
  • TotalView can be Invoked as follows
  • sctv8 -r SiCortex_node ./srun -a srun_arguments
  • Via the GUI
  • Use the File > New Program dialog box.
  • Within the Parallel tab, select SiCortex from the
    pull-down list.
  • (This is the preferred way to start MPI programs
    from within TVD.)

75
Root Window with MPI
  • Status Info
  • T: Stopped
  • B: Breakpoint
  • E: Error
  • W: Watchpoint
  • R: Running
  • M: Mixed
  • Navigation
  • Dive to refocus
  • Dive anew to get a second process window

76
Call Graph
  • Quick view of program state
  • Each call stack is a path
  • Functions are nodes
  • Calls are edges
  • Labeled with the MPI rank
  • Construct process groups
  • Look for outliers
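
The call graph described here (functions as nodes, calls as edges, edges labeled with MPI ranks) can be built from per-rank call stacks as in this sketch. The input layout and function names are assumptions for illustration only.

```python
from collections import defaultdict


def build_call_graph(stacks):
    """Merge per-rank call stacks into rank-labeled edges.

    stacks: {rank: [outermost_function, ..., innermost_function]}
    Returns {(caller, callee): sorted list of ranks on that edge}.
    """
    edges = defaultdict(set)
    for rank, frames in stacks.items():
        # Each adjacent pair of frames is one call edge on this rank's path.
        for caller, callee in zip(frames, frames[1:]):
            edges[(caller, callee)].add(rank)
    return {edge: sorted(ranks) for edge, ranks in edges.items()}


def outlier_edges(graph):
    """Edges taken by exactly one rank, a simple notion of 'outlier'."""
    return [edge for edge, ranks in graph.items() if len(ranks) == 1]
```

The rank lists on each edge are also a natural starting point for constructing process groups, for example "all ranks currently inside a given function".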

77
Process Control Concepts
  • Each process window is always focused on a
    specific process.
  • Process focus can be easily switched
  • Processes can be held - they will not run till
    unheld.
  • Breakpoints can be set to stop the process or the
    group
  • Breakpoint and command scope can be simply
    controlled

78
Switching Processes
  • You can switch the focus of the process window by
    using the P+ and P- buttons on the toolbar.
  • P+ takes you to the next process.
  • P- takes you to the previous process.
  • You can also navigate directly to processes by
    diving on process entries in the root window.
  • The next slide describes using the root window in
    more detail.

79
Process Control with MPI
  • Process control commands have a 'scope'. For MPI
    debugging the following scopes are interesting:
  • Control Group Scope: the entire MPI job,
    including starter (if there is a separate
    starter)
  • Share Group Scope: all the rank processes (if you
    are focused on a rank process)
  • Process Scope: the process that the Process
    Window is focused on.
  • Arbitrary sets of processes can be controlled by
    defining a process group with the command line
    interface.
  • See later slides and Chapter 11 of the user's
    manual
  • Nothing MPI specific about process control

80
Looking at Variables across Processes
  • TotalView allows you to look at the value of a
    variable in all MPI processes
  • Right Click on the variable
  • Select View > View Across
  • TotalView creates an array indexed by process
  • You can filter and visualize
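
Viewing a variable across processes amounts to collecting one value per rank into an array indexed by rank, which can then be filtered. A minimal sketch, with hypothetical function names and the assumption that ranks run from 0 to N-1 without gaps:

```python
def view_across(values_by_rank):
    """Build an array indexed by rank from a {rank: value} mapping."""
    return [values_by_rank[rank] for rank in sorted(values_by_rank)]


def filter_across(values, predicate):
    """Keep (rank, value) pairs whose value satisfies the predicate."""
    return [(rank, value) for rank, value in enumerate(values)
            if predicate(value)]
```

Filtering for an unexpected value, such as a negative count, quickly points to the ranks worth diving into.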

81
View MPI Message Queues
  • Information visible whenever MPI rank processes
    are halted
  • Provides information from the MPI layer
  • Unexpected messages
  • Pending Sends
  • Pending Receives
  • Use this info to debug
  • Deadlock situations
  • Load balancing

82
Message Queue Graph
  • Hangs and Deadlocks
  • Pending Messages
  • Receives
  • Sends
  • Unexpected
  • Inspect
  • Individual entries
  • Patterns

83
Message Queue Debugging
  • Filtering
  • Tags
  • MPI Communicators
  • Cycle detection
  • Find deadlocks
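
Cycle detection on the message queue graph can be illustrated with a simplified wait-for model in which each blocked rank waits on exactly one other rank. That single-peer restriction and the function name are assumptions of this sketch; real message queues carry sends, receives, tags and communicators.

```python
def find_cycle(waits_for):
    """Return the ranks forming a wait-for cycle, or None.

    waits_for: {rank: rank it is blocked on}, e.g. a pending receive.
    """
    for start in waits_for:
        seen = []
        rank = start
        # Follow the chain of blocked ranks until it ends or repeats.
        while rank in waits_for and rank not in seen:
            seen.append(rank)
            rank = waits_for[rank]
        if rank in seen:
            # The chain came back to a rank already visited: a cycle.
            return seen[seen.index(rank):]
    return None
```

A returned cycle such as ranks 0, 1 and 2 all waiting on each other is the signature of a deadlock; a chain that ends at a running rank is merely a wait.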

84
Large Jobs: Subset Attach
TotalView does not need to be attached to the
entire job
  • You can be attached to different subsets at
    different times through the run
  • You can attach to a subset, run till you see
    trouble and then 'fan out' to look at more
    processes if necessary.
  • This greatly reduces overhead

85
Large Jobs: Strategies
  • Reduce N
  • Problem: Each process added requires overhead
  • Strategy: Reduce the number of processes
    TotalView is attached to
  • Simply reducing N is best; however, data or
    algorithm may require large N
  • Technique: subset attach mechanism
  • Focus Effort
  • Problem: Some debugger operations are much more
    intensive than others; when multiplied by N this
    is a big deal
  • Strategy: Reduce the interaction between the
    debugger and the processes
  • Technique: Use TotalView's process control
    features to
  • Avoid single stepping
  • Focus on one or a small set of processes

86
Large Jobs: Focus on One Process
  • If you want to single step through a section of
    code
  • Perhaps do it on one process only
  • Set a process-width breakpoint at the beginning
  • Once everything you want/expect is lined up,
    hold one process
  • Do a group-width go to get all the other
    processes running
  • Unhold that one process and change the width of
    the commands in the tool bar to process
  • Next and step that one process
  • The other processes are running so they will
    participate in communication with your process of
    interest

87
Thanks!
  • QUESTIONS?
