Developing high performance applications with .NET Compact Framework

1
Developing high performance applications with
.NET Compact Framework
  • Deepak Gulati
  • ISV Developer Evangelist
  • Microsoft

2
[Diagram: platform stack by layer]
  • Hardware/Drivers: OEM/IHV Supplied BSP (ARM, SH4, MIPS); OEM Hardware and Standard Drivers; Standard PC Hardware and Drivers
  • Device Building Tools: Platform Builder; Windows Embedded Studio; Windows XP DDK
  • Data (lightweight to relational): EDB; SQL Server 2005 Mobile Edition; SQL Server 2005 Express Edition; SQL Server 2005
  • Programming Model: Native (Win32; MFC 8.0, ATL 8.0); Managed (.NET Compact Framework; .NET Framework); Server Side (ASP.NET Mobile Controls; ASP.NET)
  • Multimedia: Windows Media; DirectX
  • Location Services: MapPoint
  • Development Tools: Visual Studio 2005
  • Communications and Messaging: Internet Security and Acceleration Server; Exchange Server; Live Communications Server; Speech Server
  • Management Tools: Device Update Agent; Software Update Services; Image Update; Systems Management Server; Microsoft Operations Manager

3
Measuring Performance: Overview
  • Basic technique involves
  • Find start time
  • Find end time
  • Calculate delta

4
Measuring Performance: Overview
  • Start and end times can be measured in various
    ways
  • GetTickCount, a Win32 API function
  • Environment.TickCount is its managed code
    equivalent
  • Both return the time in ms that has passed
    since the device was booted
  • Can also use System.DateTime and get a
    System.TimeSpan by subtracting the start value
    from the end value

5
Measuring Performance: Overview
  • There can be issues with these techniques
  • For a device that has been on for a long time,
    TickCount wraps around and goes negative
  • Not great for measuring short operations; there
    can be a variation of up to 500 ms
  • System.DateTime also suffers from accuracy issues

6
Measuring Performance: Overview
  • QueryPerformanceCounter/QueryPerformanceFrequency
    to the rescue!
  • High-resolution timer with an OEM-specific
    implementation
  • Defaults to GetTickCount if not available

7
Measuring Performance: Overview
  • No managed implementation available for
    QueryPerformanceCounter or QueryPerformanceFrequency
  • PInvoke QueryPerformanceFrequency to get the
    counter frequency in ticks/sec. Divide by 1000
    to get ticks/ms
  • PInvoke QueryPerformanceCounter before your call.
    Make your call. PInvoke QueryPerformanceCounter
    again
  • (End - Start) / (ticks/ms) gives the time for
    your call in ms
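
The steps above can be sketched as a small helper. This is a minimal sketch, not the demo code from the talk: the class name HiResTimer, the Operation delegate and the ElapsedMs helper are illustrative names, and the import assumes Windows CE, where both functions are exported from coredll.dll (on desktop Windows they live in kernel32.dll).

```csharp
using System;
using System.Runtime.InteropServices;

public static class HiResTimer
{
    // Assumption: running on Windows CE, where these are exported from coredll.dll.
    [DllImport("coredll.dll")]
    private static extern int QueryPerformanceCounter(out long count);

    [DllImport("coredll.dll")]
    private static extern int QueryPerformanceFrequency(out long frequency);

    // Hypothetical delegate type for the code being timed.
    public delegate void Operation();

    // Pure arithmetic: (end - start) / (ticks per millisecond).
    public static double ElapsedMs(long start, long end, long ticksPerSecond)
    {
        double ticksPerMs = ticksPerSecond / 1000.0;
        return (end - start) / ticksPerMs;
    }

    // Times a single call of op and returns the elapsed time in milliseconds.
    public static double Measure(Operation op)
    {
        long frequency, start, end;
        if (QueryPerformanceFrequency(out frequency) == 0 || frequency == 0)
            throw new NotSupportedException("No high-resolution counter available.");

        QueryPerformanceCounter(out start);
        op();
        QueryPerformanceCounter(out end);
        return ElapsedMs(start, end, frequency);
    }
}
```

A call such as HiResTimer.Measure(delegate { DoWork(); }) would time one invocation of a hypothetical DoWork method. Keeping the tick arithmetic in its own method makes it easy to check without a device.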

8
Demo
  • Using QueryPerformanceCounter

9
Measuring Performance: Overview
  • Micro-benchmarks versus scenarios
  • Benchmarking tips
  • Start from a known state
  • Ensure nothing else is running
  • Measure multiple times, take the average
  • Run each test in its own AppDomain / Process
  • Log results at the end
  • Understand JIT-time versus runtime cost
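
Two of these tips (measure multiple times, and keep JIT time apart from runtime cost) might look like the following. It is only a sketch: MicroBenchmark, Operation and AverageMs are hypothetical names, and Environment.TickCount is used for brevity even though its resolution is coarse.

```csharp
using System;

public static class MicroBenchmark
{
    // Hypothetical delegate type for the code under test.
    public delegate void Operation();

    // Calls op once as a warm-up so that JIT compilation is excluded,
    // then times 'runs' further calls and returns the mean in milliseconds.
    public static double AverageMs(Operation op, int runs)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");

        op(); // warm-up: the first call pays the JIT cost

        long total = 0;
        for (int i = 0; i < runs; i++)
        {
            int start = Environment.TickCount;
            op();
            // Unchecked subtraction stays correct if TickCount wraps around.
            total += unchecked(Environment.TickCount - start);
        }
        return (double)total / runs;
    }
}
```

Results are returned rather than printed inside the loop, in line with the tip to log at the end.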

10
.NET Compact Framework Performance, v1-v2
[Charts: v1 versus v2 comparison; panels labeled "Bigger is better" and "Smaller is better"]

11
Measuring Performance: Performance Counters
  • There will be times when an application runs slow
    and the code looks fine
  • .NET Compact Framework can be made to report
    performance statistics
  • <application name>.stat (formerly mscoree.stat)
  • http://msdn.microsoft.com/library/en-us/dnnetcomp/html/netcfperf.asp
  • Registry
  • HKLM\SOFTWARE\Microsoft\.NETCompactFramework\PerfMonitor\Counters (DWORD) = 1
  • What does .stat tell you?
  • Working set and performance statistics
  • More counters added in v2
  • Generics usage
  • COM interop usage
  • Number of boxed value types
  • Threading and timers
  • GUI objects
  • Network activity (socket bytes sent/received)

12
Demo
  • Enabling .NET Compact Framework Performance
    Statistics

13
.stat
counter | total | last datum | n | mean | min | max
Total Program Run Time (ms) | 55937 | - | - | - | - | -
App Domains Created | 18 | - | - | - | - | -
App Domains Unloaded | 18 | - | - | - | - | -
Assemblies Loaded | 323 | - | - | - | - | -
Classes Loaded | 18852 | - | - | - | - | -
Methods Loaded | 37353 | - | - | - | - | -
Closed Types Loaded | 730 | - | - | - | - | -
Closed Types Loaded per Definition | 730 | 8 | 385 | 1 | 1 | 8
Open Types Loaded | 78 | - | - | - | - | -
Closed Methods Loaded | 46 | - | - | - | - | -
Closed Methods Loaded per Definition | 46 | 1 | 40 | 1 | 1 | 2
Open Methods Loaded | 0 | - | - | - | - | -
Threads in Thread Pool | - | 0 | 6 | 1 | 0 | 3
Pending Timers | - | 0 | 93 | 0 | 0 | 1
Scheduled Timers | 46 | - | - | - | - | -
Timers Delayed by Thread Pool Limit | 0 | - | - | - | - | -
Work Items Queued | 46 | - | - | - | - | -
Uncontested Monitor.Enter Calls | 57240 | - | - | - | - | -
Contested Monitor.Enter Calls | 0 | - | - | - | - | -
Peak Bytes Allocated (native + managed) | 4024363 | - | - | - | - | -
Managed Objects Allocated | 1015100 | - | - | - | - | -
Managed Bytes Allocated | 37291444 | 28 | 1015100 | 36 | 8 | 55588
Managed String Objects Allocated | 112108 | - | - | - | - | -
Bytes of String Objects Allocated | 4596658 | - | - | - | - | -
Garbage Collections (GC) | 33 | - | - | - | - | -
Bytes Collected By GC | 25573036 | 41592 | 33 | 774940 | 41592 | 1096328
Managed Bytes In Use After GC | - | 23528 | 33 | 259414 | 23176 | 924612
Total Bytes In Use After GC | - | 3091342 | 33 | 2954574 | 1833928 | 3988607
GC Compactions | 17 | - | - | - | - | -
Code Pitchings | 6 | - | - | - | - | -
Calls to GC.Collect | 0 | - | - | - | - | -
GC Latency Time (ms) | 279 | 16 | 33 | 8 | 0 | 31
Pinned Objects | 156 | - | - | - | - | -
Objects Moved by Compactor | 73760 | - | - | - | - | -
Objects Not Moved by Compactor | 11811 | - | - | - | - | -
Objects Finalized | 6383 | - | - | - | - | -
Boxed Value Types | 350829 | - | - | - | - | -
Process Heap | - | 1626 | 430814 | 511970 | 952 | 962130
Short Term Heap | - | 0 | 178228 | 718 | 0 | 21532
JIT Heap | - | 0 | 88135 | 357796 | 0 | 651663
App Domain Heap | - | 0 | 741720 | 647240 | 0 | 833370
GC Heap | - | 0 | 376 | 855105 | 0 | 2097152
Native Bytes Jitted | 7202214 | 152 | 26910 | 267 | 80 | 5448
Methods Jitted | 26910 | - | - | - | - | -
Bytes Pitched | 1673873 | 0 | 7047 | 237 | 0 | 5448

Counters highlighted on the slide:
  • Peak Bytes Allocated (native + managed)
  • JIT Heap
  • App Domain Heap
  • GC Heap
  • Garbage Collections (GC)
  • GC Latency Time (ms)
  • Boxed Value Types
  • Managed String Objects Allocated

14
.NET Compact Framework: How are we different?
  • Portable JIT compiler
  • Fast code generation, less optimized
  • May pitch JIT-compiled code under memory pressure
  • No NGen, install-time or persisted code
  • Interpreted virtual calls (no v-tables)
  • Simple mark-and-sweep GC, non-generational

15
Common Language Runtime: Execution Engine
  • Call path
  • Managed calls are more expensive than native
  • Instance call: 2-3X the cost of a native
    function call
  • Virtual call: 1.4X the cost of a managed
    instance call
  • Platform invoke: 5X the cost of a managed
    instance call (marshaling an int parameter)
  • Properties are calls
  • JIT compilers
  • All platforms have the same optimizing JIT
    compiler architecture in v2
  • Optimizations
  • Method inlining for simple methods
  • Variable enregistration

16
Common Language Runtime: Call path (sample)
Virtual property:

    public class Shape
    {
        protected int m_volume;
        public virtual int Volume { get { return m_volume; } }
    }
    public class CubeShape : Shape
    {
        public CubeShape(int vol) { m_volume = vol; }
    }

Non-virtual property:

    public class Shape
    {
        protected int m_volume;
        public int Volume { get { return m_volume; } }
    }
    public class CubeShape : Shape
    {
        public CubeShape(int vol) { m_volume = vol; }
    }

17
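Common Language Runtime: Call path (sample)

    public class MyCollection
    {
        private const int m_capacity = 10000;
        private Shape[] storage = new Shape[m_capacity];

        public void Sort()
        {
            Shape tmp;
            // Assumption: the loop bounds and comparison operator are not
            // preserved here; a standard ascending bubble sort is used.
            for (int i = 0; i < m_capacity - 1; i++)
                for (int j = 0; j < m_capacity - 1 - i; j++)
                    if (storage[j + 1].Volume < storage[j].Volume)
                    {
                        tmp = storage[j];
                        storage[j] = storage[j + 1];
                        storage[j + 1] = tmp;
                    }
        }
    }

Each Volume access compiles to:

    callvirt instance int32 Shape::get_Volume()
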
18
Common Language Runtime: Call path (sample)
Sort() timings for the Shape/CubeShape sample shown earlier:
  • Virtual Volume property: 57 sec
  • Non-virtual Volume property: 39 sec

19
Common Language Runtime: Garbage Collector
  • What triggers a GC?
  • Memory allocation failure
  • 1M of GC objects allocated (v2)
  • Application going to background
  • GC.Collect() (Avoid helping the GC!)
  • What happens at GC time?
  • Freezes all threads at a safe point
  • Finds all live objects and marks them
  • An object is live if it is reachable from a root
    location
  • Unmarked objects are freed; those with finalizers
    are added to the finalizer queue
  • Finalizers are run on a separate thread
  • GC pools are compacted if required (less than
    750K of free space)
  • Free memory is returned to the operating system
  • In general, if you don't allocate objects, GC
    won't occur
  • Beware of side effects of calls that may allocate
    objects
  • http://blogs.msdn.com/stevenpr/archive/2004/07/26/197254.aspx

20
Common Language Runtime: Garbage Collector
[Chart: GC latency per collection]
21
Common Language Runtime: Garbage Collector
[Chart: allocation rate]
22
Common Language Runtime: Garbage Collector
[Chart: allocation throughput]

23
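Common Language Runtime: Where does garbage come from?
  • Unnecessary string copies
  • Strings are immutable
  • String manipulations (Concat(), etc.) cause
    copies
  • Use StringBuilder

String concatenation:

    String result = "";
    for (int i = 0; i < count; i++)   // count: loop bound not preserved
    {
        result += ".NET Compact Framework";
        result += " Rocks!";
    }

StringBuilder:

    StringBuilder result = new StringBuilder();
    for (int i = 0; i < count; i++)   // count: loop bound not preserved
    {
        result.Append(".NET Compact Framework");
        result.Append(" Rocks!");
    }
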
24
.stat
Run time 173 sec
counter | total | last datum | n | mean | min | max
Total Program Run Time (ms) | 11843 | - | - | - | - | -
App Domains Created | 1 | - | - | - | - | -
App Domains Unloaded | 1 | - | - | - | - | -
Assemblies Loaded | 2 | - | - | - | - | -
Classes Loaded | 175 | - | - | - | - | -
Methods Loaded | 198 | - | - | - | - | -
Closed Types Loaded | 0 | - | - | - | - | -
Closed Types Loaded per Definition | 0 | 0 | 0 | 0 | 0 | 0
Open Types Loaded | 0 | - | - | - | - | -
Closed Methods Loaded | 0 | - | - | - | - | -
Closed Methods Loaded per Definition | 0 | 0 | 0 | 0 | 0 | 0
Open Methods Loaded | 0 | - | - | - | - | -
Threads in Thread Pool | - | 0 | 2 | 0 | 0 | 1
Pending Timers | - | 0 | 2 | 0 | 0 | 1
Scheduled Timers | 1 | - | - | - | - | -
Timers Delayed by Thread Pool Limit | 0 | - | - | - | - | -
Work Items Queued | 1 | - | - | - | - | -
Uncontested Monitor.Enter Calls | 2 | - | - | - | - | -
Contested Monitor.Enter Calls | 0 | - | - | - | - | -
Peak Bytes Allocated (native + managed) | 3326004 | - | - | - | - | -
Managed Objects Allocated | 60266 | - | - | - | - | -
Managed Bytes Allocated | 5801679432 | 28 | 60266 | 96267 | 8 | 580020
Managed String Objects Allocated | 20041 | - | - | - | - | -
Bytes of String Objects Allocated | 5800480578 | - | - | - | - | -
Garbage Collections (GC) | 4912 | - | - | - | - | -
Bytes Collected By GC | 5918699036 | 1160076 | 4912 | 1204946 | 597824 | 1572512
Managed Bytes In Use After GC | - | 580752 | 4912 | 381831 | 8364 | 580752
Total Bytes In Use After GC | - | 1810560 | 4912 | 1611885 | 1097856 | 1810560
GC Compactions | 0 | - | - | - | - | -
Code Pitchings | 0 | - | - | - | - | -
Calls to GC.Collect | 0 | - | - | - | - | -
GC Latency Time (ms) | 686 | 0 | 4912 | 0 | 0 | 16
Pinned Objects | 0 | - | - | - | - | -
Objects Moved by Compactor | 0 | - | - | - | - | -
Objects Not Moved by Compactor | 0 | - | - | - | - | -
Objects Finalized | 1 | - | - | - | - | -
Boxed Value Types | 3 | - | - | - | - | -
Process Heap | - | 278 | 235 | 2352 | 68 | 8733
Short Term Heap | - | 0 | 278 | 986 | 0 | 10424
JIT Heap | - | 0 | 360 | 12103 | 0 | 24444
App Domain Heap | - | 0 | 1341 | 46799 | 0 | 64562
GC Heap | - | 0 | 35524 | 2095727 | 0 | 3276800
Native Bytes Jitted | 22427 | 140 | 98 | 228 | 68 | 1367
Methods Jitted | 98 | - | - | - | - | -
Bytes Pitched | 0 | 0 | 0 | 0 | 0 | 0
Methods Pitched | 0 | - | - | - | - | -
Method Pitch Latency Time (ms) | 0 | 0 | 0 | 0 | 0 | 0
Exceptions Thrown | 0 | - | - | - | - | -
Platform Invoke Calls | 0 | - | - | - | - | -

    String result = "";
    for (int i = 0; i < count; i++)   // count: loop bound not preserved
    {
        result += ".NET Compact Framework";
        result += " Rocks!";
    }

  • Managed String Objects Allocated: 20040
  • Garbage Collections (GC): 4912
  • Bytes of String Objects Allocated: 5,800,480,574
  • Bytes Collected By GC: 5,918,699,036
  • GC latency: 107128 ms

25
.stat
Run time 0.1 sec

    StringBuilder result = new StringBuilder();
    for (int i = 0; i < count; i++)   // count: loop bound not preserved
    {
        result.Append(".NET Compact Framework");
        result.Append(" Rocks!");
    }

  • Managed String Objects Allocated: 56
  • Bytes of String Objects Allocated: 2097718
  • Garbage Collections (GC): 2
  • Bytes Collected By GC: 1081620
  • GC latency: 21 ms

26
Last notes on StringBuilder
  • Remember it's all about reducing memory traffic
  • If you roughly know the expected length of your
    final string, allocate that much beforehand
    (StringBuilder constructor)
  • Getting the string out of a StringBuilder doesn't
    cause a new alloc; the existing buffer is
    converted into a string

http://weblogs.asp.net/ricom/archive/2003/12/02/40778.aspx
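
As an illustration of the pre-sizing tip, the sketch below passes the expected final length to the StringBuilder constructor so the buffer does not have to grow inside the loop. The class name StringBuilding and the method Build are made up for this example; the two string literals are the ones used in the earlier sample.

```csharp
using System;
using System.Text;

public static class StringBuilding
{
    public static string Build(int count)
    {
        const string a = ".NET Compact Framework";
        const string b = " Rocks!";

        // Reserve the final length up front: count * (22 + 7) characters.
        StringBuilder result = new StringBuilder(count * (a.Length + b.Length));
        for (int i = 0; i < count; i++)
        {
            result.Append(a);
            result.Append(b);
        }
        return result.ToString();
    }
}
```

Whether pre-sizing helps measurably depends on how good the length estimate is, so it is worth checking against the .stat counters.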
27
Common Language Runtime: Where does garbage come from?
  • Unnecessary boxing
  • Value types are allocated on the stack
    (fast to allocate)
  • Boxing causes a heap allocation and a copy
  • Use strongly typed arrays and collections
    (the non-generic framework collections are NOT
    strongly typed)

    class Hashtable
    {
        struct bucket
        {
            Object key;
            Object val;
        }
        bucket[] buckets;
        public Object this[Object key] { get { ... } set { ... } }
    }
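
To make the boxing cost concrete, here is a small sketch that stores the same ints in an ArrayList (object-based, so each int is boxed) and in a generic List<int> (available from v2, no boxing). BoxingDemo and its method names are illustrative only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // ArrayList holds object references: every int is boxed on Add
    // and unboxed with a cast on read.
    public static int SumBoxed(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++) list.Add(i);   // boxes i
        int sum = 0;
        for (int i = 0; i < list.Count; i++) sum += (int)list[i];   // unboxes
        return sum;
    }

    // List<int> stores the ints directly: no boxing and no casts.
    public static int SumUnboxed(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++) list.Add(i);
        int sum = 0;
        for (int i = 0; i < list.Count; i++) sum += list[i];
        return sum;
    }
}
```

Both methods return the same sum; the difference should show up in the "Boxed Value Types" counter of the .stat file rather than in the result.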

28
Demo
  • String vs. StringBuilder

29
Common Language Runtime: Generics
  • Fully specialized implementation in .NET Compact
    Framework v2
  • Pros
  • Strongly typed
  • No unnecessary boxing and type casts
  • Specialized code is more efficient than shared
  • Cons
  • Internal execution engine data structures and
    JIT-compiled code aren't shared
  • Each instantiation (List<T> for each distinct T)
    gets its own copy
  • http://blogs.msdn.com/romanbat/archive/2005/01/06/348114.aspx

30
Common Language Runtime: Finalization and Dispose
  • Cost of finalizers
  • Non-deterministic cleanup
  • Extends lifetime of object
  • In general, rely on GC for automatic memory
    cleanup
  • The exceptions to the rule
  • If your object contains an unmanaged resource
    that the GC is unaware of, you need to implement
    a finalizer
  • Also implement Dispose pattern to release
    unmanaged resource in deterministic manner
  • Dispose method should suppress finalization
  • If the object you are using implements Dispose,
    call it when you are done with the object
  • Assumes an unmanaged resource in the object chain

31
Common Language Runtime: Sample Code
Finalization and Dispose

    class SerialPort : IDisposable
    {
        IntPtr SerialPortHandle;

        public SerialPort(String name)
        {
            // Platform invoke to native code to open serial port
            SerialPortHandle = SerialOpen(name);
        }

        ~SerialPort()
        {
            // Platform invoke to native code to close serial port
            SerialClose(SerialPortHandle);
        }

        public void Dispose()
        {
            // Platform invoke to native code to close serial port
            SerialClose(SerialPortHandle);
            GC.SuppressFinalize(this);
        }
    }

32
Common Language Runtime: Sample Code
Finalization and Dispose

    class SerialTrace : IDisposable
    {
        SerialPort serialPort;

        public SerialTrace()
        {
            // "COM1:" is an illustrative port name; the slide omits the argument
            serialPort = new SerialPort("COM1:");
        }

        public void Dispose()
        {
            serialPort.Dispose();
        }
    }
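
On the calling side, "call Dispose when you are done" is usually written with a using block. The sketch below uses a hypothetical TraceResource class as a stand-in for an object that owns a native handle; a static counter takes the place of the real resource so the effect can be observed.

```csharp
using System;

// Hypothetical stand-in for a class that owns an unmanaged resource.
public sealed class TraceResource : IDisposable
{
    public static int OpenCount = 0;   // simulates the number of open native handles
    private bool disposed = false;

    public TraceResource() { OpenCount++; }

    public void Dispose()
    {
        if (disposed) return;          // repeated Dispose calls are harmless
        disposed = true;
        OpenCount--;                   // stands in for releasing the native handle
        GC.SuppressFinalize(this);     // the finalizer no longer needs to run
    }

    ~TraceResource()
    {
        // Last-resort path if Dispose was never called; a real class
        // would release its native handle here.
    }
}

public static class DisposeDemo
{
    // Returns the number of simulated handles still open after the using block.
    public static int Run()
    {
        using (TraceResource resource = new TraceResource())
        {
            // Work with the resource; Dispose runs when the block exits,
            // even if an exception is thrown.
        }
        return TraceResource.OpenCount;
    }
}
```

The using block gives the deterministic cleanup described above, while the finalizer remains only as a safety net.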

33
Common Language Runtime: Exceptions
  • Exceptions are cheap... until you throw
  • Throw exceptions in exceptional circumstances
  • Do not use exceptions for normal flow control
  • Use performance counters to track the number of
    exceptions thrown
  • Replace On Error/Goto with Try/Catch/Finally
    in Microsoft Visual Basic .NET
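
One common way to keep exceptions out of normal flow control is to test before acting. The sketch below contrasts a lookup that relies on catching KeyNotFoundException with one that uses Dictionary.TryGetValue; LookupDemo, its method names and the -1 fallback value are illustrative choices.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    // Exception-driven lookup: every miss costs a throw and a catch.
    public static int GetOrDefaultWithCatch(Dictionary<string, int> map, string key)
    {
        try { return map[key]; }
        catch (KeyNotFoundException) { return -1; }
    }

    // Test-first lookup: a miss is just an ordinary branch.
    public static int GetOrDefault(Dictionary<string, int> map, string key)
    {
        int value;
        return map.TryGetValue(key, out value) ? value : -1;
    }
}
```

Both versions return the same results; the "Exceptions Thrown" counter in the .stat file is one way to see which style an application is actually using.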

34
Common Language Runtime: Reflection
  • Reflection can be expensive
  • Reflection performance cost
  • Type comparisons (for example typeof() )
  • Member enumerations (for example
    Type.GetFields())
  • Member access (for example Type.InvokeMember())
  • Think 10-100x slower
  • Working set cost
  • Runtime data structures
  • Think 100 bytes per loaded type, 80 bytes per
    loaded method
  • Be aware of APIs that use reflection as a side
    effect
  • Override
  • Object.ToString()
  • GetHashCode() and Equals() (for value types)
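
For value types, the overrides listed above could look like the sketch below, which compares and hashes fields directly instead of falling back to the inherited implementations. Point3 is a made-up struct, and the 397 multiplier is just a conventional choice for mixing hash bits.

```csharp
using System;

public struct Point3
{
    public readonly int X, Y, Z;

    public Point3(int x, int y, int z) { X = x; Y = y; Z = z; }

    // Field-by-field comparison written by hand.
    public override bool Equals(object obj)
    {
        if (!(obj is Point3)) return false;
        Point3 other = (Point3)obj;
        return X == other.X && Y == other.Y && Z == other.Z;
    }

    // Combines the three fields; overflow is intentional and harmless here.
    public override int GetHashCode()
    {
        unchecked { return ((X * 397) ^ Y) * 397 ^ Z; }
    }

    // Calls ToString on each int explicitly to avoid boxing in the concatenation.
    public override string ToString()
    {
        return "(" + X.ToString() + ", " + Y.ToString() + ", " + Z.ToString() + ")";
    }
}
```

Equal values must produce equal hash codes, which this pairing of Equals and GetHashCode preserves.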

35
Common Language Runtime: Building a Cost Model for
Managed Math
  • Math performance
  • 32-bit integers: similar to native math
  • 64-bit integers: 5-10X the cost of native math
  • Floating point: similar to native math
  • ARM processors do not have an FPU

36
.NET Compact Framework
[Architecture diagram; labels grouped by layer as far as the grouping is recoverable]
  • Tools and deployment: Visual Studio; Debug Engine; ICorDbg; MSI Setup (ActiveSync); Per Device CAB Install (SMS, etc)
  • FX (Redist) areas: Globalization, GUI, Net, I/O, Crypto
  • FX libraries: mscorlib, System, System.Data, System.Xml, System.Drawing, Windows.Forms, System.Globalization, System.Cryptography, System.IO.File, System.IO.Ports, System.Net.Http, System.Net.Sockets, System.WebServices, System.Reflection, Microsoft.Win32.Registry, Microsoft.VisualBasic, DirectX.DirectD3DM
  • CLR: JIT Compiler, GC, Debugger, ClassLoader, AssemblyCache, App Domain Loader, NativeInterop, Managed Loader, CalendarData, CultureData
  • Host: Process Loader, Memory and Threading, File I/O, File Mapping, Registry, Sockets, SSL, NTLM, Crypto API, Cert/Security Verification, Common Controls, GDI/GWES, D3DM, Sorting, Encodings, Casing
  • Windows CE

37
Base Class Library: Collections
  • Pre-size collection classes appropriately
  • Resizing creates unnecessary copies
  • Beware of foreach overhead; use the indexer when
    available
  • ArrayList al = new ArrayList(string_array);
  • foreach (MyType mt in al) { /* do something */ }
  • will be compiled into
  • callvirt instance class IEnumerator::GetEnumerator()
  • callvirt instance object IEnumerator::get_Current()
  • callvirt instance bool IEnumerator::MoveNext()

38
Windows Forms: Best Practices
  • Load and cache Forms in the background
  • Populate data separate from Form.Show()
  • Pre-populate data, or
  • Load data async to Form.Show()
  • Use BeginUpdate/EndUpdate when it is available
  • e.g. ListView, TreeView
  • Use SuspendLayout/ResumeLayout when repositioning
    controls
  • Keep event handling code tight
  • Process bigger operations asynchronously
  • Blocking in event handlers will affect UI
    responsiveness
  • Form load performance
  • Reduce the number of method calls during
    initialization

39
Graphics And Games: Best Practices
  • Compose to off-screen buffers to minimize direct
    to screen blitting
  • Approximately 50% faster
  • Avoid transparent blitting in areas that require
    performance
  • Approximately 1/3 the speed of normal blitting
  • Consider using pre-rendered images versus using
    System.Drawing rendering primitives
  • Need to measure on a case-by-case basis

40
XML: Best Practices for Managing Large XML Data
Files
  • Use XmlTextReader/XmlTextWriter
  • Smaller memory footprint than using XmlDocument
  • XmlTextReader is a pull-model parser which only
    reads a window of the data
  • XmlDocument builds a generic, untyped object
    model using a tree
  • Type stored as string
  • OK to use with smaller documents (64K XML:
    0.25s)
  • Optimize the structure of the XML document
  • Use elements to group
  • Allows use of Skip() in XmlReader
  • Use attributes to reduce size; processing
    attribute-centric documents is faster
  • Keep it short! (attribute and element names)
  • Avoid gratuitous use of white space

41
XML: Creating optimized Reader/Writer
  • In v2, use the XmlReader/XmlWriter factory methods
    to create an optimized reader or writer
  • Applying proper XmlReaderSettings can improve
    performance
  • XmlReaderSettings settings = new
    XmlReaderSettings();
  • settings.IgnoreWhitespace = true;
  • XmlReader reader = XmlReader.Create("my.xml",
    settings);
  • Up to 30% performance increase when
    IgnoreWhitespace = true is specified (depends on
    document format)
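
Put together, a forward-only read with these settings might look like the sketch below. XmlCount and CountElements are hypothetical names, and the reader is created over a TextReader rather than a file path so the example can run without a file on disk.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlCount
{
    // Counts elements with the given name using a forward-only pull reader.
    public static int CountElements(TextReader input, string name)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;   // skip whitespace-only nodes

        int count = 0;
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == name)
                    count++;
            }
        }
        return count;
    }
}
```

Because the reader only ever holds the current node, memory use stays roughly constant as the document grows, which is the contrast with XmlDocument drawn earlier.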

42
Demo
  • XmlDocument vs. XmlTextReader

43
XML: Reading local data with DataSet
  • DataSet is a database-independent container of
    relational data
  • Allows you to work with XML
  • ReadXml allows you to load XML data into a DataSet
  • Simple to use, but performs badly, especially
    with large XML files
  • If you must use DataSet.ReadXml, make sure that
    you first supply the schema
  • Use XmlReader wherever possible for traversing
    through your data

44
Demo
  • DataSet and .NET Compact Framework

45
Non-XML local data: Reading files locally
  • You may need to read a text file stored
    locally on the device
  • StreamReader and FileStream classes are typically
    employed
  • For large file sizes (100 K), FileStream
    outperforms StreamReader
  • StreamReader specifically looks for line breaks;
    FileStream does not

46
Web Services: Where is the bottleneck?
  • Are you network bound or CPU bound?
  • Use the perf counters for socket bytes sent /
    received. Do you come close to the network
    capacity?
  • If you are network bound, work on reducing the
    size of the message
  • Create a canned message and send it over HTTP.
    Compare performance with the web service
  • If you are CPU bound, optimize the serialization
    scheme for speed
  • http://blogs.msdn.com/mikezintel/archive/2005/03/30/403941.aspx

47
Moving Forward
  • More tools
  • Live Remote Performance Counters (new in v2)
  • Under construction
  • Allocation profiler (CLR profiler)
  • Call profiler
  • Working set improvements
  • More speed

48
Summary
  • Make performance a requirement and measure
  • Understand the APIs
  • Isolate exactly what is being measured
  • Repeat tests several times and ignore the first
    run, which is affected by JITting
  • Track the results for later comparison
    and review
  • Ensure apples-to-apples comparisons
  • Use real code when possible
  • Test multiple designs and strategies; understand
    the differences or variation
  • Avoid unnecessary object allocation and copies
    due to
  • String manipulations
  • Boxing
  • Collections that are not pre-sized
  • Performance FAQ
  • http://blogs.msdn.com/netcfteam/archive/2005/05/04/414820.aspx