How Much Pain for XML's Gains?


1
How Much Pain for XML's Gains?
  • Michael Champion
  • Sr. Technologist, Software AG USA
  • XML 2004

2
Outline
  • Measuring the Pain
  • Diagnosing the Causes
  • Proposed Analgesics
  • Just a Bunch of Snake Oil?
  • Conclusions

Laocoon: Beware of geeks bearing gifts
3
XML Pain vs Existing Formats
  • SOAP-HTTP vs RMI
      • Sun's research shows 10x overhead for SOAP
  • XML vs CSV
      • Nicola/John show XML parsing 26x slower
  • SOAP vs CDR
      • Kohlhoff/Steele find SOAP 2-4x bigger, 8-10x slower
  • SOAP vs FIX
      • Kohlhoff/Steele find SOAP 3-4x bigger, 9x slower
  • WS-Security vs SSL
      • Don Box asserts 10x slower
  • Generally people find XML imposes an overhead of an order of magnitude
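The size side of the XML vs CSV comparison can be illustrated with a minimal sketch. The field names and rows below are hypothetical sample data, not taken from the cited studies, and the script measures only encoded size, not the parsing slowdown that Nicola/John report.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; field names and values are illustrative only.
FIELDS = ["symbol", "price", "quantity"]
ROWS = [("ABC", "101.25", "300"), ("XYZ", "9.50", "1200"), ("QRS", "47.00", "50")]


def to_csv(rows):
    """Serialize the rows as CSV with a single header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def to_xml(rows):
    """Serialize the same rows as element-only XML, one element per field."""
    root = ET.Element("trades")
    for row in rows:
        trade = ET.SubElement(root, "trade")
        for name, value in zip(FIELDS, row):
            ET.SubElement(trade, name).text = value
    return ET.tostring(root, encoding="utf-8")


csv_bytes = to_csv(ROWS)
xml_bytes = to_xml(ROWS)
size_ratio = len(xml_bytes) / len(csv_bytes)
print(f"CSV: {len(csv_bytes)} bytes, XML: {len(xml_bytes)} bytes, ratio: {size_ratio:.1f}x")
```

The ratio depends heavily on how long the labels are relative to the values, so a toy result like this should not be read as confirming any of the figures above.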

4
Diagnosing the Pain
  • Element, attribute labels and namespace declarations create the bloat
  • Performance bottlenecks include
      • Well-formedness checking
      • Unicode character conversion
      • Char-by-char string processing
      • Node object construction
      • Entity reference expansion
      • Not to mention schema validation!
  • Issue is not text vs binary
      • But XML's particular constraints
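To make the "labels and namespace declarations create the bloat" point concrete, here is a rough sketch that estimates how much of a message is markup rather than character data. The message is a made-up SOAP-style example: the envelope namespace is the SOAP 1.1 one, while the `ord` namespace and element names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical SOAP-style message; the "ord" namespace and its elements are made up.
MESSAGE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" '
    'xmlns:ord="http://example.com/orders">'
    "<soap:Body><ord:Order><ord:Symbol>ABC</ord:Symbol>"
    "<ord:Quantity>300</ord:Quantity></ord:Order></soap:Body>"
    "</soap:Envelope>"
)


def payload_bytes(xml_text):
    """Count UTF-8 bytes of character data (element text and tails) only.

    Attribute values are treated as markup here, which is a simplification.
    """
    root = ET.fromstring(xml_text)
    total = 0
    for element in root.iter():
        for chunk in (element.text, element.tail):
            if chunk:
                total += len(chunk.encode("utf-8"))
    return total


total = len(MESSAGE.encode("utf-8"))
payload = payload_bytes(MESSAGE)
markup_share = 1 - payload / total
print(f"{payload} payload bytes out of {total}; markup share {markup_share:.0%}")
```

Parsing the message at all also exercises several of the bottlenecks listed above (well-formedness checking, character decoding, node construction), though this sketch does not time them.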

5
Where Does It Hurt?
  • Wireless industry - XML bandwidth requirements excessive
      • Maps
      • Images
  • Enterprise Transaction Processing
      • SOAP-based messaging
      • Multiple parse-serialize steps
      • XML-aware routers, firewalls, ...
  • See Binary XML WG Use Case doc!

6
Relieving the Pain
  • Moore's Law?
      • Doesn't apply to batteries!
      • Wireless bandwidth constrained by fundamental physical laws
      • In military scenarios, least power/bandwidth at the pointy end
  • GZIP?
      • Not for small documents
      • Considerable processing overhead
      • Only improves user latency in low-bandwidth, big-CPU scenarios
  • Better code?
      • IBM, MS seem convinced that parsers can be much faster
      • Doesn't help with bandwidth
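The "GZIP? Not for small documents" point can be checked with a short sketch using Python's standard `gzip` module. The documents are invented for illustration, and the large one is deliberately repetitive, which flatters gzip compared with real-world XML.

```python
import gzip


def gzip_sizes(document: bytes):
    """Return (original size, gzip-compressed size) in bytes."""
    return len(document), len(gzip.compress(document))


# Hypothetical documents: one tiny message and one large, highly repetitive one.
small = b"<price>101.25</price>"
large = b"<prices>" + small * 1000 + b"</prices>"

small_raw, small_gz = gzip_sizes(small)
large_raw, large_gz = gzip_sizes(large)

print(f"small: {small_raw} -> {small_gz} bytes")
print(f"large: {large_raw} -> {large_gz} bytes")
```

The gzip header and trailer alone add 18 bytes, so the tiny message grows instead of shrinking, while the large one shrinks substantially. The sketch says nothing about the CPU cost of compressing and decompressing, which is the other objection raised above.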

Acetylsalicylic Acid (Aspirin)
7
More Proposed Analgesics
  • Hardware Acceleration?
      • Not much real-world info found
      • Ask DataPower, Sarvega, Tarari, ...
  • Format Simplification?
      • SOAP forbids DTDs
      • Obvious interoperability issues!
  • Binary Infoset Serializations?
      • Much experimentation in the wild
      • W3C investigating value of standards
      • Assuming shared schema gives best technical results
      • No shared schema has best use cases but only 3-5x speedup over XML text
  • Hybrid approaches such as VTD-XML
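Why a shared schema gives the best technical results can be sketched as follows: if both sides already agree on field order and types, the labels never need to travel. This is not any of the proposed binary XML formats; the record layout and field names are assumptions chosen for illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical shared schema, agreed out of band by sender and receiver:
# "<" little-endian without padding, 4s = 4-byte symbol, d = float64 price,
# I = unsigned 32-bit quantity.
RECORD = struct.Struct("<4sdI")


def encode_binary(symbol, price, quantity):
    """Pack one record; struct pads the symbol with NUL bytes up to 4 bytes."""
    return RECORD.pack(symbol.encode("ascii"), price, quantity)


def decode_binary(data):
    """Unpack one record and strip the NUL padding from the symbol."""
    symbol, price, quantity = RECORD.unpack(data)
    return symbol.rstrip(b"\x00").decode("ascii"), price, quantity


def encode_xml(symbol, price, quantity):
    """Encode the same record as self-describing XML text."""
    order = ET.Element("order")
    ET.SubElement(order, "symbol").text = symbol
    ET.SubElement(order, "price").text = repr(price)
    ET.SubElement(order, "quantity").text = str(quantity)
    return ET.tostring(order)


record = ("ABC", 101.25, 300)
binary = encode_binary(*record)
text = encode_xml(*record)
print(f"binary: {len(binary)} bytes, XML text: {len(text)} bytes")
```

The binary form is only readable by a receiver that holds the same schema, which is the interoperability cost behind the "best use cases" remark for schema-less encodings.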

8
Lots of Second Opinions
  • Premature optimization is the root of all evil
  • XML is about interop, stupid!!!
  • Fix your XML, don't expect standards to accommodate your bad practices
  • If XML doesn't fit your needs, avoid it, don't pollute it for the rest of us
9
Binary XML Snake Oil?
  • Binary XML is an oxymoron
  • There is ALREADY an unmanageable number of XML variants, we don't need more
  • One-size-fits-all binary format is a pipedream
  • Industry-specific binary standards are fine, W3C core standard is premature
  • Better to invest in optimizing tools for existing formats

10
Facts Not In Serious Dispute
  • 10x XML overhead vs app-specific formats
  • Wireless needs what XML offers but with less overhead
  • GZIP is not the cure for bandwidth pain
  • Moore's Law does not apply to batteries or wireless networks
  • It's a user-perceived delay problem, not a bandwidth problem
  • Overhead is probably NOT a problem for the majority of existing XML users!
11
Perfect Storm of XML Politics?
  • Binary XML vs XML text
  • Infoset vs bits on the wire
  • Subsets / Profiles vs Complete Recommendations
12
Personal Assessment
  • Convenience has always come at a performance cost
  • Convenience eventually wins
  • Right now XML text overhead inhibits adoption in niches
  • REAL pain in niches that XML family could address
  • This is a genuine dilemma for W3C and mainstream vendors: XML is NOT ubiquitous where it causes more pain than gain

Laocoon: punished for speaking unpleasant truths
13
Recommendations
  • Don't deny unpleasant facts about XML pains
      • Moore's Law won't make them all go away
  • Be smart about XML
      • "Doctor, doctor, it hurts when I do stupid thing X!"
      • Densely code information if bandwidth is an issue
      • Use the right tool for the speed vs convenience tradeoff
      • Consider hybrid formats such as VTD-XML
      • Don't use off-the-shelf XML when you need a database
  • Let enterprise-class tools do the heavy lifting
      • DBMS, middleware, inference engines ...
      • Specialized XML processing hardware
  • Leave technology evolution to Darwin, not Berners-Lee
      • Mature standards good, premature standardization bad. Problems and solutions will find each other!

14
Further Reading
  • Proceedings of W3C Workshop on Binary Interchange of XML Information Item Sets. http://www.w3.org/2003/08/binary-interchange-workshop/Report.html
  • Sun Microsystems position paper to W3C Workshop: Fast Web Services
  • Matthias Nicola, Jasmi John. XML Parsing: A Threat to Database Performance. CIKM '03
  • Christopher Kohlhoff, Robert Steele. Evaluating SOAP for High Performance Business Applications
  • Jimmy Zhang. Better, Faster XML Processing with VTD-XML. http://www.devx.com/xml/Article/22219
  • Michael Leventhal. Binary Showdown. XML Journal, September 2003
  • W3C XML Binary Characterization Use Cases. http://www.w3.org/TR/xbc-use-cases/