Title: Tools for Automated Verification of Web Services
1Tools for Automated Verification of Web Services
- Tevfik Bultan
- Department of Computer Science
- University of California, Santa Barbara
- bultan_at_cs.ucsb.edu
- http://www.cs.ucsb.edu/~bultan/
2Characteristics of Web Services
- Web services: Web-accessible software applications which interact with each other through the Internet
- Goals
- Platform independent (.NET, J2EE)
- Dynamic service discovery
- Loosely coupled
- Tolerate pauses in availability and slow data transmission
- Approach
- Standardized data transmission: XML
- Interaction through standardized interfaces: WSDL
- Asynchronous messaging
3Web Service Standards
[Figure: the web service standards stack, top to bottom]
- Interaction: WSCI
- Composition: BPEL4WS
- Service: WSDL
- Message: SOAP
- Type: XML Schema
- Data: XML
- Implementation platforms: Microsoft .Net, Sun J2EE
4Challenges in Verification of Web Services
- Distributed nature, no central control
- How do we model the global behavior?
- How do we specify the global properties?
- Asynchronous messaging introduces undecidability in analysis
- How do we check the global behavior?
- How do we enforce the global behavior?
- XML data manipulation
- How do we specify the XML messages?
- How do we verify properties related to data?
5Outline
- Web Service Composition Model
- Capturing Global Behaviors
- Conversations
- Top-Down vs. Bottom-Up Specification and Verification
- Realizability vs. Synchronizability
- XML messaging
- MSL, XPath
- Translation to Promela
- Web Service Analysis Tool
- Conclusions and Future Work
Collaborators: Xiang Fu, Jianwen Su, Rick Hull
6An Example Stock Analysis Service
- Three peers: Investor (Inv), Stock Broker (SB), and Research Department (RD)
- Inv initiates the stock analysis service by sending a register message to the SB
- The SB may accept or reject the registration
- If the registration is accepted, the SB sends an analysis request to the RD
- RD sends the results of the analysis directly to the Inv as a report
- After receiving a report the Inv can either send an ack to the SB or cancel the service
- Then, the SB either sends the bill for the services to the Inv, or continues the service with another analysis request
7An Example Stock Analysis Service (SAS)
- SAS is a composite web service
- a finite set of peers: Inv, SB, RD
- and a finite set of message classes: register, ack, cancel, accept, reject, bill, request, terminate, report
[Figure: message classes exchanged between the peers]
- Inv to SB: register, ack, cancel
- SB to Inv: accept, reject, bill
- SB to RD: request, terminate
- RD to Inv: report
8Communication Model
- We assume that the messages among the peers are exchanged using reliable and asynchronous messaging
- FIFO and unbounded message queues
[Figure: req messages sent by the Stock Broker (SB) wait in the input queue of the Research Dept. (RD).]
- This model is similar to industry efforts such as
- JMS (Java Message Service)
- MSMQ (Microsoft Message Queuing Service)
9Composite Web Service Execution
[Figure: the Investor, Stock Broker Firm, and Research Dept. peers drawn as state machines with send (!) and receive (?) transitions over register, accept, reject, request, report, ack, cancel, bill, and terminate. The peers are connected by message queues holding reg, ack, acc, rep, bil, req, and ter messages.]
10Conversations
- A virtual watcher records the messages as they
are sent
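[Figure: Investor (Inv), Stock Broker (SB), and Research Dept. (RD) exchange the messages reg, acc, req, rep, ack, bil, and ter through their queues while the watcher records each message as it is sent.]
- A conversation is a sequence of messages the watcher sees during an execution
- Bultan, Fu, Hull, Su WWW03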
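The watcher idea can be illustrated with a small Python sketch. This is not the authors' tooling: the `Peer` class, the `send` and `receive` helpers, and the particular run shown are assumptions made for illustration. Each peer has one unbounded FIFO input queue, and the watcher appends a message to the conversation at the moment it is sent, not when it is consumed.

```python
from collections import deque


class Peer:
    """A peer with a single unbounded FIFO input queue (illustrative model)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()


def send(watcher, receiver, message):
    """Record the message at send time, then enqueue it at the receiver."""
    watcher.append(message)
    receiver.queue.append(message)


def receive(peer):
    """Consume the message at the head of the peer's input queue."""
    return peer.queue.popleft()


inv, sb, rd = Peer("Inv"), Peer("SB"), Peer("RD")
conversation = []  # what the virtual watcher sees

# One hypothetical run of the Stock Analysis Service.
send(conversation, sb, "register")   # Inv -> SB
receive(sb)
send(conversation, inv, "accept")    # SB -> Inv
send(conversation, rd, "request")    # SB -> RD, before Inv has consumed accept
receive(rd)
send(conversation, inv, "report")    # RD -> Inv
receive(inv)                         # Inv consumes accept
receive(inv)                         # Inv consumes report
send(conversation, sb, "ack")        # Inv -> SB
receive(sb)
send(conversation, inv, "bill")      # SB -> Inv
send(conversation, rd, "terminate")  # SB -> RD
receive(inv)
receive(rd)

print(" ".join(conversation))
```

Because sends are recorded independently of receives, the conversation reflects the send order only; a message may sit in a queue for an arbitrary time before it is consumed.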
11Effects of Asynchronous Communication
- Question: Given a composite web service, is the set of conversations a regular set?
- Even when messages do not have any content and the peers are finite state machines, the conversation set may not be regular
- Reason: asynchronous communication with unbounded queues
- Bounded queues or synchronous communication ⇒ conversation set is always regular
12Properties of Conversations
- The notion of conversation enables us to reason about temporal properties of the composite web services
- LTL framework extends naturally to conversations
- LTL temporal operators
- X (neXt), U (Until), G (Globally), F (Future)
- Atomic properties
- Predicates on message classes (or contents)
- Example: G(accept ⇒ F bill)
- Model checking problem: Given an LTL property, does the conversation set satisfy the property?
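As a rough illustration of the example property, the Python sketch below checks G(accept ⇒ F bill) on a single finite conversation. Interpreting LTL over a finite list of messages is a simplifying assumption made here; the function name `globally_implies_eventually` is hypothetical and is not part of any tool mentioned in the talk.

```python
def globally_implies_eventually(conversation, trigger, response):
    """Check G(trigger => F response) on a finite conversation.

    For every position holding `trigger`, some position at or after it
    must hold `response`.
    """
    for i, message in enumerate(conversation):
        if message == trigger and response not in conversation[i:]:
            return False
    return True


billed = ["register", "accept", "request", "report", "ack", "bill", "terminate"]
unbilled = ["register", "accept", "request", "report", "cancel"]
rejected = ["register", "reject"]

print(globally_implies_eventually(billed, "accept", "bill"))
print(globally_implies_eventually(unbilled, "accept", "bill"))
print(globally_implies_eventually(rejected, "accept", "bill"))
```

The third conversation satisfies the property vacuously because accept never occurs. A model checker answers the harder question of whether every conversation in the set satisfies the property.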
13Bottom-Up vs. Top-Down
- Bottom-up approach
- Specify the behavior of each peer
- The global communication behavior (conversation set) is implicitly defined based on the composed behavior of the peers
- Global communication behavior is hard to understand and analyze
- Top-down approach
- Specify the global communication behavior (conversation set) explicitly as a protocol
- Ensure that the conversations generated by the peers obey the protocol
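14[Figure: top-down and bottom-up views of a composition of Peer A, Peer B, and Peer C]
- Conversation schema: A sends msg1 to B; B sends msg2 and msg6 to A; B sends msg3 and msg5 to C; C sends msg4 to B
- Top-down: a conversation protocol with transitions A→B:msg1, B→A:msg2, B→C:msg3, C→B:msg4, B→C:msg5, B→A:msg6 is checked against the LTL property G(msg1 ⇒ F(msg3 ∨ msg5))
- Bottom-up: each peer is a state machine with send (!) and receive (?) transitions over msg1 to msg6 and an input queue; the conversations seen by the virtual watcher are checked against the same LTL property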
15Conversation Protocols
- Conversation Protocol
- An automaton that accepts the desired conversation set
- A conversation protocol is a contract agreed by all peers
- Each peer must act according to the protocol
- For reactive protocols with infinite message sequences we use
- Büchi automata which accept infinite strings
- For specifying message contents, we use
- Guarded automata
- Guards are constraints on the message contents
16SAS Conversation Protocol
- This conversation protocol specifies the set of
conversations for the SAS
[Figure: the SAS conversation protocol as an automaton with states 1 to 12 and transitions labeled register, reject, accept, request, report, ack, cancel, bill, and terminate.]
17Synthesize Peer Implementations
- Conversation protocol specifies the global communication behavior
- How do we implement the peers?
- How do we obtain the contracts that peers have to obey from the global contract specified by the conversation protocol?
- Project the global protocol to each peer
- By dropping unrelated messages for each peer
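The projection step can be sketched in Python as follows. The tuple encoding of transitions, the state numbers, and the function name `project` are assumptions for illustration; the fragment only covers the first few steps of the stock analysis example. A transition that does not involve the chosen peer keeps its source and target states but loses its label (an epsilon move).

```python
def project(transitions, peer):
    """Project protocol transitions (src, sender, receiver, msg, dst) onto one peer.

    Sends become '!msg', receives become '?msg', and transitions that do not
    involve the peer become unlabeled (None) epsilon moves.
    """
    projected = []
    for src, sender, receiver, msg, dst in transitions:
        if peer == sender:
            label = "!" + msg
        elif peer == receiver:
            label = "?" + msg
        else:
            label = None
        projected.append((src, label, dst))
    return projected


# Hypothetical fragment of a protocol for the stock analysis example.
protocol = [
    (1, "Inv", "SB", "register", 2),
    (2, "SB", "Inv", "reject", 3),
    (2, "SB", "Inv", "accept", 4),
    (4, "SB", "RD", "request", 5),
    (5, "RD", "Inv", "report", 6),
]

for peer in ("Inv", "SB", "RD"):
    print(peer, project(protocol, peer))
```

In practice the epsilon moves would then be eliminated and the automaton determinized and minimized; that step is omitted here.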
18Interesting Question
- Are the conversations generated by the projected services equal to the conversations specified by the conversation protocol?
- If this equality holds, the conversation protocol is realizable
- Are there conditions which ensure the equivalence?
19Realizability Problem
- Not all conversation protocols are realizable!
- Conversation protocol: A→B: m1 followed by C→D: m2
- Conversation m2 m1 will be generated by all peer implementations which follow the protocol
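A short Python sketch makes the counterexample concrete. It assumes the protocol is the single conversation m1 m2 and that the projected senders A and C act independently, so the watcher can see any interleaving of their sends; the helper `interleavings` is hypothetical.

```python
def interleavings(sequences):
    """Return every interleaving of the given send sequences as a set of tuples."""
    pending = [s for s in sequences if s]
    if not pending:
        return {()}
    result = set()
    for i, seq in enumerate(pending):
        # Let sequence i send its next message, then interleave the rest.
        rest = pending[:i] + [seq[1:]] + pending[i + 1:]
        for tail in interleavings(rest):
            result.add((seq[0],) + tail)
    return result


protocol = {("m1", "m2")}   # A->B: m1, then C->D: m2
sends_of_a = ("m1",)        # projection of the protocol onto A
sends_of_c = ("m2",)        # projection of the protocol onto C

generated = interleavings([sends_of_a, sends_of_c])
print(sorted(generated))
print(sorted(generated - protocol))
```

The set difference contains m2 m1: C has no way to learn that m1 has been sent, so nothing prevents it from sending first.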
20Another Non-Realizable Protocol
[Figure: a conversation protocol over peers A, B, and C with transitions A→B: m1, B→A: m2, and A→C: m3, shown with the projected peers, their message queues, and the watcher.]
- Generated conversation: m2 m1 m3
21Realizability Conditions
- Three sufficient conditions for realizability (no message content) Fu, Bultan, Su, CIAA03, TCS04
- Lossless join
- Conversation set should be equivalent to the join of its projections to each peer
- Synchronous compatible
- When the projections are composed synchronously, there should not be a state where a peer is ready to send a message while the corresponding receiver is not ready to receive
- Autonomous
- At any state, each peer should be able to do only one of the following: send, receive or terminate
- (a peer can still choose among multiple messages)
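The autonomous condition is the simplest of the three to check mechanically. The Python sketch below assumes a peer is given as a list of (state, label, state) transitions with '!' marking sends and '?' marking receives, plus a set of final states; this encoding and the name `is_autonomous` are illustrative assumptions, and the two sample peers are made up.

```python
def is_autonomous(transitions, final_states):
    """Return True if no state mixes sending, receiving, and terminating."""
    states = set(final_states)
    for src, _label, dst in transitions:
        states.update((src, dst))
    for state in states:
        kinds = set()
        for src, label, _dst in transitions:
            if src == state:
                kinds.add("send" if label.startswith("!") else "receive")
        if state in final_states:
            kinds.add("terminate")
        if len(kinds) > 1:
            return False
    return True


# A peer that may either send m1 or receive m2 from the same state.
mixed_peer = [(0, "!m1", 1), (0, "?m2", 1)]
# A peer that chooses among several sends, which the condition allows.
choosing_peer = [(0, "!r1", 1), (0, "!r2", 1)]

print(is_autonomous(mixed_peer, {1}))
print(is_autonomous(choosing_peer, {1}))
```

Choosing among several messages of the same kind is allowed, which matches the parenthetical remark that a peer can still choose among multiple messages.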
22Realizability Conditions
- Following protocols fail one of the three
conditions but satisfy the other two
[Figure: three small conversation protocols built from transitions such as A→B: m1, B→A: m2, C→D: m2, C→A: m2, and A→C: m3, labeled "Not lossless join", "Not autonomous", and "Not synchronous compatible".]
23Bottom-Up Approach
- We know that analyzing conversations of composite web services is difficult due to asynchronous communication
- The question is
- Can we identify the composite web services where asynchronous communication does not create a problem?
24Three Examples, Example 1
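[Figure: requester and server automata. The requester sends r1, r2, and e (!r1, !r2, !e) and receives a1 and a2 (?a1, ?a2); the server receives r1, r2, and e and sends a1 and a2.]
- Conversation set is regular: (r1a1 | r2a2)* e
- During all executions the message queues are bounded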
25Example 2
[Figure: requester and server automata. The requester sends r1, r2, and e and receives a1 and a2; the server receives r1, r2, and e and sends a1 and a2.]
- Conversation set is not regular
- Queues are not bounded
26Example 3
[Figure: requester and server automata. The requester sends r1, r2, r, and e and receives a; the server receives r1, r2, r, and e and sends a.]
- Conversation set is regular: (r1 | r2 | ra)* e
- Queues are not bounded
27State Spaces of the Three Examples
[Figure: number of states (in thousands) versus queue length for the three examples.]
- Verification of Examples 2 and 3 is difficult even if we bound the queue length
- How can we distinguish Examples 1 and 3 (with regular conversation sets) from 2?
- Synchronizability Analysis
28Synchronizability Analysis
- A composite web service is synchronizable, if its conversation set does not change when asynchronous communication is replaced with synchronous communication
- A composite web service is synchronizable, if it satisfies the synchronous compatible and autonomous conditions
- Fu, Bultan, Su WWW04
29Are These Conditions Too Restrictive?
Problem Set                              Size                    Synchronizable?
Source                      Name         msg   states   trans.
ISSTA04                     SAS          9     12       15       yes
IBM Conv. Support Project   CvSetup      4     4        4        yes
IBM Conv. Support Project   MetaConv     4     4        6        no
IBM Conv. Support Project   Chat         2     4        5        yes
IBM Conv. Support Project   Buy          5     5        6        yes
IBM Conv. Support Project   Haggle       8     5        8        no
IBM Conv. Support Project   AMAB         8     10       15       yes
BPEL spec                   shipping     2     3        3        yes
BPEL spec                   Loan         6     6        6        yes
BPEL spec                   Auction      9     9        10       yes
Collaxa.com                 StarLoan     6     7        7        yes
Collaxa.com                 Cauction     5     7        6        yes
30Web Service Analysis Tool (WSAT)
[Figure: WSAT architecture, from web services (front end) through an intermediate representation and analysis to verification languages (back end)]
- Bottom-up path: BPEL is translated to guarded automata (BPEL to GFSA), followed by synchronizability analysis
- On success: GFSA to Promela (synchronous communication)
- On fail: GFSA to Promela (bounded queue)
- Top-down path: a conversation protocol is read by the GFSA parser into a guarded automaton, followed by realizability analysis
- On success: GFSA to Promela (single process, no communication)
- The figure also shows a "skip" path past the analysis
- http://www.cs.ucsb.edu/~su/WSAT/
- Fu, Bultan, Su CAV04
31Guarded Automata Model
- Uses XML messages
- Uses MSL for declaring message types
- MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
- Uses XPath expressions for guards
- XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
32The Guarded Automata Model
// type declaration
request[ id[int] ]
// message declaration
r2: request
// local variable declaration
last: request
[Figure: requester automaton with transitions !r1, !r2, ?a1, ?a2, and !e]
Guard: a2/id = last/id => r2/id := last/id + 1, last/id := last/id + 1
33XML (eXtensible Markup Language)
- XML is a markup language like HTML
- Similar to HTML, XML tags are written as
- <tag> followed by </tag>
- HTML vs. XML
- In HTML, tags are used to describe the appearance of the data
- <b> </b> <i> </i> <br> <p> ...
- In XML, tags are used to describe the content of the data rather than the appearance
- <date> </date> <address> </address>
34An XML Document and Its Tree
<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>
- XML documents can be modeled as trees
- where each internal node corresponds to a tag and leaf nodes correspond to basic types
35XML Schema
- XML provides a standard way to exchange data over the Internet.
- However, the parties which exchange XML documents still have to agree on the type of the data
- What are the tags that will appear in the document, in what order, etc.
- XML Schema is a language for defining XML data types
- MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
36MSL (Model Schema Language)
- Basic MSL syntax
- g → ε | b | t[g] | g{m,n} | g , g | g & g | g | g
- g is an XML type (i.e., an MSL type expression)
- ε is the empty sequence
- b is a basic type such as string, boolean, int, etc.
- t is a tag
- m and n are positive integers
- [ ], { }, the comma, & and | are MSL type constructors
37MSL Semantics
- t[g] denotes a type with root node labeled t with children of type g
- g{m,n} denotes a sequence of size at least m and at most n where each member is of type g
- g1 , g2 denotes an ordered sequence where the first member is of type g1 and the second member is of type g2
- g1 & g2 denotes an unordered sequence where one member is of type g1 and the other member is of type g2
- g1 | g2 denotes a choice between type g1 and type g2, i.e., either type g1 or type g2, but not both
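The semantics above can be turned into a small membership check. The Python sketch below is an illustration only, not the WSAT implementation: the tuple encoding of types, the encoding of XML nodes as (tag, children) pairs, and the names `matches` and `_split` are all assumptions. Each member of a repetition is assumed to be non-empty.

```python
BASIC = {"string": str, "int": int, "boolean": bool}


def matches(g, items):
    """Return True if the list of sibling items has MSL type g."""
    kind = g[0]
    if kind == "empty":                      # the empty sequence
        return items == []
    if kind == "basic":                      # b
        if len(items) != 1:
            return False
        value = items[0]
        if g[1] == "int" and isinstance(value, bool):
            return False                     # bool is a subclass of int in Python
        return isinstance(value, BASIC[g[1]])
    if kind == "tag":                        # t[g]
        return (len(items) == 1 and isinstance(items[0], tuple)
                and items[0][0] == g[1] and matches(g[2], items[0][1]))
    if kind == "seq":                        # g1 , g2
        return any(matches(g[1], items[:k]) and matches(g[2], items[k:])
                   for k in range(len(items) + 1))
    if kind == "all":                        # g1 & g2 (either order)
        return (matches(("seq", g[1], g[2]), items)
                or matches(("seq", g[2], g[1]), items))
    if kind == "choice":                     # g1 | g2
        return matches(g[1], items) or matches(g[2], items)
    if kind == "repeat":                     # g{m,n}
        _, member, low, high = g
        return any(_split(member, items, k) for k in range(low, high + 1))
    raise ValueError("unknown type constructor: " + kind)


def _split(member, items, count):
    """Split items into exactly `count` non-empty pieces of type `member`."""
    if count == 0:
        return items == []
    return any(matches(member, items[:i]) and _split(member, items[i:], count - 1)
               for i in range(1, len(items) + 1))


stock = ("tag", "stockID", ("basic", "int"))
register_type = ("tag", "Register", ("seq",
    ("tag", "investorID", ("basic", "string")),
    ("seq",
     ("tag", "requestList", ("repeat", stock, 1, 3)),
     ("tag", "payment", ("choice",
                         ("tag", "creditCardNum", ("basic", "int")),
                         ("tag", "accountNum", ("basic", "int")))))))

document = ("Register", [
    ("investorID", ["VIP01"]),
    ("requestList", [("stockID", [1]), ("stockID", [2])]),
    ("payment", [("accountNum", [425])]),
])

print(matches(register_type, [document]))
```

The sample type and document mirror the Register example used elsewhere in the talk, with the stock IDs and account number stored as integers.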
38An MSL Type Declaration and an Instance
<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>
Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]
39Translating Guarded Automata to Promela
- We used the SPIN model checker to verify the properties of conversations
- SPIN is a finite state model checker
- we restricted XML message contents to finite domains
- We translate guarded automata models to Promela (input language of the SPIN model checker)
- First, translate MSL type declarations to Promela type declarations
- Then, translate XPath expressions to Promela code
40Mapping MSL types to Promela
- Basic types
- integer and boolean types are mapped to Promela basic types int and bool
- We only allow constant string values and strings are mapped to enumerated type (mtype) in Promela
- Other type constructors are handled using
- structured types (declared using typedef) in Promela
- or arrays
41Mapping MSL type constructors to Promela
- t[g] is translated to a typedef declaration
- g{m,n} is translated to an array declaration
- g1 , g2 is translated to a sequence of type declarations
- g1 | g2 is translated to a sequence of type declarations and an enumerated variable which is used to record which type is chosen
- g1 & g2 is not handled! We do not handle unordered type sequence (it can cause state-space explosion)
42Example
Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]
typedef t1_investorID { mtype stringvalue; }
typedef t2_stockID { int intvalue; }
typedef t3_requestList { t2_stockID stockID[3]; int stockID_occ; }
typedef t4_accountNum { int intvalue; }
typedef t5_creditCard { int intvalue; }
mtype = { m_accountNum, m_creditCard }
typedef t6_payment {
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register {
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}
43XPath
- In order to write specifications or programs that manipulate XML documents we need
- an expression language to access values and nodes in XML documents
- XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
- An XPath query defines a function which
- takes an XML tree and a context node (in the same tree) as input and
- returns a set of nodes (in the same tree) as output
44XPath Syntax
- Basic XPath syntax
- q → . | .. | * | b | t | /q | //q | q / q | q // q | q [ q ] | q [ exp ]
- q is an XPath query
- exp denotes a predicate on basic types, i.e., on the leaf nodes of the XML tree
- b denotes a basic type such as string, boolean, int, etc.
- t denotes a tag
45XPath Semantics
- Given an XML tree and a node n as a context node
- . returns n
- .. returns the parent of n
- Given an XML tree and a set of nodes
- * returns all the nodes
- b returns the nodes that are of basic type b
- t returns the nodes which are labeled with tag t
46XPath Semantics Contd.
- Starting at the context node
- /q returns the nodes that match q
- //q returns the nodes that match q starting at any descendant
- q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1
- q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1
- q1[q2] applies q2 to the children of the nodes which match q1
- q[exp] returns the nodes that match q and for children of which the expression exp evaluates to true
47Examples
- //payment/* returns the node labeled accountNum
- /Register/requestList/stockID/int returns the nodes labeled 0001 and 0002
- //stockID[int > 1]/int returns the node labeled 0002
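The three example queries can be approximated with Python's standard `xml.etree.ElementTree` module, as sketched below. ElementTree implements only a subset of XPath: it has no basic-type steps such as int and no numeric comparison predicates, so the last query is emulated with an ordinary Python filter. Treat this as an illustration of the query semantics, not of the XPath-to-Promela translation.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
"""

root = ET.fromstring(DOCUMENT)  # root is the Register element

# //payment/* : every child of any payment node
payment_children = [node.tag for node in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : the integer leaves under stockID
stock_ids = [node.text.strip() for node in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int : numeric predicate emulated in Python
large_ids = [node.text.strip() for node in root.iter("stockID")
             if int(node.text) > 1]

print(payment_children)
print(stock_ids)
print(large_ids)
```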
48XPath to Promela
- Generate code that evaluates the XPath expression
- Fu, Bultan, Su ISSTA04
- Traverse the XPath expression from left to right
- Code generated in each step is inserted into the BLANK spaces left in the code from the previous step
- A tree representation of the MSL type is used to keep track of the context of the generated code
- Uses two data structures
- Type tree shows the structure of the corresponding MSL type
- Abstract statements which are mapped to Promela code
49Statement
Statement      Promela Code
IF(v)          if :: v -> BLANK :: else -> skip fi
FOR(v,l,h)     v = l - 1; do :: v < h -> BLANK; v++ :: else -> break od
EMPTY          BLANK
INC(v)         v++
SET(v,a)       v = a
50Type Tree
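Register[ investorID[string] , requestList[ stockID[int]{1,3} ] , payment[ creditCardNum[int] | accountNum[int] ] ]
[Figure: type tree with numbered nodes]
- 1 Register
- 2 investorID, with child 3 string
- 4 requestList, with child 5 stockID (idx i1), which has child 6 int
- 7 payment, with children 8 creditCard (child 9 int) and 10 accountNum (child 11 int)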
51Generated Statements
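register // stockID [int() > 5] [position() = last()] / int()
[Figure: the sequence of abstract statements generated for this expression, including EMPTY, FOR(i1,1,3), and IF(i2 == i3), each annotated with its position in the type tree (nodes 1, 5, and 6). The condition inserted for the first predicate is v_register.requestlist.stockID[i1] > 5.]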
52request//stockID = register//stockID[int()>5][position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2 = 1;
/* pre-calculate the value of last(), store in i3 */
i4 = 0; i5 = 1; i3 = 0;
do
:: i4 < v_register.requestList.stockID_occ ->
     /* compute first predicate */
     bRes3 = false;
     if
     :: v_register.requestList.stockID[i4].intvalue > 5 -> bRes3 = true
     :: else -> skip
     fi;
     if
     :: bRes3 -> i5++; i3++
     :: else -> skip
     fi;
     i4++
:: else -> break
od;
53request//stockID = register//stockID[int()>5][position()=last()]
i1 = 0;
do
:: i1 < v_register.requestList.stockID_occ ->
     bRes1 = false;
     if
     :: v_register.requestList.stockID[i1].intvalue > 5 -> bRes1 = true
     :: else -> skip
     fi;
     if
     :: bRes1 ->
          bRes2 = false;
          if
          :: (i2 == i3) -> bRes2 = true
          :: else -> skip
          fi;
          if
          :: bRes2 ->
               if
               :: (v_request.stockID.intvalue ==
                   v_register.requestList.stockID[i1].intvalue) -> bResult = true
               :: else -> skip
               fi
          :: else -> skip
          fi;
          i2++
     :: else -> skip
     fi;
     i1++
:: else -> break
od;
54Model Checking Using Promela
- Found subtle errors in an example
- SAS: Stock Analysis Service (Fu, Bultan, Su ISSTA04)
- 3 peers: Investor, Broker, ResearchDept.
- Investor → Broker: a registerList of stockIDs
- Broker → ResearchDept.
- relay request (1 stockID per request)
- find the stockID in the latest request, send its subsequent stockID in registerList
- Repeating stockID will cause error.
- Only discoverable by analysis of XPath expressions
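The slide does not spell out how the error manifests, so the Python sketch below shows only one plausible reading: if the broker locates the latest stockID by value, a repeated stockID sends it back to the first occurrence and later entries are never requested. The function names, the list encoding of registerList, and the `max_requests` cutoff are assumptions for illustration.

```python
def next_stock_id(register_list, latest):
    """Find `latest` by value and return the entry that follows it, if any."""
    index = register_list.index(latest)  # always the FIRST occurrence
    if index + 1 < len(register_list):
        return register_list[index + 1]
    return None


def relay(register_list, max_requests=10):
    """Relay one stockID per request until the list is exhausted or a cutoff hits."""
    sent = []
    current = register_list[0]
    while current is not None and len(sent) < max_requests:
        sent.append(current)
        current = next_stock_id(register_list, current)
    return sent


print(relay([1, 2, 3]))
print(relay([1, 2, 1, 3]))
```

With distinct stockIDs every entry is relayed once. With the repeated entry the broker cycles between the first two stockIDs until the cutoff, and stockID 3 is never requested.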
55Related Work
- Conversation specification
- IBM Conversation support project http://www.research.ibm.com/convsupport/
- Conversation support for business process integration Hanson, Nandi, Kumaran EDOCC02
- Orchestrating computations on the world-wide web Choi, Garg, Rai, Misra, Vin EuroPar02
- Realizability problem
- Realizability of Message Sequence Charts (MSC) Alur, Etessami, Yannakakis ICSE00, ICALP01
56Related Work
- Verification of web services
- Simulation, verification, composition of web services using a Petri net model Narayanan, McIlraith WWW02
- BPEL verification using a process algebra model and Concurrency Workbench Koshkina, van Breugel TAV-WEB03
- Using MSC to model BPEL web services which are translated to labeled transition systems and verified using model checking Foster, Uchitel, Magee, Kramer ASE03
- Model checking Web Service Flow Language specifications using SPIN Nakajima ICWE04
57Current and Future Work
- Extending the source and target languages
- Symbolic analysis
- Fu, Bultan, Su ICWS04
- Abstraction
- Design for verification for web services
- Betin-Can, Bultan 04
58Current and Future Work
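[Figure: planned extension of the tool architecture, from web service specification languages (front end) through an intermediate representation and analysis to verification languages (back end)]
- Web service specification languages: BPEL, DAML-S, WSCI, conversation protocols, ...
- Front end: translator for bottom-up specifications and translator for top-down specifications
- Intermediate representation: guarded automata (bottom-up) or a guarded automaton (top-down)
- Analysis: synchronizability analysis, realizability analysis, and automated abstraction, with a "skip" path
- Back end, bottom-up: translation with synchronous communication on success, translation with bounded queue on fail
- Back end, top-down: translation with single process, no communication on success
- Verification languages: Promela, SMV, Action Language, ...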