Title: VoIP Moving from protocols to architectures
1VoIP - Moving from protocols to architecture(s)
- Henning Schulzrinne
- Dept. of Computer Science
- Columbia University
- September 2005
2Overview
- The big transitions in VoIP
- An Internet protocol framework
- Open issues in VoIP and interactive multimedia
communications - service creation and programmable systems
- VoIP poll model ? presence model
- application sharing
- SIP architecture and design philosophy
- Philosophies Skype, IETF, NGS,
3Philosophy transition
PC era cell phone era
One computer/phone, many users
One computer/phone, one user
mainframe era home phone party line
Many computers/phones, one user
ubiquitous computing
anywhere, any time any media
right place (device), right time, right media
4Evolution of VoIP
how can I make it stop ringing?
does it do call transfer?
long-distance calling, ca. 1930
going beyond the black phone
amazing the phone rings
catching up with the digital PBX
1996-2000
2000-2003
2004-
5Collaboration in transition
inter-organization multiple technology
generations diverse end points
intra-organization small number of systems
(meeting rooms)
standards-based solutions
proprietary (single-vendor) systems
6Current challenges
- Protocol (point) challenges
- 9-1-1 support
- location mapping
- presence configuration and policy
- automated system configuration
- System challenges
- 9-1-1
- reliability (incl. consistent QoS)
- manageability
- by non-experts
- cross-domain AAA
- inter-domain trust
7Internet services the missing entry
8Filling in the protocol gap
9An eco system, not just a protocol
configures
XCAP (config)
SIMPLE policy RPID .
XCON (conferencing)
initiates
carries
SIP
RTSP
SDP
carries
controls
provide addresses
STUN TURN
RTP
10A constellation of SIP RFCs
Non-adjacent (3327) Symmetric resp.
(3581) Service route (3608) User agent caps
(3840) Caller prefs (3841)
Request routing
Resource mgt. (3312) Reliable prov. (3262) INFO
(2976) UPDATE (3311) Reason (3326)
SIP (3261) DNS for SIP (3263) Events (3265) REFER
(3515)
ISUP (3204) sipfrag (3240)
Mostly PSTN
Core
Content types
Digest AKA (3310) Privacy (3323) P-Asserted
(3325) Agreement (3329) Media auth. (3313) AES
(3853)
DHCP (3361) DHCPv6 (3319)
Configuration
Security privacy
11SIP a bi-cultural protocol
- multimedia
- IM and presence
- location-based service
- user-created services
- decentralized operation
- everyone equally suspect
- overlap dialing
- DTMF carriage
- key systems
- notion of lines
- per-minute billing
- early media
- ISUP BICC interoperation
- trusted service providers
12SIP is PBX/Centrex ready
boss/admin features
centrex-style features
attendant features
from Rohan Mahys VON Fall 2003 talk
13SIP design objectives
- new features and services
- support features not available in PSTN
- e.g., presence and IM, session mobility
- not a PSTN replacement
- not just SS7-over-IP
- even similar services use different models (e.g.,
call transfer) - client heterogeneity
- clients can be smart or dumb (terminal adapter)
- mobile or stationary
- hardware or software
- client multiplicity
- one user multiple clients one address
- multimedia
- nothing in SIP assumes a particular media type
Rosenberg/Schulzrinne draft-rosenberg-sipping-sip
-arch-00
14SIP architectural principles (1)
- proxies are for routing
- do not maintain call state
- availability
- scalability
- flexibility
- extensibility (new methods, services)
- end point call state and features
- dialog models, not call models
- does not standardize features
- endpoint fate sharing
- call fails only if endpoints fail
- component-based design
- building blocks
- call features notification and manipulation
- logical components, not physical
- UA, proxy, registrar, redirect server
- can be combined into one box
Rosenberg/Schulzrinne draft-rosenberg-sipping-sip
-arch-00
15SIP architectural principles (2)
- designed for the (large) Internet
- does not assume particular network topology
- congestion-controlled
- deals with packet loss
- uses core Internet services
- DNS for load balancing
- DHCP for configuration
- S/MIME for e2e security
- TLS for channel security
- generality over efficiency
- focuses on algorithm efficiency, not
constant-factor encoding efficiency - efficiency penalty is temporary, generality is
permanent - text encoding
- extensibility
- use shim layer for compression where needed
- allow splitting of functionality for scaling
16SIP architectural principles (3)
- separation of signaling and media
- path followed by media packets independent of
signaling path - allows direct routing of latency-sensitive media
packets (10 ms matters) - without constraining service delivery (1s
matters) - facilitates mobility
- avoid hair pinning, tromboning
- facilitates vertical split between ISP and VSP
17SIP design principles (1)
- Proxies are method, body and header independent
- does not depend on method
- except CANCEL, ACK
- can add new methods without upgrading proxies
- primarily rely on URI, Via, Route and
Record-Route header fields - extensions Accept-Contact and Request-Disposition
- may use anything to guide routing decision
- Full-state nature of INVITE
- each (re)INVITE contains full session state
- facilitates MIDCOM-style interactions
- allows session transfer
- SIP URIs identify resources
- can be device instance, service, person
- but cannot tell from URI which (good!)
- can specify services and service parameters
18SIP design principles (2)
- Extensibility and compatibility
- can define new methods, header fields, body
types, parameters - supported by OPTIONS, Accept, Accept-Language,
Allow, Supported, Require, Proxy-Require,
Accept-Encoding and Unsupported - asking permission
- OPTIONS, dialog establishment
- asking forgiveness
- use extension without asking
- (Proxy)-Require please reject if you dont
understand it - use if you like
- allow recipients to safely ignore information
- must provide fallback!
- Internationalization
- UTF-8 for freeform text
- negotiation of languages
- Explicit intermediaries
- SIP proxies
- unlike transparent HTTP proxies or NAT boxes,
announce themselves - Via, Record-Route
- only involved if asked by UA or proxy
- should ask endpoints, rather than just do
- e.g., session policy
19SIP design principles (3)
- Guided proxy routing
- predetermine a set of downstream proxy resource
that must be visited - supported by Record-Route, Path, Service-Route
- Transport protocol independence
- can use UDP, TCP, SCTP,
- only requires packet-based (unreliable) delivery
- design decision that comes with some regret ?
- Protocol reuse
- MIME for body transport
- S/MIME for end-to-end security
- HTTP header field and semantics
- HTTP digest authentication
- URI framework
- non-SIP URIs (e.g., tel)
- re-use TLS for channel security
- use DNS SRV and NAPTR for server failover and
reliability
20SIP division of labor
21Interconnection approaches
22IETF 4G (access-neutral) model
Check reputation of columbia.edu
sipalice_at_columbia.edu ? sipbob_at_example.com
TLS
columbia.edu
example.com
Visited network
NSIS NTLP for QoS
802.1x
DIAMETER server
AP
alice_at_isp.net
isp.net
23Session Border Controllers (SBCs)
- Provider border element
- SIP terms either B2BUA or proxies
- but often ill-defined (may change roles)
- Functions differ
- similar definitional problem as soft switches
- May force convergence of media and signaling path
24SBCs High-level motivations
- Why application-layer elements in SIP that are
not quite proxies? - SMTP has various MTAs, but they are just MTAs
(e.g., spam filter) - Guesses
- media vs. control separation
- good idea in theory, harder in todays
limited-functionality Internet - force media through single control point (IP
address) - rather than from millions of sources
- see Asterix, Skype
- proxy model of no content (SDP) inspection or
modification too limited - CALEA (needs to be invisible)
- charging for services
- not an issue for email and web
25SBC functionality, contd
- Signaling functionality
- Protocol Conversion H.323 ?? SIP
- Protocol integrity - SIP normalization
- ENUM SIP redirect
- Policy enforcement and access control
- CDR creation
- Firewall (dest. port, source)
- Least-cost routing
- Certificate handling
- Caller-ID authorization
- Signaling encryption
- S/MIME encapsulation
- TCP/UDP-TLS bridging
- DoS attack mitigation
- Media functionality
- Codec conversion
- SLA enforcement
- Legal Intercept CALEA compliance
- Bandwidth Management
- Packet marking
- QoS guarantees
- Packet steering
- Media encryption
- Firewall (pinholes)
- DoS attack mitigation
26SBC Network evolution
stand-alone networks (Vonage, Skype)
media
earlier email, IM
SBC
only IP-level (with filter)
27SBC Concerns
- Common concerns
- may drop some header fields
- may fail to understand some request methods
- may modify headers inserted by others
- may modify session descriptions
- may inspect session descriptions
- Not all SBCs do this all the time, but some do
some of this sometimes
28SBC The dangers
- May not be present in all instances
- SBCs are a box description, not a function
description - Lack of visibility
- cannot tell where SBC is located
- hard to diagnose failures
- see HTTP transparent proxy experience
- one example TP thought SIP was HTTP
- hard to address content cryptographically to such
box - Lack of transparency
- not all features make it through SBC
- header support
- copying content
- routing loops
- Lack of security
- Inherent conflict between need for media session
inspection and session privacy - Session description modification removes
accountability - Lack of scalability
- needs to handle all media packets
- often, call stateful
- rather than stateless or transaction-stateful
29Whats left to do?
- Transition from poll model to context-based
communications - Higher-level service creation in end systems
- Dealing with NATs
- STUN (and SIP modifications) as first step
- ICE and BEHAVE WG as longer-term solutions
- The role of intermediaries
- session-border controllers
- end-to-middle security
- session policies
- Conference control
- Application sharing
- Security issues (spam, spit --gt identity and
reputation management)
30The role of presence
- Guess-and-ring
- high probability of failure
- telephone tag
- inappropriate time (call during meeting)
- inappropriate media (audio in public place)
- current solutions
- voice mail ? tedious, doesnt scale, hard to
search and catalogue, no indication of when call
might be returned - automated call back ? rarely used, too inflexible
- ? most successful calls are now scheduled by email
- Presence-based
- facilitates unscheduled communications
- provide recipient-specific information
- only contact in real-time if destination is
willing and able - appropriately use synchronous vs. asynchronous
communication - guide media use (text vs. audio)
- predict availability in the near future (timed
presence)
Prediction almost all (professional)
communication will be presence-initiated or
pre-scheduled
31Context-aware communication
- context the interrelated conditions in which
something exists or occurs - anything known about the participants in the
(potential) communication relationship - both at caller and callee
32Basic presence
- Role of presence
- initially can I send an instant message and
expect a response? - now should I use voice or IM? is my call going
to interrupt a meeting? is the callee awake? - Yahoo, MSN, Skype presence services
- on-line off-line
- useful in modem days but many people are
(technically) on-line 24x7 - thus, need to provide more context
- simple status (not at my desk)
- entered manually ? rarely correct
- does not provide enough context for directing
interactive communications
33Presence data model
calendar
cell
manual
person (presentity) (views)
alice_at_example.com audio, video, text
r42_at_example.com video
services
devices
34Presence data architecture
presence sources
PUBLISH
raw presence document
privacy filtering
create view (compose)
depends on watcher
XCAP
XCAP
select best source resolve contradictions
composition policy
privacy policy
(not defined yet)
draft-ietf-simple-presence-data-model
35Presence data architecture
candidate presence document
raw presence document
post-processing composition (merging)
watcher filter
SUBSCRIBE
remove data not of interest
difference to previous notification
final presence document
watcher
NOTIFY
36Rich presence
- More information
- automatically derived from
- sensors physical presence, movement
- electronic activity calendars
- Rich information
- multiple contacts per presentity
- device (cell, PDA, phone, )
- service (audio)
- activities, current and planned
- surroundings (noise, privacy, vehicle, )
- contact information
- composing (typing, recording audio/video IM, )
37RPID rich presence
38The role of presence for call routing
PUBLISH
- Two modes
- watcher uses presence information to select
suitable contacts - advisory caller may not adhere to suggestions
and still call when youre in a meeting - user call routing policy informed by presence
- likely less flexible machine intelligence
- if activities indicate meeting, route to tuple
indicating assistant - try most-recently-active contact first (seq.
forking)
PA
NOTIFY
translate RPID
LESS
CPL
INVITE
39Presence and privacy
- All presence data, particularly location, is
highly sensitive - Basic location object (PIDF-LO) describes
- distribution (binary)
- retention duration
- Policy rules for more detailed access control
- who can subscribe to my presence
- who can see what when
lttuple id"sg89ae"gt ltstatusgt ltgpgeoprivgt
ltgplocation-infogt ltgmllocationgt
ltgmlPoint gmlid"point1 srsName"ep
sg4326"gt ltgmlcoordinatesgt374630N
1222510W lt/gmlcoordinatesgt
lt/gmlPointgt lt/gmllocationgt
lt/gplocation-infogt ltgpusage-rulesgt
ltgpretransmission-allowedgtno lt/gpretransmissi
on-allowedgt ltgpretention-expirygt2003-06-2
3T045729Z lt/gpretention-expirygt
lt/gpusage-rulesgt lt/gpgeoprivgt lt/statusgt
lttimestampgt2003-06-22T205729Zlt/timestampgt lt/tupl
egt
40Location-based services
- Finding services based on location
- physical services (stores, restaurants, ATMs, )
- electronic services (media I/O, printer, display,
) - not covered here
- Using location to improve (network) services
- communication
- incoming communications changes based on where I
am - configuration
- devices in room adapt to their current users
- awareness
- others are (selectively) made aware of my
location - security
- proximity grants temporary access to local
resources
41Location-based SIP services
- Location-aware inbound routing
- do not forward call if time at callee location is
11 pm, 8 am - only forward time-for-lunch if destination is on
campus - do not ring phone if Im in a theater
- outbound call routing
- contact nearest emergency call center
- send delivery_at_pizza.com to nearest branch
- location-based events
- subscribe to locations, not people
- Alice has entered the meeting room
- subscriber may be device in room ? our lab stereo
changes CDs for each person that enters the room
42Program location-based services
43Service creation
- Tailor a shared infrastructure to individual
users - traditionally, only by vendors (and sometimes
carriers) - learn from web models killer app ?vertical apps
44Automating media interaction service examples
- If call from my boss, turn off the stereo ? call
handling with device control - As soon as Tom is online, call him ? call
handling with presence information - Vibrate instead of ring when I am in movie
theatre ? call handling with location information - At 900AM on 09/01/2005, find the multicast
session titled ABC keynote and invite all the
group members to watch ? call handling with
session information - When incoming call is rejected, send email to the
callee ? call handling with email
45LESS simplicity
- Generality (few and simple concepts)
- Uniformity (few and simple rules)
- Trigger rule
- Switch rule
- Action rule
- Modifier rule
- Familiarity (easy for user to understand)
- Analyzability (simple to analyze)
modifiers
switches
trigger
actions
46LESS Decision tree
- No loops
- Limited variables
- Not necessarily
- Turing-complete
47LESS Safety
- Type safety
- Strong typing in XML schema
- Static type checking
- Control flow safety
- No loop and recursion
- One trigger appear only once, no feature
interaction for a defined script - Memory access
- No direct memory access
- LESS engine safety
- Ensure safe resource usage
- Easy safety checking
- Any valid LESS scripts can be converted into
graphical representation of decision trees.
48LESS snapshot
incoming call
ltlessgt ltincominggt ltaddress-switchgt
ltaddress issipmyboss_at_abc.com"gt
ltdeviceturnoff devicesipstereo_room
1_at_abc.com/gt ltmedia mediaaudiogt
ltaccept/gt lt/mediagt lt/addressgt
lt/address-switchgt lt/incominggt lt/lessgt
If the call from my boss
Turn off the stereo
Accept the call with only audio
trigger, switch, modifier, action
49LESS packages
- Use packages to group elements
email
web
im
conference
calendar
location
session
50When Tom is online,
- ltlessgt
- ltEVENTnotificationgt
- ltaddress-switchgt
- ltaddress is"siptom_at_example.com"gt
- ltEVENTevent-switchgt
- ltEVENTevent is"open"gt
- ltlocation url"siptom_at_example.com"gt
- ltIMim message"Hi, Tom"/gt
- lt/locationgt
- lt/EVENTeventgt
- lt/EVENTevent-switchgt
-
- lt/lessgt
51When I am in a movie theatre,
- ltlessgt
- ltincominggt
- ltlocation-switchgt
- ltlocation placetypequietgt
- ltalert soundnone vibrateyes/gt
- lt/locationgt
- lt/location-switchgt
- lt/incominggt
- lt/lessgt
52(No Transcript)
53Programming VoIP clients
- Precursor CTI
- but rarely used outside call centers
- Call external programs
- e.g., Google maps, local search
- Scripting APIs
- e.g., call Tcl or PHP scripts ? sip-cgi
- Controllable
- COM, XML RPC
- used for media agents in sipc
- Embeddable
- no UI, just signaling and media
54Interfacing with Google
911 caller location IM/presence location of
friends call Im here
55Interfacing with Google
show all files from caller Xiaotao Wu
56Embedding VoIP FAA training
controls pilot and ATC agents using multicast
and unicast (landlines)
57Conference control
- Setting up parameterized conferences
- SIP INVITE and NOTIFY suffice for basic dial-in
conference functionality and change notification - IETF XCON WG struggling with model and complexity
58XCON System
59Open issues application sharing
- Current T.120
- doesnt integrate well with other conference
control mechanisms - hard to make work across platforms (fonts)
- ill-defined security mechanisms
- Current web-based sharing
- hard to integrate with other media, control and
record - generally only works for Windows
- mostly limited to shared PowerPoint
- Current vnc
- whole-screen sharing only
- can be coerced into conferencing, but doesnt
integrate well with control protocols
60IETF effort standardized application sharing
- Remote access application sharing
- Four components
- window drawing ops ? PNG
- keyboard input
- mouse input
- window operations (raise, lower, move)
- Uses RTP as transport
- synchronization with continuous media
- but typically, TCP
- allow multicast ? large group sessions
61Spam and spit
62SIP unsolicited calls and messages
- Possibly at least as large a problem
- more annoying (ring, pop-up)
- Bayesian content filtering unlikely to work
- ? identity-based filtering
- PKI for every user unrealistic
- Use two-stage authentication
- SIP identity work
mutual PK authentication (TLS)
home.com
Digest
63Domain Classification
- Classification of domains based on their identity
instantiation and maintenance procedures plus
other domain policies. - Admission controlled domains
- Strict identity instantiation with long term
relationships - Example Employees, students, bank customers
- Bonded domains
- Membership possible only through posting of bonds
tied to a expected behavior - Membership domains
- No personal verification of new members but
verifiable identification required such as a
valid credit card and/or payment - Example E-bay, phone and data carriers
- Open domains
- No limit or background check on identity creation
and usage - Example Hotmail
- Open, rate limited domains
- Open but limits the number of messages per time
unit and prevents account creation by bots - Example Yahoo
64Reputation service
David
Carol
has sent IM to
has sent email to
Frank
Emily
is this a spammer?
Bob
Alice
65SIP standards deployment issues and competition
- Interoperability
- Proprietary systems
66Provider combinations
Cisco CM
Skype
software
hardware
mobile operators?
cable DSL op
ISP IAP
VSP
67VoIP service providers
mostly IM no PSTN (now)
Google MSN Yahoo! Xbox
ATT (Vantage) Verizon (VoiceWing) JP SoftBank
(8.3m)
primary line replacement voice only
Cable (911,000) ComCast
Skype Vonage (1m)
68Protocol interoperability problems
- Three core interoperability problems
- syntactic robustness
- You mean you could have a space there?
- often occurs when testing only against common
reference implementations - need stress test (also for buffer overflows)
- implementation by protocol example
- limiting assumptions (e.g., user name format)
- see SIP Robustness Testing for Large-Scale Use,
First International Workshop on Software Quality
(SOQA) - semantic assumptions
- I didnt expect this error
- mutually incompatible extensions
- expect extension to make something work
69Protocol interoperability proprietary protocols
- Proprietary protocol
- Example Skype
- quicker evolution not dependent on IETF
volunteers with day jobs - can do hacks without IESG objection
- media over TCP
- inefficient search
- bypass routing policies
- circumvent firewall policies
- Can only reverse-engineer ? only
backwards-compatibility problems - incentive to force upgrades (see Microsoft Word)
- less Metcalfes law value
70Why is Skype successful?
- All the advantages of a proprietary protocol
- Peer-to-peer coincidental
- Good out-of-box experience
- Software vendor service provider
- Didnt know that you couldnt do voice quality
beyond PSTN - others too focused on PSTN interoperability why
do better voice than PSTN? - Simpler solutions for NAT traversal
- use TCP if necessary
- use port 80
- Did encryption from the very beginning
- Kazaa marketing vehicle
71Skype vs. SIP-based systems
72Open standard, dominant vendor
- Example H.323
- doesnt matter what the standard says
- NetMeeting and H.323 ? test with Microsoft
implementation - limits feature evolution to dominant vendor speed
and interests
73Open standard, multiple vendors
- Example SIP
- More than just one application
- Software UAs, proxies, phones, gateways, media
servers, test tools, OAM, - interoperability problems likely until product
maturity - harder to test internally against all (competing)
products - divergent views and communities in IETF and other
SDOs - likely have to support union of requirements
- emphasis on extensibility, modularity and
protocol re-use - ? temptation to not implement everything
- security
- SIP generality over efficiency
- better long-term outcome, but slower
74The SIP complexity fallacy
- IAX (for example) is simpler than SIP
- but you could build the IAX functionality in SIP
at just about the same complexity - no proxies
- no codec negotiation
- no distributed services
- Difficulty extracting those simple pieces from
269 pages of specification ( SDP RTP) ? - SIP still more complex due to signaling-data
separation
IAX model
Signaling Media
Signaling Media
Signaling
Signaling
Media
SIP, H.323, MCGP model
75Does it have to be that complicated?
- highly technical parameters, with differing names
- inconsistent conventions for user and realm
- made worse by limited end systems (configure by
multi-tap) - usually fails with some cryptic error message and
no indication which parameter - out-of-box experience not good
76Solving the configuration mess
- Initial development assumed enterprise deployment
- pre-configured via tftp or (rarely) DHCP
- not suitable for residential use, except if box
is shipped - pathetic security password accessible to
anybody who knows MAC address of phone - Short term
- adopt simple default conventions
- should only need SIP URI (AoR), display name and
password - realm URI
- outbound proxy domain
- provide and expose error feedback
- not authentication failure
- but realm not recognized change to user_at_domain
format - use DNS NAPTR and SRV for STUN server
77Solving the configuration mess longer term
- IETF efforts on configuration management
- retrieve via HTTP ( TLS)
- change notification via SIP event notification
- problem of configuring initial secret remains
- probably need embedded public keys
- Still need better diagnostics
- one-way voice issues
- authentication failures
78Conclusion
- Slow transition from emulating PSTN to new
services - presence-based
- embedded (e.g., games)
- Emphasis moving from protocol mechanics to
architecture - slow transition to open systems
- different combinations of software vendors,
IAP/ISP, VSP, hardware vendors - Still need to fill out infrastructure for
collaboration and presence - Standardization bodies face challenges of overlap
and resource exhaustion