1
SIP Server Scalability
  • IRT Internal Seminar
  • Kundan Singh, Henning Schulzrinne
  • and Jonathan Lennox
  • May 10, 2005

2
Agenda
  • Why do we need scalability?
  • Scaling the server
  • SIP express router (Iptel.org)
  • SIPd (Columbia University)
  • Threads/Processes/Events
  • Scaling using load sharing
  • DNS-based, Identifier-based
  • Two stage architecture
  • Conclusions

27 slides
3
Internet telephony (SIP: Session Initiation Protocol)
[Figure labels: alice@yahoo.com, bob@example.com, yahoo.com, example.com, 192.1.2.4, 129.1.2.3, DB]
4
Scalability Requirements: Depends on role in the network architecture
[Figure labels: Cybercafe, ISP, IP network, IP phones, GW, MG, SIP/MGC, SIP/PSTN, Carrier network, PBX, T1 PRI/BRI, PSTN phones, PSTN]
5
Scalability Requirements: Depends on traffic type
  • Registration (uniform)
  • Authentication, mobile users
  • Call routing (Poisson)
  • stateful vs stateless proxy, redirect,
    programmable scripts
  • Beyond telephony (Don't know)
  • Instant message, presence (including sensors),
    device control
  • Stateful calls (Poisson arrival, exponential call
    duration)
  • Firewall, conference, voicemail
  • Transport type
  • UDP/TCP/TLS (cost of security)
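
The "Poisson arrival, exponential call duration" model above fixes how many calls a stateful server holds at once. The following is a minimal sketch, not taken from the slides: it applies Little's law (mean calls in progress = arrival rate x mean holding time) and checks it with a small simulation. The function names and the numbers in the usage note are illustrative assumptions.

```python
import random


def expected_concurrent_calls(arrival_rate_cps, mean_duration_s):
    """Little's law: mean calls in progress = arrival rate x mean holding time."""
    return arrival_rate_cps * mean_duration_s


def simulate_mean_concurrent_calls(arrival_rate_cps, mean_duration_s,
                                   horizon_s, seed=1):
    """Simulate Poisson arrivals with exponential durations and return the
    time-averaged number of calls in progress over [0, horizon_s]."""
    rng = random.Random(seed)
    now = 0.0
    busy_time = 0.0  # total call-seconds that fall inside the horizon
    while True:
        now += rng.expovariate(arrival_rate_cps)  # exponential inter-arrival gap
        if now >= horizon_s:
            break
        duration = rng.expovariate(1.0 / mean_duration_s)
        busy_time += min(duration, horizon_s - now)  # clip calls at the horizon
    return busy_time / horizon_s
```

For example, at an assumed 50 calls/s with a mean duration of 2 s, both functions give roughly 100 calls in progress, which is the amount of per-call state a stateful server would need to hold.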

6
SIPstone: SIP server performance metrics
SQL database
  • Steady state rate for
  • successful registration, forwarding and
    unsuccessful call attempts measured using 15 min
    test runs.
  • Measure requests/s with given delay constraint.
  • Performance = f(user, DNS, UDP/TCP, g(request), L),
    where g = type and arrival pdf (requests/s),
    L = logging?
  • For register, outbound proxy, redirect, proxy480,
    proxy200.
  • Parameters
  • Measurement interval, transaction response time,
    RPS (registers/s), CPS (calls/s), transaction
    failure probability
  • Delay budget R1
  • Shortcomings
  • does not consider forking, scripting, Via header,
    packet size, different call rates, SSL. Is there
    linear combination of results?
  • Whitebox measurements: turnaround time
  • Extend to SIMPLEstone

[Figure: message flow among Loader, Server and Handler. Registration: REGISTER, 200 OK (delay R1). Call: INVITE, 100 Trying, 180 Ringing, 200 OK (delay R2), ACK, BYE, 200 OK]
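
The idea of measuring "requests/s with given delay constraint" can be sketched as follows. This is only an illustration, not the SIPstone procedure itself: the pass criterion (a late transaction counts as a failure), the function names and the thresholds in the usage note are assumptions.

```python
def run_passes(response_times_s, delay_budget_s, max_failure_prob):
    """response_times_s has one entry per transaction: seconds, or None if it
    failed. Assumption: a transaction slower than the budget counts as failed."""
    if not response_times_s:
        return False
    bad = sum(1 for t in response_times_s
              if t is None or t > delay_budget_s)
    return bad / len(response_times_s) <= max_failure_prob


def steady_state_rate(runs, delay_budget_s, max_failure_prob):
    """runs maps an offered rate (requests/s) to that test run's response
    times. Returns the highest offered rate whose run passes, or None."""
    passing = [rate for rate, times in runs.items()
               if run_passes(times, delay_budget_s, max_failure_prob)]
    return max(passing) if passing else None
```

Given one test run per offered rate, `steady_state_rate(runs, 0.5, 0.05)` would report the highest rate at which at most 5% of transactions failed or exceeded an assumed 0.5 s budget.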
7
SIP server: What happens inside a proxy?
[Figure: processing steps inside a proxy. Legend: (blocking) I/O; critical section (lock); critical section (r/w lock)]

8
Lessons Learnt (sipd): In-memory database
  • Call routing involves (≥ 1) contact lookups
  • 10 ms per query (approx)
  • Cache (FastSQL)
  • Loading entire database is easy
  • Periodic refresh
  • Potentially useful for DNS lookups

[Figure labels: Web config, SQL database, Periodic Refresh, Cache]
[Plot: turnaround time vs RPS; Narayanan, 2002, single-CPU Sun Ultra10]
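
The "load the entire database, refresh periodically" idea can be sketched as below. This is a minimal illustration under assumptions, not sipd's implementation: `load_all` stands in for the SQL query that returns every contact, and the class and method names are hypothetical.

```python
import threading


class RefreshingCache:
    """Keeps a full in-memory copy of the contact table, reloaded periodically."""

    def __init__(self, load_all, refresh_interval_s):
        self._load_all = load_all          # callable returning {user: [contacts]}
        self._interval = refresh_interval_s
        self._lock = threading.Lock()
        self._table = load_all()           # initial full load
        self._stop = threading.Event()

    def lookup(self, user):
        """Answer from memory; never touches the database."""
        with self._lock:
            return list(self._table.get(user, []))

    def refresh(self):
        fresh = self._load_all()           # slow query runs outside the lock
        with self._lock:
            self._table = fresh            # swap in the new snapshot

    def start(self):
        """Refresh in a background thread until stop() is called."""
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(self._interval):
                self.refresh()
        threading.Thread(target=loop, daemon=True).start()

    def stop(self):
        self._stop.set()
```

The trade-off is staleness: a contact written to the database becomes visible to lookups only after the next refresh.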
9
Lessons Learnt (sipd): Thread-per-request does not scale
  • One thread per message
  • Doesn't scale
  • Too many threads over a short timescale
  • Stateless: 2-4 threads per transaction
  • Stateful: 30 s holding time
  • Thread pool + queue
  • Less thread overhead; more useful processing
  • Pre-fork processes for SIP-CGI
  • Overload management
  • Graceful failure, drop requests over responses
  • Not enough if holding time is high
  • Each request holds (blocks) a thread

[Plot: throughput vs load, comparing thread per request with thread pool with overload control]
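
A thread pool with a bounded queue and "drop requests over responses" overload control might look like the sketch below. It is an illustration under assumptions, not sipd's code: the class name, the two queue limits and the single-receiver assumption (only one thread calls `submit`) are all hypothetical.

```python
import queue
import threading


class OverloadControlledPool:
    """Fixed worker pool fed by a bounded queue; sheds new requests first."""

    def __init__(self, handler, workers=4, max_queued=100, request_limit=80):
        self._handler = handler
        self._queue = queue.Queue(maxsize=max_queued)
        self._request_limit = request_limit  # requests are refused above this depth
        self.dropped = 0                     # updated only by the submitting thread
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, message, is_response):
        """Return True if the message was queued, False if it was dropped."""
        if not is_response and self._queue.qsize() >= self._request_limit:
            self.dropped += 1    # shed new work early, keep room for responses
            return False
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _work(self):
        while True:
            message = self._queue.get()
            self._handler(message)
            self._queue.task_done()

    def drain(self):
        """Block until every queued message has been handled."""
        self._queue.join()
```

Responses are favored because they complete transactions that already consumed resources, whereas a dropped request can be retransmitted by the client. As the slide notes, this does not help when each request blocks a worker for a long holding time.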
10
What is the best architecture?
  • Event-based
  • Reactive system
  • Process pool
  • Each pool process receives and processes to the
    end (SER)
  • Thread pool
  • Receive and hand-over to pool thread (sipd)
  • Each pool thread receives and processes to the
    end
  • Staged event-driven: each stage has a thread pool

11
Stateless proxy: UDP, no DNS, six messages per call
[Figure: proxy processing flowchart. Labels: recvfrom or accept/recv; parse; Request; Response; REGISTER; other; Update DB; Lookup DB; Found; Build response; Redirect/reject; Proxy; Modify request; DNS; Match transaction; Modify response; stateless proxy; stateful; sendto, send or sendmsg]
12
Stateless proxy: UDP, no DNS, six messages per call
13
Stateful proxy: UDP, no DNS, eight messages per call
  • Event-based
  • Single thread: socket listener + scheduler/timer
  • Thread-per-message
  • pool_schedule -> pthread_create
  • Thread-pool1 (sipd)
  • Thread-pool2
  • N event-based threads
  • Each handles specific subset of requests
    (hash(call-id))
  • Receive and hand over to the correct thread
  • poll in multiple threads: bad on multi-CPU
  • Process pool
  • Not finished yet
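
The thread-pool2 design (N event-based threads, each owning the subset of calls selected by hash(call-id)) can be sketched as follows. This is a minimal illustration, not sipd's code: the class name and the use of CRC32 as the hash are assumptions. CRC32 is used here because Python's built-in `hash()` for strings varies between processes.

```python
import queue
import threading
import zlib


class CallIdDispatcher:
    """Receiver hands each message to the event thread that owns its Call-ID."""

    def __init__(self, handler, n_threads):
        self._handler = handler
        self._queues = [queue.Queue() for _ in range(n_threads)]
        for index, q in enumerate(self._queues):
            threading.Thread(target=self._event_loop, args=(index, q),
                             daemon=True).start()

    def thread_for(self, call_id):
        """Stable mapping from Call-ID to thread index."""
        return zlib.crc32(call_id.encode()) % len(self._queues)

    def dispatch(self, call_id, message):
        self._queues[self.thread_for(call_id)].put((call_id, message))

    def _event_loop(self, index, q):
        # One thread per queue: messages of a call are handled in arrival order
        # and per-call state needs no locking.
        while True:
            call_id, message = q.get()
            self._handler(index, call_id, message)
            q.task_done()

    def drain(self):
        """Block until all dispatched messages have been handled."""
        for q in self._queues:
            q.join()
```

Because every message of a call lands on the same thread, transaction state for that call is touched by one thread only, which removes most of the critical sections a shared pool would need.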

14
Stateful proxy: UDP, no DNS, eight messages per call
15
Lessons Learnt: What is the best architecture?
  • Stateless
  • CPU is bottleneck
  • Memory is constant
  • Process pool is the best
  • Event-based not good for multi-CPU
  • Thread/msg and thread-pool similar
  • Thread-pool2 close to process-pool
  • Stateful
  • Memory can become bottleneck
  • Thread-pool2 is good
  • But not N x CPU
  • Not good if P ? CPU
  • Process pool may be better (?)

16
Lessons Learnt (sipd): Avoid blocking function calls
  • DNS
  • 10-25 ms (29 queries)
  • Cache
  • 110 to 900 CPS
  • Internal vs external
  • non-blocking
  • Logger
  • Lazy logger as a separate thread
  • Date formatter
  • strftime(): 10% of REG processing
  • Update date variable every second
  • random32()
  • Cache gethostid(): 37 µs

Logger: while (1) { lock; writeall; unlock; sleep; }
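
The lazy logger loop and the "update date variable every second" trick can be combined in one sketch. This is an assumption-laden illustration rather than sipd's logger: the class name, the `sink` callable and the date format are hypothetical, and unlike the pseudocode above it writes after releasing the lock so that request threads are not held up by the write.

```python
import threading
import time

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # illustrative format


class LazyLogger:
    """Request threads only append to a list; one logger thread does the I/O."""

    def __init__(self, sink, flush_interval_s=1.0):
        self._sink = sink                  # callable taking a list of lines
        self._interval = flush_interval_s
        self._lock = threading.Lock()
        self._pending = []
        self._date = time.strftime(DATE_FORMAT, time.gmtime())

    def log(self, text):
        with self._lock:
            # Reuse the cached date string: no strftime call per message.
            self._pending.append(f"{self._date} {text}")

    def flush(self):
        with self._lock:
            lines, self._pending = self._pending, []
            self._date = time.strftime(DATE_FORMAT, time.gmtime())
        if lines:
            self._sink(lines)              # slow write happens outside the lock

    def run_forever(self):
        """Body of the logger thread: flush, then sleep."""
        while True:
            self.flush()
            time.sleep(self._interval)
```

With the default one-second interval, timestamps are accurate to about a second, which is the price of formatting the date once per flush instead of once per message.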
17
Lessons Learnt (sipd): Resource management
  • Socket management
  • Problems: OS limit (1024), liveness detection,
    retransmission
  • One socket per transaction does not scale
  • Global socket if downstream server is alive; soft
    state works for UDP
  • Hard for TCP/TLS: apply connection reuse
  • Socket buffer size
  • 64 KB to 128 KB; tradeoff: memory per socket vs
    number of sockets
  • Memory management
  • Problems: too many malloc/free, leaks
  • Memory pool
  • Transaction-specific memory, freed once; also
    less memcpy
  • About 30% performance gain
  • Stateful: 650 to 800 CPS; stateless: 900 to 1200
    CPS

18
Lessons Learnt (SER): Optimizations
  • Reduce copying and string operations
  • Data lumps, counted strings (5-10%)
  • Reduce URI comparison to local
  • User part as a keyword, use r2 parameters
  • Parser
  • Lazy parsing (2-6x), incremental parsing
  • 32-bit header parser (2-3.5x)
  • Use padding to align
  • Fast for general case (canonicalized)
  • Case compare
  • Hash-table, sixth bit
  • Database
  • Cache is divided into domains for locking

Jan Janak, SIP proxy server effectiveness, Master's thesis, Czech Technical University, 2003
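
The "32-bit header parser" and "sixth bit" case-compare bullets can be illustrated together. The sketch below is written in the spirit of those bullets and is not SER's parser: it reads the first four bytes of a header line as one integer, forces bit 0x20 in each byte (which lowercases ASCII letters) and looks the result up in a table. The function names and the four-entry table are assumptions.

```python
def lower32(four_bytes):
    """Pack four bytes into one integer and set the sixth bit (0x20) of each
    byte, which lowercases ASCII letters with a single OR."""
    return int.from_bytes(four_bytes, "big") | 0x20202020


# Precomputed 32-bit keys for a few header-name prefixes (illustrative subset).
HEADER_KEYS = {
    lower32(b"via:"): "Via",
    lower32(b"from"): "From",
    lower32(b"cseq"): "CSeq",
    lower32(b"call"): "Call-ID",
}


def classify_header(line):
    """Guess the header from its first four bytes, or return None.
    Setting 0x20 also alters some non-letters, so a full parser would still
    confirm the rest of the name before trusting the match."""
    if len(line) < 4:
        return None
    return HEADER_KEYS.get(lower32(line[:4]))
```

One integer comparison replaces four case-insensitive character comparisons, which is why the common headers can be recognized quickly.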
19
Lessons Learnt (SER): Protocol bottlenecks and other scalability concerns
  • Protocol bottlenecks
  • Parsing
  • Order of headers
  • Host names vs IP address
  • Line folding
  • Scattered headers (Via, Route)
  • Authentication
  • Reuse credentials in subsequent requests
  • TCP
  • Message length unknown until Content-Length
  • Other scalability concerns
  • Configuration
  • broken digest client, wrong password, wrong
    expires
  • Overuse of features
  • Use stateless instead of stateful if possible
  • Record route only when needed
  • Avoid outbound proxy if possible

20
Load Sharing: Distribute load among multiple servers
  • Single server scalability
  • There is a maximum capacity limit
  • Multiple servers
  • DNS-based
  • Identifier-based
  • Network address translation
  • Same IP address

21
Load Sharing (DNS-based): Redundant proxies and databases
  • REGISTER
  • Write to D1 and D2
  • INVITE
  • Read from D1 or D2
  • Database write/synchronization traffic becomes
    bottleneck

[Figure labels: proxies P1, P2, P3; databases D1, D2]
22
Load Sharing (Identifier-based): Divide the user space
  • Proxy and database on the same host
  • First-stage proxy may get overloaded
  • Use many
  • Hashing
  • Static vs dynamic

[Figure: user space split across proxy/database pairs: P1/D1 for a-h, P2/D2 for i-q, P3/D3 for r-z]
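
The first-stage routing decision for identifier-based load sharing can be sketched in two variants: the static alphabetical split from the figure, and a hash of the user name. Both are illustrations under assumptions. The function names are hypothetical, and CRC32 is just one possible hash; the slides do not say which one was used.

```python
import zlib

# Static split taken from the figure: first letter of the user name.
RANGES = [("a", "h", "P1"), ("i", "q", "P2"), ("r", "z", "P3")]


def server_by_range(user):
    """Pick the server whose letter range covers the user's first letter."""
    first = user[:1].lower()
    for low, high, server in RANGES:
        if low <= first <= high:
            return server
    return None  # e.g. names starting with a digit: a deployment needs a default


def server_by_hash(user, servers=("P1", "P2", "P3")):
    """Alternative: spread users by a hash of the (case-folded) user name."""
    return servers[zlib.crc32(user.lower().encode()) % len(servers)]
```

The alphabetical split is easy to reason about but follows the distribution of names, so one range can end up much busier than the others; a hash spreads users more evenly, although changing the number of servers then remaps most users.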
23
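Load Sharing: Comparison of the two designs
[Figure: the DNS-based design (proxies P1-P3 sharing databases D1, D2) beside the identifier-based design (proxy/database pairs for users a-h, i-q, r-z)]
Total time per DB
  • DNS-based: ((tr/D) + 1)TN = (A/D) + B
  • Identifier-based: ((tr + 1)/D)TN = (A/D) + (B/D)

D = number of database servers; N = number of
writes (REGISTER); r = reads/writes =
(INV + REG)/REG; T = write latency; t = read
latency/write latency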
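
As a worked check of the two expressions, the sketch below evaluates them directly. Matching terms gives A = trTN (total read time) and B = TN (total write time): in the DNS-based design every database takes all N writes, while in the identifier-based design both reads and writes are divided by D. The function names and the sample numbers are illustrative assumptions.

```python
def dns_based_time_per_db(D, N, r, T, t):
    """Reads are shared across D databases; every write goes to all of them."""
    return ((t * r) / D + 1) * T * N


def identifier_based_time_per_db(D, N, r, T, t):
    """Each database handles only its own share of reads and writes."""
    return ((t * r + 1) / D) * T * N


# Illustrative numbers: 3 databases, 1000 REGISTERs, 4 reads per write,
# 10 ms write latency, reads half as expensive as writes.
D, N, r, T, t = 3, 1000, 4, 0.01, 0.5
A, B = t * r * T * N, T * N   # A = 20 s of reads, B = 10 s of writes
dns_time = dns_based_time_per_db(D, N, r, T, t)        # A/D + B, about 16.7 s
id_time = identifier_based_time_per_db(D, N, r, T, t)  # A/D + B/D, about 10 s
```

With D = 1 the two designs coincide; as D grows, the DNS-based design stays bounded below by the write term B, which matches the earlier remark that write/synchronization traffic becomes the bottleneck.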
24
Scalability (and Reliability): Two-stage architecture for CINEMA
First stage (s1, s2, s3; backup ex), e.g. sip:bob@example.com:
  example.com _sip._udp SRV 0 40 s1.example.com
                        SRV 0 40 s2.example.com
                        SRV 0 20 s3.example.com
                        SRV 1 0 ex.backup.com
Second stage, group a (a1, a2), for a@example.com:
  a.example.com _sip._udp SRV 0 0 a1.example.com
                          SRV 1 0 a2.example.com
Second stage, group b (b1, b2), for b@example.com, e.g. sip:bob@b.example.com:
  b.example.com _sip._udp SRV 0 0 b1.example.com
                          SRV 1 0 b2.example.com
Request rate = f(stateless, groups). Bottleneck: CPU, memory, bandwidth?
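
How a client would use the first-stage SRV records can be sketched as follows: prefer the lowest priority value, and within it pick a target with probability proportional to its weight. This is a simplified illustration of SRV selection rather than a full implementation of the DNS SRV rules; the function name, the `is_alive` callback and the tuple layout (priority, weight, target) are assumptions.

```python
import random

# First-stage records from the slide, as (priority, weight, target).
FIRST_STAGE = [
    (0, 40, "s1.example.com"),
    (0, 40, "s2.example.com"),
    (0, 20, "s3.example.com"),
    (1, 0, "ex.backup.com"),
]


def pick_target(records, is_alive=lambda target: True, rng=random):
    """Choose among live targets of the lowest priority value, by weight."""
    by_priority = {}
    for priority, weight, target in records:
        if is_alive(target):
            by_priority.setdefault(priority, []).append((weight, target))
    if not by_priority:
        return None
    candidates = by_priority[min(by_priority)]
    weights = [w for w, _ in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)[1]   # all weights zero: pick uniformly
    return rng.choices([t for _, t in candidates], weights=weights, k=1)[0]
```

With these records, s1 and s2 would each receive about 40% of requests and s3 about 20%, while ex.backup.com is used only when none of the priority-0 servers is reachable. That is how the same records provide both load sharing and failover.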
25
Load Sharing: Result (UDP, stateless, no DNS, no mempool)
  S   P   CPS
  3   3   2800
  2   3   2100
  2   2   1800
  1   2   1050
  0   1    900

26
Lessons Learnt: Load sharing
  • Non-uniform distribution
  • Identifier distribution (bad hash function)
  • Call distribution: dynamically adjust
  • Stateless proxy
  • S = 1050, P = 900 CPS
  • S3P3: 10 million BHCA (busy hour call attempts)
  • Stateful proxy
  • S = 800, P = 650 CPS
  • Registration (no auth)
  • S = 2500, P = 2400 RPS
  • S3P3: 10 million subscribers (1 hour refresh)
  • Memory pool and thread-pool2/event-based further
    increase the capacity (approx 1.8x)
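
The headline figures follow from simple unit conversions, sketched below with hypothetical helper names. A sustained 2800 calls/s (the S3P3 result) corresponds to about 10 million call attempts in a busy hour, and 10 million subscribers who each re-register once per hour generate about 2800 registrations per second.

```python
def busy_hour_call_attempts(calls_per_second):
    """Call attempts in one hour at a sustained per-second rate."""
    return calls_per_second * 3600


def registrations_per_second(subscribers, refresh_interval_s):
    """Average REGISTER rate if every subscriber refreshes once per interval."""
    return subscribers / refresh_interval_s


bhca = busy_hour_call_attempts(2800)                # 10,080,000
rps = registrations_per_second(10_000_000, 3600)    # about 2778
```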

27
Conclusions and future work
  • Server scalability
  • Non-blocking, process/events/thread, resource
    management, optimizations
  • Load sharing
  • DNS, Identifier, two-stage
  • Current and future work
  • Measure process pool performance for stateful
  • Optimize sipd
  • Use thread-pool2/event-based (?)
  • Memory: use counted strings; clean after 200 (?)
  • CPU - use hash tables
  • Presence, call stateful and TLS performance
    (Vishal and Eilon)

28
Backup slides
29
Telephone scalability (PSTN: Public Switched Telephone Network)
[Figure labels: bearer network, telephone switch (SSP)]
30
SIP server: Comparison with HTTP server
  • Signaling (vs data) bound
  • No file I/O (exceptions: scripts, logging)
  • No caching; DB read and write frequencies are
    comparable
  • Transactions
  • Stateful: wait for response
  • Depends on external entities
  • DNS, SQL database
  • Transport
  • UDP in addition to TCP/TLS
  • Goals
  • Carrier class scaling using commodity hardware
  • Try not to customize/recompile OS or implement
    (parts of) server in kernel (khttpd, AFPA)

31
Related work: Scalability for (web) servers
  • Existing work
  • Connection dispatcher
  • Content/session-based redirection
  • DNS-based load sharing
  • HTTP vs SIP
  • UDP+TCP, signaling not bandwidth intensive, no
    caching of response, read/write ratio is
    comparable for DB
  • SIP scalability bottleneck
  • Signaling (chapter 4), real-time media data,
    gateway
  • 302 redirect to less loaded server, REFER session
    to another location, signal upstream to reduce

32
Related work: 3GPP (release 5)'s IP Multimedia core network Subsystem uses SIP
  • Proxy-CSCF (call session control function)
  • First contact in visited network. 911 lookup.
    Dialplan.
  • Interrogating-CSCF
  • First contact in operator's network.
  • Locate S-CSCF for register
  • Serving-CSCF
  • User policy and privileges, session control
    service
  • Registrar
  • Connection to PSTN
  • MGCF and MGW

33
Server-based vs peer-to-peer
34
Comparison of sipd and SER
  • sipd
  • Thread pool
  • Events (reactive system)
  • Memory pool
  • PentiumIV 3GHz, 1GB, 1200 CPS, 2400 RPS (no auth)
  • SER
  • Process pool
  • Custom memory management
  • PentiumIII 850 MHz, 512 MB, 2000 CPS, 1800 RPS