Revealing Skype Traffic: When Randomness Plays with You

1
Revealing Skype Traffic: When Randomness Plays with You
  • D. Bonfiglio (1), M. Mellia (1), M. Meo (1), D. Rossi (2), P. Tofanelli (3)
  • (1) Dipartimento di Elettronica, Politecnico di Torino
  • (2) ENST Télécom Paris
  • (3) Motorola Inc.
  • ACM SIGCOMM 2007

Presented by Te-Yuan Huang
2
Outline
  • Goal
  • Contribution
  • Know More about Skype
  • Classifiers
  • Experiments
  • Conclusions

4
Goal
  • Identify Skype traffic within aggregated traffic
      • Direct sessions
      • Over either UDP or TCP
  • The algorithm should
      • Work in real time
      • Be reliable
      • Detect short flows (lasting only a few seconds)

5
Outline
  • Goal
  • Contribution
  • Know More about Skype
  • Classifiers
  • Experiments
  • Conclusions

6
Importance of Skype Traffic Identification
  • Of interest to network operators for
      • Network design and provisioning
      • Traffic and performance monitoring
      • Tariff policies
      • Traffic differentiation

7
Difference from Related Work
  • K.T. Chen et al., "Quantifying Skype USI"
      • Identifies only UDP traffic
      • Needs the Skype login phase to be monitored
      • Fails on backbone links
      • Fails if the Skype login procedure is modified
  • K. Suh et al., "Characterizing and Detecting Relayed Traffic: A Case Study Using Skype"
      • Identifies only relayed Skype traffic

8
Outline
  • Goal
  • Contribution
  • Know More about Skype
  • Classifiers
  • Experiments
  • Conclusions

9
Let's get our hands dirty: know more about Skype traffic sources
A Skype Message
10
Skype Parameters
  • Rate
      • Codec rate
  • Delta T
      • Skype message framing time
      • The time between two subsequent Skype messages
  • RF (Redundancy Factor)
      • The number of past blocks that Skype retransmits

11
Parameter Changes with Network Conditions
12
Skype Communication Mode
  • End-to-End (E2E)
      • A Skype user calls another Skype user
  • End-to-Out (E2O)
      • Skype-in/Skype-out
      • PSTN involved
      • Voice data only
      • No video, file transfer, or IM

13
Skype Codec
  • Codecs
      • Automatically selected
  • iSAC
      • The preferred codec for E2E
  • G.729
      • The preferred codec for E2O

14
More on Skype Message
  • Skype encrypts its messages
  • TCP
      • Reliable transport
      • Packets arrive in the correct sequence (from the application layer's point of view)
      • The whole content of the message is encrypted
  • UDP
      • Unreliable
      • Packets may arrive out of order
      • An application-layer header is needed to restore the correct order
      • That header can only be obfuscated
      • Only part of the message is encrypted

15
TCP E2E Message
[Figure: TCP E2E message layout by byte (1, 2, 3, ...); the whole message is the Frame]
  • All ciphered

16
UDP E2E Message
[Figure: UDP E2E message layout by byte: ID (bytes 1-2), Fun (byte 3), Frame (byte 4 onward)]
  • Identified fields
  • ID: 16-bit identifier
      • Randomly selected
  • Fun: 5-bit field, extracted with the mask 0x8f
      • States the payload type
      • 0x02, 0x03, 0x07, 0x0f: signaling message
      • 0x0d: data message (all 4 types of data)
      • Not random, but obfuscated (mixed)
  • Frame: ciphered information
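
The field description above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name, the big-endian reading of the ID, and the "unknown" label are assumptions made for the example; only the mask 0x8f and the listed Fun values come from the slide.

```python
# Fun values listed on the slide
SIGNALING_FUN = {0x02, 0x03, 0x07, 0x0F}
DATA_FUN = 0x0D
FUN_MASK = 0x8F  # mask from the slide; keeps 5 bits of byte 3


def parse_udp_e2e_header(payload: bytes):
    """Return (msg_id, fun, kind) for a candidate UDP E2E payload, or None if too short."""
    if len(payload) < 3:
        return None
    # Bytes 1-2: random 16-bit ID (byte order is an assumption of this sketch)
    msg_id = int.from_bytes(payload[0:2], "big")
    # Byte 3: Fun field, recovered by masking
    fun = payload[2] & FUN_MASK
    if fun in SIGNALING_FUN:
        kind = "signaling"
    elif fun == DATA_FUN:
        kind = "data"
    else:
        kind = "unknown"
    return msg_id, fun, kind
```

For example, a third byte of 0x7d masks to 0x0d, so the message would be labeled as data regardless of the three obfuscated bits.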

17
E2O Message
[Figure: E2O message layout by byte: CID (bytes 1-4), followed by the Frame]
  • Identified field
  • CID: 4 bytes
      • Connection Identifier (CID) of the PSTN gateway
      • Deterministic after the initial signaling

18
Outline
  • Goal
  • Contribution
  • Know More about Skype
  • Classifiers
  • Experiments
  • Conclusions

19
How to Identify Skype Traffic?
  • Chi-Square Classifier (CSC)
      • Uses knowledge of the ciphering mechanism
  • Naïve Bayes Classifier (NBC)
      • Uses the general characteristics of VoIP traffic
  • Payload-Based Classifier (PBC)
      • Looks into the non-ciphered start of message (SoM)
      • Used only for UDP traffic

20
Chi-Square Classifier (CSC)
  • Purpose
      • Determine which portions of a message are encrypted
  • Rationale: given a message,
      • If only the third byte is not random: probably an E2E Skype flow over UDP
      • If the first four bytes are deterministic and the rest is ciphered: probably an E2O Skype flow over UDP
      • If the whole message is ciphered: probably a Skype flow over TCP

21
Chi-Square Classifier (CSC) Cont.
  • Chi-Square distribution
      • Observe the object's output n_TOT times
      • There are n possible outputs
      • The i-th output is expected to occur E_i times out of n_TOT, and is observed to occur O_i times
      • Then $\chi^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$ approximately follows a Chi-Square distribution with n-1 degrees of freedom
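
As a small illustration of the statistic described above, the following Python sketch computes Pearson's Chi-Square value from observed and expected counts. The function name and the die example are assumptions made for this illustration.

```python
def chi_square_statistic(observed, expected):
    """Pearson's statistic: sum over outputs of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: a six-sided die rolled 60 times, so E_i = 10 for each face
expected = [10] * 6
uniform = chi_square_statistic([10, 10, 10, 10, 10, 10], expected)
skewed = chi_square_statistic([5, 15, 10, 10, 10, 10], expected)
```

A perfectly uniform outcome gives a statistic of 0, and the value grows as the observed counts move away from the expected ones.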

22
Chi-Square Classifier (CSC) Cont.
  • For each flow, take the first G groups of b bits from every message
  • Each group g has 2^b possible outputs
  • If the content of the flow is random, then E_i for each group is n_TOT / 2^b

[Figure: message split into consecutive b-bit groups, numbered 1, 2, 3, ..., G]
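
The grouping step can be sketched as below. This is only one way to do it, under the assumption that groups are taken most-significant bit first from the start of the payload; the function names are hypothetical, and the defaults G = 16 and b = 4 are the values the deck reports using.

```python
from collections import Counter


def bit_groups(payload: bytes, G: int = 16, b: int = 4):
    """Return the first G groups of b bits of a payload as integers (MSB first)."""
    nbytes = (G * b + 7) // 8  # bytes needed to cover G*b bits
    if len(payload) < nbytes:
        raise ValueError("payload too short for the requested groups")
    value = int.from_bytes(payload[:nbytes], "big")
    total_bits = nbytes * 8
    mask = (1 << b) - 1
    return [(value >> (total_bits - (g + 1) * b)) & mask for g in range(G)]


def group_counts(payloads, G: int = 16, b: int = 4):
    """For each group position, count how often each of the 2^b values occurs."""
    counts = [Counter() for _ in range(G)]
    for p in payloads:
        for g, v in enumerate(bit_groups(p, G, b)):
            counts[g][v] += 1
    return counts
```

The per-group counters are the observed frequencies that a Chi-Square test would compare against the uniform expectation n_TOT / 2^b.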


23
Chi-Square Classifier (CSC) Cont.
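  • Evaluate the test statistic of each group g as $\chi^2_g = \sum_{i=0}^{2^b-1} (O^g_i - E_i)^2 / E_i$, where O^g_i is the number of times value i is observed in group g
  • Define the thresholds by [equation not preserved in this transcript]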

24
Chi-Square Classifier (CSC) Cont.
  • G = 16 groups of b = 4 bits are used
  • E2E over UDP
      • Block g = 5 or 6 is mixed
      • The others are random
  • Classification criteria

25
Chi-Square Classifier (CSC) Cont.
  • E2O over UDP
  • E2E or E2O over TCP
  • Not Skype: otherwise
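
The exact criteria are not preserved in this transcript, so the following Python sketch is only one possible reading of the rationale stated earlier in the deck (third byte not random for E2E over UDP, first four bytes deterministic for E2O over UDP, everything ciphered for TCP). It assumes G = 16 groups of b = 4 bits and a single hypothetical threshold `thr` below which a group is treated as random; the authors' actual thresholds and conditions may differ.

```python
def classify_csc(chi2, thr):
    """Classify a flow from its 16 per-group Chi-Square values (chi2[0] is group 1)."""
    random_like = [x < thr for x in chi2]
    # Whole message looks ciphered: Skype over TCP
    if all(random_like):
        return "Skype over TCP"
    # Only byte 3 (groups 5 and 6) deviates from randomness: E2E over UDP
    others_random = all(r for g, r in enumerate(random_like, start=1) if g not in (5, 6))
    if others_random:
        return "Skype E2E over UDP"
    # Bytes 1-4 (groups 1-8) non-random, the rest random: E2O over UDP
    if not any(random_like[:8]) and all(random_like[8:]):
        return "Skype E2O over UDP"
    return "not Skype"
```

The order of the checks matters: the TCP test runs first, so by the time the E2E branch is reached at least one of groups 5 and 6 is known to be non-random.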

26
Chi-Square Classifier (CSC) Cont.
  • Test statistic of a deterministic block
  • Grows linearly with n_TOT
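
The linear growth can be checked with a short derivation, assuming the idealized case where a b-bit block always takes the same value, so one output is observed n_TOT times and the other 2^b - 1 outputs are never observed, while the expected count under randomness is E = n_TOT / 2^b for every output:

```latex
\begin{align*}
\chi^2 &= \frac{(n_{TOT} - E)^2}{E} + (2^b - 1)\,\frac{(0 - E)^2}{E},
        \qquad E = \frac{n_{TOT}}{2^b} \\
       &= n_{TOT}\,\frac{(2^b - 1)^2}{2^b} + n_{TOT}\,\frac{2^b - 1}{2^b} \\
       &= n_{TOT}\,(2^b - 1)
\end{align*}
```

With b = 4 this gives 15 n_TOT, for example 1500 when n_TOT = 100.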

27
Chi-Square Classifier (CSC) Cont.
  • Mixed block
  • If one bit is fixed and the others are random
  • The test statistic increases linearly with n_TOT
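
A quick numeric check of the mixed case, as a sketch under idealized assumptions: one bit of the block is fixed, so only half of the 2^b values can occur, and the remaining bits are perfectly uniform (the observed counts equal their expectation exactly). The function names are hypothetical.

```python
def chi_square_uniform(observed, n_tot):
    """Chi-Square statistic against a uniform expectation E = n_tot / len(observed)."""
    e = n_tot / len(observed)
    return sum((o - e) ** 2 / e for o in observed)


def mixed_block_counts(n_tot, b=4):
    """Idealized counts for a block with one fixed bit and b-1 uniformly random bits."""
    reachable = 2 ** (b - 1)  # values compatible with the fixed bit
    return [n_tot / reachable] * reachable + [0] * reachable


stat_100 = chi_square_uniform(mixed_block_counts(100), 100)
stat_200 = chi_square_uniform(mixed_block_counts(200), 200)
```

In this idealized setting the statistic equals n_TOT itself, so doubling the number of observed messages doubles the statistic, while a fully random block would stay near its 2^b - 1 degrees of freedom.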

28
Chi-Square Classifier (CSC) Cont.
29
Chi-Square Classifier (CSC) Cont.
  • The Chi-Square test works only if the number of observations is large enough, that is
      • E_i = n_TOT / 2^b > 5
      • With b = 4, this means n_TOT > 80
  • Choose n_TOT = 100
  • Also, set [value not preserved in this transcript]

30
Naïve Bayes Classifier
  • Feature vector x = [x_i]
  • P(C|x): the probability that the object belongs to class C, given that feature x is observed
  • P(x|C): the probability that feature x is observed, given that the object belongs to class C
  • Bayes' rule
      • P(C|x) = P(x|C) P(C) / P(x)

31
Naïve Bayes Classifier cont.
  • Naïve: features are assumed independent, so P(x|C) = Π_i P(x_i|C)
  • P(x|C) is called the belief
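
To make the independence assumption concrete, here is a minimal Python sketch of Bayes' rule with per-feature likelihoods multiplied together. The class names, likelihood values, and priors are invented for illustration and do not come from the presentation.

```python
import math


def naive_bayes_posterior(likelihoods, priors):
    """Return P(C|x) for each class C.

    likelihoods[C] is the list of per-feature values P(x_i|C);
    priors[C] is P(C). Independence lets us multiply the P(x_i|C).
    """
    joint = {c: priors[c] * math.prod(likelihoods[c]) for c in priors}
    evidence = sum(joint.values())  # P(x), summed over all classes
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical two-class example with two features
posterior = naive_bayes_posterior(
    likelihoods={"skype": [0.8, 0.5], "other": [0.1, 0.2]},
    priors={"skype": 0.5, "other": 0.5},
)
```

Here the joint values are 0.2 and 0.01, so the posterior for the hypothetical "skype" class is 0.2 / 0.21.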

32
NBC Feature Selection
  • VoIP traffic has
      • Small message sizes
      • Less burstiness than data traffic
  • Features
      • Message size
          • Observe a window of w messages at a time: x = [s_1, s_2, ..., s_w]
      • Average inter-packet gap (average IPG)

33
NBC Feature Selection
  • Belief
      • How to determine P(s_i|C)?

34
NBC Feature Characterization
  • For each codec, the message size is determined by
      • Rate
      • Header length
      • Redundancy factor (RF)
      • Message framing time (Delta T)
  • The message size can be modeled by a Gaussian distribution
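
As a rough illustration of how these four parameters could combine, the sketch below assumes each message carries one new codec block plus RF retransmitted past blocks, on top of a fixed header. This formula is an assumption made for the example, not one taken from the slides, and the 4-byte header is a hypothetical value; the 8 kbit/s rate is the nominal G.729 bit rate.

```python
def expected_message_size(rate_bps, delta_t_ms, rf, header_bytes):
    """Assumed model: header + (1 + RF) blocks of rate * Delta T bits each."""
    block_bytes = rate_bps * delta_t_ms / 1000 / 8  # one framing interval of codec output
    return header_bytes + (1 + rf) * block_bytes


# G.729 at 8 kbit/s with a 20 ms framing time and a hypothetical 4-byte header
no_redundancy = expected_message_size(8000, 20, rf=0, header_bytes=4)
one_past_block = expected_message_size(8000, 20, rf=1, header_bytes=4)
```

Under this assumed model the value would serve as the mean of the Gaussian for a given codec setting, with the variance capturing deviations from it.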

35
NBC Feature Characterization
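  • Map each codec to a Gaussian distribution
  • Model the average IPG as a Gaussian distribution with [parameters not preserved in this transcript]

[Equations: one for constant-bit-rate codecs, one for variable-bit-rate codecs; not preserved in this transcript]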
36
NBC Derive Beliefs
37
NBC Make Decision
  • Let B denote the belief of the flow [definition not preserved in this transcript]
  • Define a threshold Bmin
  • If B > Bmin
      • Valid Skype flow
  • Otherwise
      • Not a Skype flow
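
The thresholding step can be sketched as follows. Because the definition of B is not preserved here, this example assumes B is the average Gaussian log-likelihood of the message sizes in a window; that choice, the function names, and all numeric values are assumptions for illustration only.

```python
import math


def log_gaussian(x, mu, sigma):
    """Log of the Gaussian density with mean mu and standard deviation sigma."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def window_belief(sizes, mu, sigma):
    """Assumed belief: mean log-likelihood of the observed message sizes."""
    return sum(log_gaussian(s, mu, sigma) for s in sizes) / len(sizes)


def is_valid_skype_flow(belief, b_min):
    """Decision rule from the slide: valid Skype flow if B > Bmin."""
    return belief > b_min


# Hypothetical codec model (mean 24 bytes, std 2) and threshold Bmin = -5
close = window_belief([24, 24], mu=24, sigma=2)
far = window_belief([100, 100], mu=24, sigma=2)
```

Sizes near the modeled mean give a belief close to the maximum log-density, while sizes far from it drive the belief well below any reasonable threshold.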

38
Payload Based Classifier (PBC)
  • Used as a cross-check for the previous two classifiers
  • Only useful for UDP traffic
  • Two parts
      • Per-flow identification
      • Per-host identification

39
PBC - Per-flow Identification
  • Uses the knowledge about the UDP E2E message
  • Fun: 5-bit field, extracted with the mask 0x8f
      • States the payload type
      • 0x02, 0x03, 0x07, 0x0f: signaling message
      • 0x0d: data message (all 4 types of data)

[Figure: UDP E2E message layout by byte: ID (bytes 1-2), Fun (byte 3), Frame (byte 4 onward)]
40
PBC - Per-flow Identification
  • Terminology
      • n_TOT: the total number of packets in the flow
      • n_sig: the number of Skype signaling messages
      • n_E2E: the number of Skype E2E data/video/chat/voice messages
      • n_E2O: the number of Skype E2O voice messages
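
These counters could be gathered per flow roughly as below. This is a sketch under assumptions: Fun is read from the third byte with the mask 0x8f as described for UDP E2E messages, and an E2O message is recognized by comparing its first four bytes with a CID that the caller already knows (the `cid` parameter and the order of the checks are hypothetical).

```python
SIGNALING_FUN = {0x02, 0x03, 0x07, 0x0F}
DATA_FUN = 0x0D


def count_message_types(payloads, cid=None):
    """Count n_tot, n_sig, n_e2e and n_e2o over the UDP payloads of one flow."""
    counts = {"n_tot": 0, "n_sig": 0, "n_e2e": 0, "n_e2o": 0}
    for p in payloads:
        counts["n_tot"] += 1
        # Assumed E2O test: first four bytes equal the known connection identifier
        if cid is not None and p[:4] == cid:
            counts["n_e2o"] += 1
            continue
        if len(p) < 3:
            continue  # too short to carry a Fun field
        fun = p[2] & 0x8F
        if fun in SIGNALING_FUN:
            counts["n_sig"] += 1
        elif fun == DATA_FUN:
            counts["n_e2e"] += 1
    return counts
```

A per-flow criterion would then compare these counts with n_tot; the actual criteria used in the presentation are not reproduced here.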

41
PBC - Per-flow Identification
  • Criteria

42
PBC - Per-host Identification
  • Known: a Skype client always uses the same UDP port to send and receive traffic
  • Before a conversation starts,
      • Signaling messages are exchanged between the two clients
  • This makes it possible to identify a Skype client running at a specific IP address and port

43
PBC - Per-host Identification
  • Criteria to identify the Skype client IP/port

44
Experiment
  • Two data sets
  • Campus: 95 hours, collected on 2006/5/29
      • No P2P traffic is allowed
      • Most traffic is TCP data flows
  • ISP: one day, collected on 2006/5/15
      • All traffic is allowed
      • More heterogeneous
      • Little Skype traffic expected

45
Measurement Result
46
Measurement Result UDP, Campus
47
Measurement Result UDP, ISP
48
Measurement Result - TCP
49
Parameter Tuning - Bmin
50
Parameter Tuning - χ²(Thr)
51
Parameter Tuning - Bmin and χ²(Thr)
52
Parameter Tuning - Bmin and χ²(Thr)
53
Conclusion
  • Reveal Skype traffic from aggregate streams of packets
  • Two approaches
      • Statistical properties of randomness
      • Stochastic characteristics of voice traffic
  • Negligible false positives
  • Few false negatives