Traffic Data Retention - PowerPoint PPT Presentation

1
Traffic Data Retention
  • The System Administrators Guild of Ireland
  • www.sage-ie.org

2
Table of Contents
  • Introduction
  • Internet Fundamentals
  • Network Fundamentals
  • Network Speeds
  • Storage Costs

3
Introduction
  • Traffic Data Retention
  • Phone numbers, duration, location
  • Email addresses, attachments, size
  • Web pages visited, search terms
  • Non-technical audience
  • A few technical people may have slipped in

4
Running order
  • Fundamentals: Dónal Cunningham
  • Email: Liam Bedford
  • WWW: Dónal Cunningham
  • Q&A session
  • Social session

5
Disclaimer
  • Mobile phones
  • not a court in the land would convict me
  • Slides?
  • Eco-friendly (on the web tomorrow)
  • Bias
  • We're sysadmins. We do tech.

6
Table of Contents
  • Introduction
  • Internet Fundamentals
  • Network Fundamentals
  • Network Speeds
  • Storage Costs

7
Internet Fundamentals I
  • Networks link computers together
  • Internets link networks together
  • The Internet
  • Links a LOT of networks together
  • A world of ends
  • Uses addresses to identify devices attached to it

8
A Picture with a Pretty Cloud
[Diagram; surviving labels: The Internet, Amazon]

9
Bits and Bytes, oh my!
  • Bit = tiny piece of information
  • 8 bits in a byte
  • Just remember the ratio
  • Kilo = 1,024 (roughly a thousand)
  • Mega = 1,024² (roughly a million)
  • Giga = 1,024³ (roughly a thousand million)
  • Why 1,024? Why not 1,000?
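
A rough sketch of the prefixes above, assuming the binary convention this slide uses: each step up is a factor of 1,024, which is 2 to the power 10, and that is why the number is not a round 1,000. The names KILO, MEGA, GIGA and bits_to_bytes are illustrative only, not from the talk.

```python
# Binary prefixes: each step up is a factor of 1,024 (2**10),
# because computers count in powers of two.
KILO = 1024          # roughly a thousand
MEGA = 1024 ** 2     # roughly a million
GIGA = 1024 ** 3     # roughly a thousand million


def bits_to_bytes(bits):
    """Convert a number of bits to bytes (8 bits in a byte)."""
    return bits / 8


if __name__ == "__main__":
    print(KILO, MEGA, GIGA)
    print(bits_to_bytes(8 * KILO))
```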

10
Visualise
  • 1 byte = 1 letter/number/punctuation
  • 1 Kilobyte = 1 page of text
  • 1 Megabyte = 1 Tom Clancy book
  • 1 Gigabyte = 20 bookshelves
  • (of Tom Clancy books)

11
Relative sizes
  • A floppy disk holds 1.44 megabytes
  • (a big Tom Clancy Book)
  • A modern hard disk holds 40 Gigabytes
  • (and here's one I made earlier)
  • A DAT tape holds 20 Gigabytes

12
Magic Numbers
  • Where do my numbers come from?
  • Email is roughly 50% headers, 50% content
  • Internet use doubles every year
  • Last year's Internet traffic data + content
  • is this year's traffic data.
  • Roughly
  • So I am using last year's HEAnet traffic
    data + content figures to model this year's HEAnet
    traffic data figures!
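
The reasoning on this slide can be sketched as follows. This assumes the two rules of thumb apply to all traffic (half of every byte count is headers, and total traffic doubles each year); the function name and constants are hypothetical.

```python
HEADER_SHARE = 0.5     # rule of thumb: half headers (traffic data), half content
GROWTH_PER_YEAR = 2    # rule of thumb: Internet use doubles every year


def traffic_data_next_year(total_this_year):
    """Estimate next year's traffic data (headers only) from this year's
    total traffic (traffic data plus content)."""
    total_next_year = total_this_year * GROWTH_PER_YEAR
    return total_next_year * HEADER_SHARE


if __name__ == "__main__":
    # With doubling and a 50/50 split, the two factors cancel out:
    # this year's total equals next year's traffic data.
    print(traffic_data_next_year(40.0))
```

The units do not matter here: whatever last year's total was, the estimate for this year's traffic data is the same number.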

13
The Players
  • Internet Service Providers (ISPs)
  • Provide Internet access to customers
  • Employ System Administrators

14
Internet Services
  • Electronic Mail (email)
  • Will be covered by Liam
  • WWW (World Wide Web)
  • Worldwide Bulletin Board
  • Now used for more than it was designed for
  • including email

15
Ports
  • Address gets you to the building
  • Which bell to ring?
  • IP address gets you to the computer
  • Not a bell, but a port
  • Service number
  • Email = port 25 (usually)
  • Web = port 80 (usually)
  • but not always
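
A minimal sketch of the "usually, but not always" point, assuming a logger that guesses the service from the port number alone. The addresses are documentation-only examples and guess_service is a hypothetical helper, not a real tool.

```python
# Conventional service ports named on the slide.
USUAL_PORTS = {25: "email", 80: "web"}


def guess_service(ip_address, port):
    """Guess the service from the port alone.

    The IP address gets you to the computer; only the port picks the
    service, so the address is not used in the guess.
    """
    return USUAL_PORTS.get(port, "unknown")


connections = [
    ("192.0.2.10", 25),
    ("192.0.2.20", 80),
    ("192.0.2.20", 8080),  # a web server on a non-standard port
]

if __name__ == "__main__":
    for ip, port in connections:
        print(ip, port, guess_service(ip, port))
```

The third connection is web traffic, but a port-based guess reports it as unknown.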

16
Table of Contents
  • Introduction
  • Internet Fundamentals
  • Network Fundamentals
  • Network Speeds
  • Storage Costs

17
Networking Equipment
  • Routers
  • Decide where your traffic should go
  • Made by Cisco, Juniper, Bay, 3Com
  • Come in all shapes and sizes
  • Speak their own languages (BGP, OSPF)

18
More pretty pictures
19
Networking Concepts
  • Routers link networks together
  • Copper (wires)
  • Phone lines/leased lines
  • Glass (Fiber-optic cables)
  • Wireless
  • Take time and money to configure properly
  • Broadband
  • The magic word

20
More Networking Equipment
  • Firewalls
  • The Bouncers of the Internet
  • Rules govern who's allowed in and out
  • Take time and money to configure correctly
  • Need regular maintenance

21
Transit vs. Termination
  • Some traffic crosses your network on the way to
    somewhere else (Transit)
  • Some traffic starts/finishes on your network
    (Termination)
  • What can/should you track?
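
One way to picture the distinction is the sketch below. It assumes "your network" can be described by a single address block, which is a simplification; MY_NETWORK and classify are hypothetical names and the addresses are documentation-only examples.

```python
import ipaddress

# Hypothetical address block standing in for "your network".
MY_NETWORK = ipaddress.ip_network("192.0.2.0/24")


def classify(src, dst):
    """Return 'termination' if either end is on MY_NETWORK, else 'transit'."""
    src_local = ipaddress.ip_address(src) in MY_NETWORK
    dst_local = ipaddress.ip_address(dst) in MY_NETWORK
    return "termination" if (src_local or dst_local) else "transit"


if __name__ == "__main__":
    print(classify("192.0.2.5", "198.51.100.7"))     # starts on your network
    print(classify("198.51.100.7", "203.0.113.9"))   # just passing through
```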

22
Table of Contents
  • Introduction
  • Internet Fundamentals
  • Network Fundamentals
  • Network Speeds
  • Storage Costs

23
Network Speeds 101
  • Normally measured in bits
  • We're converting to bytes so they match storage
  • LAN speeds
  • 10 Mbps = 1.25 Mbytes/second
  • 100 Mbps = 12.5 Mbytes/second
  • WAN speeds
  • from 10 Kbytes/second to 200 Gbytes/second!
  • In common use in Ireland: 300 Mbytes/second
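
The bits-to-bytes conversion used for the LAN figures is just a division by 8. A small sketch, with a hypothetical function name:

```python
def mbps_to_mbytes_per_sec(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


if __name__ == "__main__":
    for speed in (10, 100):
        print(speed, "Mbps =", mbps_to_mbytes_per_sec(speed), "Mbytes/second")
```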

24
Firewall Limitations
  • Current Firewalls have limits
  • They're fast because they do everything in
    dedicated hardware
  • Cisco Pix 535 is current top-of-the-line (50k)
  • Can only handle 125 Mbytes/sec Fudge
  • Very low-level rules
  • Can log all traffic using port 25
  • Logs email content as well as headers
  • Data Protection issues

25
Router limitations
  • Primary job is to route
  • Logging means more overhead
  • More overhead means less performance
  • Who bears this cost?
  • Same logging level as firewalls
  • ALL traffic to certain ports
  • Headers (Traffic Data) AND Content
  • Data Protection issues

26
HEAnet examples
  • 1 Gigabyte/s to Internet
  • 40 Megabytes/s at peak use
  • Doubles every year
  • Peaked at 100 Megabytes/s in last 6 months
  • 9 core routers
  • Connect 40 client institutions to each other (and
    the Internet)

27
Table of Contents
  • Introduction
  • Internet Fundamentals
  • Network Fundamentals
  • Network Speeds
  • Storage Costs

28
Storage Types
  • Disk
  • Fast
  • Delicate
  • Tape
  • Not so fast
  • Durable
  • Optical
  • CDs
  • Scratchable, as we all know

29
Disks
  • 40 GB minimum you can buy
  • Costs about 100
  • HEAnet would use one every 17 minutes at today's
    data rates
  • Can write 100 Mbytes/sec, which is fine
  • Can't be used for anything else
  • Did we say delicate already?

30
Tapes I
  • DAT/DDS
  • DDS holds 20 GB, writes 3 Mbytes/sec, 25 each
  • HEAnet: 13 tapes every 5 days, at a cost of 65 per day
  • HEAnet: 23k and 949 tapes/year!
  • DLT (Digital Linear Tape)
  • DLT holds 40 GB, writes 6 Mbytes/sec, 100 each
  • HEAnet: 7 DLTs every 5 days
  • HEAnet: 51k and 511 tapes/year!
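
The yearly totals follow from the five-day figures. A sketch of the arithmetic, taking the tape counts and unit prices from the slide as given (the currency is not stated in this transcript, so the cost is a bare number; the function name is hypothetical):

```python
def yearly_tapes_and_cost(tapes_per_5_days, price_each):
    """Scale a five-day tape count up to a year and price it."""
    periods_per_year = 365 // 5   # 73 five-day periods in a year
    tapes = tapes_per_5_days * periods_per_year
    return tapes, tapes * price_each


if __name__ == "__main__":
    print("DDS:", yearly_tapes_and_cost(13, 25))
    print("DLT:", yearly_tapes_and_cost(7, 100))
```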

31
But wait
  • Internet speeds double every year
  • Network speeds keep pace
  • but storage speeds don't!
  • Year 1: 51k and 511 tapes...
  • Year 2: 100k and 1,022 tapes (511)
  • Year 3: 200k and 2,044 tapes (1,533)
  • Year 4: 400k and 4,088 tapes (3,577)
  • Oh, and don't forget the extra hardware
  • What happens if a tape fails? A drive?
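
A sketch of the doubling projection, assuming the number of new tapes doubles exactly each year and reading the figure in brackets as the tapes already stored from earlier years (that reading is an assumption). The function name is hypothetical.

```python
def project_tapes(first_year_tapes, years):
    """Return (year, new_tapes, tapes_already_stored) for each year,
    doubling the number of new tapes every year."""
    rows = []
    stored = 0
    tapes = first_year_tapes
    for year in range(1, years + 1):
        rows.append((year, tapes, stored))
        stored += tapes   # this year's tapes join the archive
        tapes *= 2        # next year needs twice as many
    return rows


if __name__ == "__main__":
    for year, new, stored in project_tapes(511, 4):
        print(f"Year {year}: {new} new tapes, {stored} already stored")
```

With strict doubling from 511 tapes, year 4 comes to 4,088 new tapes on top of 3,577 already stored.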

32
Disk Arrays
  • Ah, but we can use HUGE disk boxes, no?
  • Sun 9980 disk array uses 6 cabinets
  • Standard config holds 20 TB (20,480 GB)
  • At HEAnet rates, good for just over 5 days
  • List price 2.3m
  • 24 hr hardware support is approx. 200k/year
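
How long the array lasts is sensitive to the data rate assumed. The sketch below uses binary units (1 Gigabyte = 1,024 Megabytes) and tries two rates: the 40 Megabytes/s peak figure from the HEAnet examples slide, and a slightly higher, purely illustrative 45 Megabytes/s. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_full(capacity_gbytes, mbytes_per_sec):
    """Days a store of the given size lasts at a constant write rate."""
    capacity_mbytes = capacity_gbytes * 1024
    return capacity_mbytes / mbytes_per_sec / SECONDS_PER_DAY


if __name__ == "__main__":
    for rate in (40, 45):
        print(rate, "Mbytes/s:", round(days_until_full(20480, rate), 1), "days")
```

At 40 Mbytes/s this gives about 6.1 days, and at 45 Mbytes/s about 5.4 days, so either way the array lasts less than a week.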

33
Why store to disk at all?
  • Can't use routers to analyse
  • Need dedicated machines to write data!
  • Need to sift through the data
  • Need to verify integrity of data
  • Raw logs will produce refined logs (which need to
    be stored)
  • Shall we calculate cost of analysis computers?

34
Hidden costs of storage
  • Need to keep data off-site for security reasons
  • What if building floods/catches fire/collapses?
  • Cost of off-site storage
  • Cost of analysis
  • Need machines and people to sift through old data
  • Need to maintain analysis infrastructure

35
Conclusion
  • More to Traffic Data Retention than most people
    realise
  • Costs
  • Router overhead
  • Storage and retrieval
  • Staff overhead

36
Email
  • and over to Liam!
  • (come back to me later)

37
Web Fundamentals
  • Web browser connects to Web Server
  • Grabs document, and associated images
  • Simple solution: log all web traffic
  • How do we define this? Port 80? 8080?
  • This will capture headers and contents
  • Data Protection issues

38
Problems
  • Secure HTTP
  • Log all you like, it's encrypted!
  • Proxies
  • All traffic appears to come from one address
  • You'd better hope it keeps logs
  • TCD: 21 days of logs, then deleted.

39
Webmail
  • Hotmail
  • Yahoo
  • Ireland.com
  • Email that looks like web traffic
  • Can do encryption
  • Emails come from a server in another legal
    jurisdiction

40
Summary
  • Internet traffic is harder to log than you think
  • There are legal, financial and technical barriers
  • A balance must be reached
  • Viva consultation!