1
Cloud Computing Skepticism
  • Abhishek Verma, Saurabh Nangia

2
Outline
  • Cloud computing hype
  • Cynicism
  • MapReduce vs Parallel DBMS
  • Cost of a cloud
  • Discussion

3
Recent Trends
Amazon S3 (March 2006)
Amazon EC2 (August 2006)
Salesforce AppExchange (March 2006)
Google App Engine (April 2008)
Microsoft Azure (Oct 2008)
Facebook Platform (May 2007)
4
Tremendous Buzz
5
Gartner Hype Cycle
From http://en.wikipedia.org/wiki/Hype_cycle
6
Blind men and an Elephant
7
  • Cloud computing is simply a buzzword used to
    repackage grid computing and utility computing,
    both of which have existed for decades.

whatis.com Definition of Cloud Computing
8
  • The interesting thing about cloud computing is
    that we've redefined cloud computing to include
    everything that we already do.
  • The computer industry is the only industry that
    is more fashion-driven than women's fashion.
  • Maybe I'm an idiot, but I have no idea what
    anyone is talking about. What is it? It's
    complete gibberish. It's insane. When is this
    idiocy going to stop?

Larry Ellison during Oracle's Analyst Day
From http://blogs.wsj.com/biztech/2008/09/25/larry
-ellisons-brilliant-anti-cloud-computing-rant/
9
From http://geekandpoke.typepad.com
10
Reliability
  • Many enterprises (necessarily or unnecessarily)
    set their SLA uptimes at 99.99% or higher, which
    cloud providers have not yet been prepared to
    match

Amazon's cloud outages receive a lot of exposure
July 20, 2008: Failure due to stranded zombies, lasts 5 hours
Feb 15, 2008: Authentication overload leads to two-hour service outage
October 2007: Service failure lasts two days
October 2006: Security breach where users could see other users' data
and their current SLAs don't match those of enterprises
Amazon EC2: 99.95%, Amazon S3: 99.9%
  • Not clear that all applications require such
    high service levels
  • IT shops do not always deliver on their SLAs,
    but their failures are less public and customers
    can't switch easily

SLAs expressed in Monthly Uptime Percentages
Source: McKinsey & Company
11
A Comparison of Approaches to Large-Scale Data
Analysis
  • Andrew Pavlo, Erik Paulson, Alexander Rasin,
    Daniel J. Abadi, David J. DeWitt, Samuel Madden,
    Michael Stonebraker
  • To appear in SIGMOD '09

Basic ideas from "MapReduce: a major step
backwards", D. DeWitt and M. Stonebraker
12
MapReduce: A major step backwards
  • A giant step backward
  • No schemas, Codasyl instead of Relational
  • A sub-optimal implementation
  • Uses brute force sequential search, instead of
    indexing
  • Materializes O(m·r) intermediate files
    (m map tasks, r reduce tasks)
  • Does not incorporate data skew
  • Not novel at all
  • Represents a specific implementation of well
    known techniques developed nearly 25 years ago
  • Missing most of the common current DBMS features
  • Bulk loader, indexing, updates, transactions,
    integrity constraints, referential integrity,
    views
  • Incompatible with DBMS tools
  • Report writers, business intelligence tools, data
    mining tools, replication tools, database design
    tools

13
Architectural Element: Parallel Databases / MapReduce
Schema Support: Structured / Unstructured
Indexing: B-trees or hash-based / None
Programming Model: Relational / Codasyl
Data Distribution: Projections before aggregation / Logic moved to data, but no optimizations
Execution Strategy: Push / Pull
Flexibility: No, but Ruby on Rails, LINQ / Yes
Fault Tolerance: Transactions have to be restarted in the event of a failure / Yes (replication, speculative execution)
14
MapReduce II
  • MapReduce didn't kill our dog, steal our car, or
    try and date our daughters. 
  • MapReduce is not a database system, so don't
    judge it as one
  • Both analyze and perform computations on huge
    datasets
  • MapReduce has excellent scalability; the proof
    is Google's use
  • Does it scale linearly?
  • No scientific evidence
  • MapReduce is cheap and databases are expensive
  • We are the old guard trying to defend our
    turf/legacy from the young turks
  • Propagation of ideas between sub-disciplines is
    very slow and sketchy
  • Very little information is passed from generation
    to generation

http://www.databasecolumn.com/2008/01/mapreduce-
continued.html
15
Tested Systems
  • Hadoop
  • 0.19 on Java 1.6, 256MB block size, JVM reuse
  • Rack-awareness enabled
  • DBMS-X (unnamed)
  • Parallel DBMS from a major relational db vendor
  • Row based, compression enabled
  • Vertica (co-founded by Stonebraker)
  • Column oriented
  • Hardware configuration: 100 nodes
  • 2.4 GHz Intel Core 2 Duo
  • 4GB RAM, 2 × 250GB SATA hard disks
  • GigE ports, 128Gbps switching fabric

16
Data Loading
  • Hadoop
  • Command line utility
  • DBMS-X
  • LOAD SQL command
  • Administrative command to re-organize data
  • Grep Dataset
  • Record: 10-byte key + 90-byte random value
  • 5.6 million records = 535MB/node
  • Another set: 1TB/cluster

17
Grep Task Results
SELECT * FROM Data WHERE field LIKE '%XYZ%'
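
The same grep task can be sketched as a map-only job in plain Python. This is an illustrative sketch, not the benchmark's actual code (which used Hadoop's Java API); the record layout follows the dataset description above (10-byte key, 90-byte value), and the brute-force scan mirrors the "no indexing" criticism.

```python
# A minimal sketch of the Grep task as map/reduce-style functions.
# Each record is a 10-byte key plus a 90-byte value; the map phase
# scans every record for the pattern, and no reduce phase is needed.

PATTERN = "XYZ"  # the three-character pattern from the benchmark

def map_grep(record):
    """Emit (key, value) if the value field contains the pattern."""
    key, value = record[:10], record[10:]
    if PATTERN in value:
        yield key, value

def run_grep(records):
    """Brute-force sequential scan over all records (no index)."""
    results = []
    for record in records:
        results.extend(map_grep(record))
    return results

data = ["k000000001" + "a" * 40 + "XYZ" + "b" * 47,
        "k000000002" + "c" * 90]
print(run_grep(data))  # only the first record matches
```

In the real benchmark this scan is parallelized across nodes by splitting the input file; the per-record logic is all that the programmer writes.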
18
Select Task Results
SELECT pageURL, pageRank FROM Rankings WHERE
pageRank > X
19
Join Task
SELECT INTO Temp sourceIP, AVG(pageRank) AS
avgPageRank, SUM(adRevenue) AS totalRevenue FROM
Rankings AS R, UserVisits AS UV WHERE R.pageURL =
UV.destURL AND UV.visitDate BETWEEN
Date('2000-01-15') AND Date('2000-01-22') GROUP
BY UV.sourceIP;
SELECT sourceIP, totalRevenue, avgPageRank
FROM Temp ORDER BY totalRevenue DESC LIMIT 1
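
What this query computes can be sketched in plain Python: filter UserVisits to the date window, join with Rankings on URL, aggregate per sourceIP, then return the top source by revenue. The sample data below is made up for illustration; the benchmark ran this over much larger generated tables.

```python
# Plain-Python sketch of the benchmark's join query (illustrative data).
from collections import defaultdict
from datetime import date

rankings = {"url1": 10, "url2": 50}           # pageURL -> pageRank
user_visits = [                               # (sourceIP, destURL, adRevenue, visitDate)
    ("1.2.3.4", "url1", 0.5, date(2000, 1, 16)),
    ("1.2.3.4", "url2", 1.5, date(2000, 1, 20)),
    ("5.6.7.8", "url1", 9.0, date(2000, 2, 1)),   # outside the date range
]

# Filter by visitDate, join on URL, and aggregate per sourceIP.
agg = defaultdict(lambda: {"revenue": 0.0, "ranks": []})
for ip, url, revenue, visited in user_visits:
    if date(2000, 1, 15) <= visited <= date(2000, 1, 22) and url in rankings:
        agg[ip]["revenue"] += revenue
        agg[ip]["ranks"].append(rankings[url])

# ORDER BY totalRevenue DESC LIMIT 1
top_ip = max(agg, key=lambda ip: agg[ip]["revenue"])
avg_rank = sum(agg[top_ip]["ranks"]) / len(agg[top_ip]["ranks"])
print(top_ip, agg[top_ip]["revenue"], avg_rank)  # 1.2.3.4 2.0 30.0
```

In MapReduce this becomes a multi-phase job (repartition join, then aggregation), which is part of why the paper found it harder to express and slower than the DBMS plans.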
20
Concluding Remarks
  • DBMS-X was 3.2 times and Vertica 2.3 times
    faster than Hadoop
  • Parallel DBMSs win because of
  • B-tree indices to speed the execution of
    selection operations,
  • novel storage mechanisms (e.g.,
    column-orientation),
  • aggressive compression techniques with the
    ability to operate directly on compressed data,
  • sophisticated parallel algorithms for querying
    large amounts of relational data.
  • Ease of installation and use
  • Fault tolerance?
  • Loading data?

21
The Cost of a Cloud: Research Problems in Data
Center Networks
  • Albert Greenberg, James Hamilton, David A. Maltz,
    Parveen Patel
  • MSR Redmond

Presented by Saurabh Nangia
22
Overview
  • Cost of cloud service
  • Improving low utilization
  • Network agility
  • Incentive for resource consumption
  • Geo-distributed network of DC

23
Cost of a Cloud?
  • Where does the cost go in todays cloud service
    data centers?

24
Cost of a Cloud
Amortized Costs (one-time purchases amortized
over reasonable lifetimes, assuming 5% cost of
money)
[Pie chart: Servers 45%, Infrastructure 25%, Power 15%, Network 15%]
25
Are Clouds any different?
  • Can existing solutions for the enterprise data
    center work for cloud service data centers?

26
Enterprise DC vs Cloud DC (1)
  • In enterprise
  • Leading cost: operational staff
  • Automation is partial
  • IT staff : Servers = 1:100
  • In cloud
  • Staff costs under 5%
  • Automation is mandatory
  • IT staff : Servers = 1:1000

27
Enterprise DC vs Cloud DC (2)
  • Large economies of scale
  • Cloud DC leverage economies of scale
  • But up front costs are high
  • Scale Out
  • Enterprise DCs scale up
  • Cloud DCs scale out

28
Types of Cloud Service DC (1)
  • Mega data centers
  • Tens of thousands (or more) servers
  • Drawing tens of Mega-Watts of power (at peak)
  • Massive data analysis applications
  • Huge RAM, Massive CPU cycles, Disk I/O operations
  • Advantages
  • Cloud services applications build on one another
  • Eases system design
  • Lowers cost of communication needs

29
Types of Cloud Service DC (2)
  • Micro data centers
  • Thousands of servers
  • Drawing power peaking in 100s of Kilo-Watts
  • Highly interactive applications
  • Query/response, office productivity
  • Advantages
  • Used as nodes in content distribution network
  • Minimize speed-of-light latency
  • Minimize network transit costs to user

30
Cost Breakdown
31
Server Cost (1)
  • Example
  • 50,000 servers
  • $3,000 per server
  • 5% cost of money
  • 3-year amortization
  • Amortized cost = 50,000 × $3,000 × 1.05 / 3
  • $52.5 million per year!!
  • Utilization remarkably low, ~10%
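
The slide's arithmetic checks out; reproducing it directly:

```python
# Amortized server cost, exactly as set up on the slide:
# (number of servers) x (price per server) x (1 + cost of money) / (years)
servers = 50_000
price_per_server = 3_000        # dollars
cost_of_money = 0.05            # 5%
amortization_years = 3

annual_cost = servers * price_per_server * (1 + cost_of_money) / amortization_years
print(f"${annual_cost / 1e6:.1f}M per year")  # $52.5M per year
```

At ~10% utilization, roughly $47M of that annual spend is buying idle capacity, which is the motivation for the agility argument that follows.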

32
Server Cost (2)
  • Uneven Application Fit
  • Uncertainty in demand forecasts
  • Long provisioning time scales
  • Risk Management
  • Hoarding
  • Virtualization short-falls

33
Reducing Server Cost
  • Solution: Agility
  • to dynamically grow and shrink resources to meet
    demand, and
  • to draw those resources from the most optimal
    location.
  • Barrier: Network
  • Increases fragmentation of resources
  • Therefore, low server utilization

34
Infrastructure Cost
  • Infrastructure is the overhead of a Cloud DC
  • Facilities dedicated to
  • Consistent power delivery
  • Evacuating heat
  • Large-scale generators, transformers, UPS
  • Amortized cost: $18.4 million per year!!
  • Infra cost: $200M
  • 5% cost of money
  • 15-year amortization

35
Reducing Infrastructure Cost
  • Reason for high cost: the requirement for
    delivering consistent power
  • Relaxing the requirement implies scaling out
  • Deploy larger numbers of smaller data centers
  • Resilience at the data center level
  • Layers of redundancy within a data center can be
    stripped out (no UPS, no generators)
  • Geo-diverse deployment of micro data centers

36
Power
  • Power Usage Efficiency (PUE)
  • (Total Facility Power)/(IT Equipment Power)
  • Typically, PUE ≈ 1.7
  • Inefficient facilities: PUE of 2.0 to 3.0
  • Leading facilities: PUE of 1.2
  • Amortized cost: $9.3 million per year!!
  • PUE = 1.7
  • $0.07 per kWh
  • 50,000 servers, each drawing 180W on average
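
A back-of-the-envelope check of that figure from the slide's own inputs (the result lands on the slide's ~$9.3M within rounding):

```python
# Annual power cost: total facility power = IT power x PUE;
# cost = energy drawn over a year x price per kWh.
servers = 50_000
avg_watts_per_server = 180
pue = 1.7
price_per_kwh = 0.07            # dollars
hours_per_year = 365 * 24

it_kw = servers * avg_watts_per_server / 1000   # 9,000 kW of IT load
facility_kw = it_kw * pue                        # 15,300 kW at the meter
annual_cost = facility_kw * hours_per_year * price_per_kwh
print(f"${annual_cost / 1e6:.1f}M per year")     # $9.4M per year
```

Note how the PUE multiplier alone adds ~$3.9M/year over a hypothetical PUE of 1.0, which is why facility efficiency gets so much attention.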

37
Reducing Power Costs
  • Decreasing power cost -> decreased need for
    infrastructure cost
  • Goal: Energy proportionality
  • A server running at N% load consumes N% power
  • Hardware innovation
  • High-efficiency power supplies
  • Voltage regulation modules
  • Reduce amount of cooling for the data center
  • Equipment failure rates increase with temperature
  • Make the network more mesh-like & resilient
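
Energy proportionality can be made concrete with a small sketch. The 60% idle-power fraction below is an illustrative assumption (not a figure from the slides); the point is the gap at the ~10% utilization reported earlier.

```python
# Energy proportionality: an ideal server at N% load draws N% of peak
# power, while a typical server draws substantial power even when idle.
def ideal_power(load, peak_watts=180):
    """Perfectly energy-proportional server."""
    return peak_watts * load

def typical_power(load, peak_watts=180, idle_fraction=0.6):
    """Real servers draw idle power even at zero load.
    idle_fraction is an illustrative assumption."""
    idle = peak_watts * idle_fraction
    return idle + (peak_watts - idle) * load

for load in (0.0, 0.1, 0.5, 1.0):
    print(f"load {load:4.0%}: ideal {ideal_power(load):5.1f}W, "
          f"typical {typical_power(load):5.1f}W")
```

At 10% load the ideal server draws 18W while the typical one draws over 115W, so low utilization is doubly expensive: idle capital plus near-peak power.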

38
Network
  • Capital cost of networking gear
  • Switches, routers and load balancers
  • Wide-area networking
  • Peering: traffic handed off to ISPs for end users
  • Inter-data center links b/w geo-distributed DCs
  • Regional facilities (backhaul, metro-area
    connectivity, co-location space) to reach
    interconnection sites
  • Back-of-the-envelope calculations are difficult

39
Reducing Network Costs
  • Sensitive to site selection & industry dynamics
  • Solution
  • Clever design of peering & transit strategies
  • Optimal placement of micro & mega DCs
  • Better design of services (partitioning state)
  • Better data partitioning & replication

40
Perspective
  • On is better than off
  • Servers should be engaged in revenue production
  • Challenge: Agility
  • Build in resilience at the systems level
  • Stripping out layers of redundancy inside each
    DC, and instead using other DCs to mask DC failures
  • Challenge: Systems, software & network research

41
Cost of Large Scale DC
http://perspectives.mvdirona.com/2008/11/28/CostO
fPowerInLargeScaleDataCenters.aspx
42
Solutions!
43
Improving DC efficiency
  • Increasing network agility
  • Appropriate incentives to shape resource
    consumption
  • Joint optimization of network & DC resources
  • New mechanisms for geo-distributing state

44
Agility
  • Any server can be dynamically assigned to any
    service anywhere in the DC
  • Conventional DCs
  • Fragment network & server capacity
  • Limit dynamic growth and shrinkage of server
    pools

45
Networking in Current DC
  • DC networks carry two types of traffic
  • Between external end systems & internal servers
  • Between internal servers
  • Load Balancer
  • Virtual IP address (VIP)
  • Direct IP address (DIP)

46
Conventional Network Architecture
47
Problems (1)
  • Static Network Assignment
  • Individual applications mapped to specific
    physical switches & routers
  • Adv: performance & security isolation
  • Disadv: works against agility
  • Policy-overloaded (traffic, security,
    performance)
  • VLAN spanning concentrates traffic on links high
    in the tree

48
Problems (2)
  • Load Balancing Techniques
  • Destination NAT
  • All DIPs in a VIP's pool must be in the same
    layer-2 domain
  • Under-utilization & fragmentation
  • Source NAT
  • Servers spread across layer-2 domains
  • But the server never sees the client IP
  • Client IP required for data mining & response
    customization

49
Problems (3)
  • Poor server-to-server connectivity
  • Connections b/w servers in different layer-2
    domains must go through layer 3
  • Links oversubscribed
  • Capacity of links b/w access routers & border
    routers < output capacity of servers connected to
    an access router
  • Ensure no saturation in any of the network links!

50
Problems (4)
  • Proprietary hardware scales up, not out
  • Load balancers used in pairs
  • Replaced when load becomes too much

51
DC Networking Design Objectives
  • Location-independent Addressing
  • Decouple a server's location in the DC from its
    address
  • Uniform Bandwidth & Latency
  • Servers can be distributed arbitrarily in the DC
    without fear of running into bandwidth choke
    points
  • Security & Performance Isolation
  • One service should not affect another's
    performance
  • DoS attacks

52
Incenting Desirable Behavior (1)
  • Yield management
  • to sell the right resources to the right customer
    at the right time for the right price
  • Trough filling
  • Cost determined by height of peaks, not area
  • Bin packing opportunities
  • Leasing committed capacity with fixed minimum
    cost
  • Prices varying with resource availability
  • Differentiate demands by urgency of execution

53
Incenting Desirable Behavior (2)
  • Server allocation
  • Large unfragmented servers -> agility
  • Fewer requests for servers
  • Eliminating hoarding of servers
  • Cost for having a server
  • Seasonal peaks
  • Internal auctions may be fairest
  • But, how to design them!

54
Geo-Distribution
  • Speed & latency matter
  • Google: 20% revenue loss for a 500ms delay!!
  • Amazon: 1% sales decrease per 100ms delay!!
  • Challenges
  • Where to place data centers
  • How big to make them
  • Using them as a source of redundancy to improve
    availability

55
Optimal Placement & Sizing (1)
  • Importance of Geographical Diversity
  • Decreasing latency b/w user and DC
  • Redundancy (earthquakes, riots, outages, etc.)
  • Size of data center
  • Mega DC
  • Extracting maximum benefit from economies of
    scale
  • Local factors like tax, power concessions, etc.
  • Micro DC
  • Enough servers to provide statistical
    multiplexing gains
  • Given a fixed budget, place close to each
    desired population

56
Optimal Placement & Sizing (2)
  • Network cost
  • Performance vs cost
  • Latency vs Internet peering & dedicated lines
    between data centers
  • Optimization should also consider
  • Dependencies of services offered
  • Email -> buddy-list maintenance, authentication,
    etc.
  • Front end: micro data centers (low latency)
  • Back end: mega data centers (greater resources)

57
Geo-Distributing State (1)
  • Turning geo-diversity into geo-redundancy
  • Distribute critical state across sites
  • Facebook
  • Single master data center, replicating data
  • Yahoo! Mail
  • Partitions data across DCs based on user
  • Different solutions for different data
  • Buddy status: replicated, weak consistency
    assurance
  • Email: mailbox by user ids, strong consistency

58
Geo-Distributing State (2)
  • Tradeoffs
  • Load distribution vs service performance
  • e.g. Facebook's single master coordinates
    replication
  • Speeds up lookups but loads the master
  • Communication cost vs service performance
  • Data replication -> more inter-data-center
    communication
  • Longer latency
  • Higher cost per message over inter-DC links

59
Summary
  • Data center costs
  • Server, Infrastructure, Power, Networking
  • Improving efficiency
  • Network Agility
  • Resource Consumption Shaping
  • Geo-diversifying DC

60
Opinions
61
  • Richard Stallman, GNU founder
  • Cloud Computing is a trap
  • "... cloud computing was simply a trap aimed at
    forcing more people to buy into locked,
    proprietary systems that would cost them more and
    more over time."
  • "It's stupidity. It's worse than stupidity: it's
    a marketing hype campaign"

62
  • Open Cloud Manifesto
  • a document put together by IBM, Cisco, AT&T, Sun
    Microsystems and over 50 others to promote
    interoperability
  • "Cloud providers must not use their market
    position to lock customers into their particular
    platforms and limit their choice of providers"
  • Failed? Google, Amazon, Salesforce and Microsoft,
    four very big players in the area, are notably
    absent from the list of supporters

63
  • Larry Ellison, Oracle founder
  • "fashion-driven" and "complete gibberish"
  • "What is it? What is it? ... Is it - 'Oh, I am
    going to access data on a server on the
    Internet.' That is cloud computing?"
  • "Then there is a definition: What is cloud
    computing? It is using a computer that is out
    there. That is one of the definitions: 'That is
    out there.' These people who are writing this
    crap are out there. They are insane. I mean it is
    the stupidest."

64
  • Sam Johnston, Strategic Consultant Specializing
    in Cloud Computing
  • Oracle would be out badmouthing cloud computing
    as it has the potential to disrupt their entire
    business.
  • "Who needs a database server when you can buy
    cloud storage like electricity and let someone
    else worry about the details? Not me, that's for
    sure - unless I happen to be one of a dozen or so
    big providers who are probably using open source
    tech anyway."

65
  • Marc Benioff, head of salesforce.com
  • "Cloud computing isn't just candyfloss thinking;
    it's the future. If it isn't, I don't know what
    is. We're in it. You're going to see this model
    dominate our industry."
  • Is data really safe in the cloud? "All complex
    systems have planned and unplanned downtime. The
    reality is we are able to provide higher levels
    of reliability and availability than most
    companies could provide on their own," says
    Benioff

66
  • John Chambers, Cisco Systems CEO
  • "a security nightmare"
  • cloud computing was inevitable, but it would
    shake up the way that networks are secured

67
  • James Hamilton, VP, Amazon Web Services
  • "any company not fully understanding cloud
    computing economics and not having cloud
    computing as a tool to deploy where it makes
    sense is giving up a very valuable competitive
    edge"
  • "No matter how large the IT group, if I led the
    team, I would be experimenting with cloud
    computing and deploying where it makes sense"

68
To Cloud or Not to Cloud?
69
References
  • Clearing the air on cloud computing,
    McKinsey & Company
  • http://geekandpoke.typepad.com/
  • Clearing the Air - Adobe Air, Google Gears and
    Microsoft Mesh, Farhad Javidi
  • http://en.wikipedia.org/wiki/Hype_cycle
  • A Comparison of Approaches to Large-Scale Data
    Analysis, Pavlo et al.
  • MapReduce - a major step backwards, D. DeWitt and
    M. Stonebraker