Title: Paul Watson
1. An Introduction to Cloud-based Services
- Paul Watson
- Newcastle University, UK
- paul.watson_at_ncl.ac.uk
3. Plan
- What is Cloud Computing?
- Potential Advantages
- Lessons from our own experiences
- Cloud Issues
4. What is Cloud Computing?
- ".. a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities on a pay-as-you-go basis that previously required tremendous hardware/software investments and professional skills to acquire."
- Irving Wladawsky-Berger
5. What's New?
- Illusion of infinite computing resources on demand
- No up-front commitment by users
- Pay for use of resources on a short-term basis as needed
- (from "Above the Clouds: A Berkeley View of Cloud Computing")
6. Example: Amazon Web Services
- Based on Xen VMs
- Run any OS and software stack
- CPU: 1.0 GHz x86 instance @ 0.10/hour
- Blob Storage @ 0.12/GB/month
- External Data Transfer @ 0.10/GB
- Also queue, key store, block store, range of instances
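As a rough illustration of pay-as-you-go pricing, the sketch below estimates a bill from the three rates quoted on this slide. The helper name `estimate_bill` and the usage figures are hypothetical, and the code assumes all three rates are in the same (unstated) currency unit.

```python
# Rates as quoted on the slide; the currency unit is not stated there.
CPU_PER_HOUR = 0.10          # one 1.0 GHz x86 instance, per hour
STORAGE_PER_GB_MONTH = 0.12  # blob storage, per GB per month
TRANSFER_PER_GB = 0.10       # external data transfer, per GB


def estimate_bill(instance_hours, stored_gb, months, transferred_gb):
    """Return a simple pay-as-you-go estimate (hypothetical helper)."""
    compute = instance_hours * CPU_PER_HOUR
    storage = stored_gb * months * STORAGE_PER_GB_MONTH
    transfer = transferred_gb * TRANSFER_PER_GB
    return round(compute + storage + transfer, 2)


# Illustrative usage: 10 instances for 24 hours, 500 GB stored for
# one month, 50 GB transferred out. These figures are made up.
bill = estimate_bill(instance_hours=10 * 24, stored_gb=500, months=1,
                     transferred_gb=50)
print(bill)
```

With no up-front commitment, the bill scales linearly with use: halving the instance hours halves the compute part of the estimate.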
7. Why is this Important (I)? Internal IT Problems
(slide by permission of Arjuna Technologies)
- Silos
- Inflexibility
8. Why is this Important (II)? Time to put Ideas into Action
- Research
  - Have good idea
  - Write proposal
  - Wait 6 months
  - If successful..
  - Buy computers
  - Install computers
  - Start work
- Science Start-ups
  - Have good idea
  - Write business plan
  - Ask VCs to fund
  - If successful..
  - Buy computers
  - Install computers
  - Start work
9. Why is this a Good Idea? Using Commercial Clouds
- Have good idea
- Grab nodes as needed from Cloud provider
- Start Work
- Pay for what you used
10. Cloud Services Continuum (based on Robert Anderson)
http://et.cairene.net/2008/07/03/cloud-services-continuum/
- Software (SaaS): Google Docs, Salesforce.com
- Platform (PaaS): Google AppEngine, Microsoft Azure
- Infrastructure (IaaS): Amazon EC2, S3
(diagram axes: Flexibility, Complexity)
11. Example: Lessons from CARMEN Project
- Design began in 2006
- Commercial clouds not an option
- Designed own private cloud
- Experimenting with Commercial Cloud
12. CARMEN Project
- UK EPSRC e-Science Pilot
- 4M (2006-10)
- 20 Investigators
- Sites: Stirling, St. Andrews, Newcastle, York, Manchester, Sheffield, Leicester, Cambridge, Warwick, Imperial, Plymouth
13. Industry Associates
14. Research Challenge
- Understanding the brain is the greatest informatics challenge
- Enormous implications for science
  - Medicine
  - Biology
  - Computer Science
15. Collecting the Evidence
- 100,000 neuroscientists generate huge quantities of data
  - molecular (genomic/proteomic)
  - neurophysiological (time-series activity)
  - anatomical (spatial)
  - behavioural
16. Epilepsy Exemplar
- Data analysis guides surgeon during operation
- Further analysis provides evidence
WARNING! The next 2 slides show an exposed human brain
17. CARMEN
- enables sharing and collaborative exploitation of data, analysis code and expertise that are not physically collocated
18. CARMEN e-Science Requirements
- Store
  - very large quantities of data (100TB)
- Analyse
  - suite of neuroinformatics services
  - support data intensive analysis
- Automate
  - workflow
- Share
  - under user-control
19. Background: North East Regional e-Science Centre
- 25 Research Projects across many domains
  - Bioinformatics, Ageing and Health, Neuroscience, Chemical Engineering, Transport, Geomatics, Video Archives, Artistic Performance Analysis, Computer Performance Analysis, ...
- Same key needs
20. Result: e-Science Central
- Integrated Store-Analyse-Automate-Share infrastructure
- Generic
- CARMEN neuroinformatics and chemistry as pilots
21. e-Science Central
- Dynamic Resource Allocation
- Pay-as-you-Go
- Controlled Sharing
- Collaboration
- Communities
22. Science Cloud Architecture
- Access over Internet (typically via browser)
- Data storage and analysis
- Upload data and services
- Run analyses
23. Science Cloud Options
?
Users
24. (No Transcript)
25. Editing and Running a Workflow on the Web
26. Workflow
Result File
Viewing the output of Workflow Runs
27. Viewing results
28. Blogs and links
- Communicating results
- Linking to results and workflows
29. What we learnt: Moving into a Cloud
- Moving existing technologies into a cloud can be difficult
  - some can't run in a Cloud at all
30. Raw Data Exploration with Signal Data Explorer
31. What we learnt: Scalability
- Clouds offer the potential for scalability
  - grab compute power only when needed
- Developers have to manage scalability
  - for Infrastructure as a Service Clouds
  - scale up as well as down
32. Adaptive Dynamic Deployment with Dynasoar
- Commercial pay-as-you-go clouds would allow us to avoid this limit
- Adding processors as you need them optimises resources and saves money in pay-as-you-go clouds
- Ensure the system can also release unwanted nodes
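The idea of adding processors as needed and releasing unwanted nodes can be sketched as a small scaling rule. This is not Dynasoar's actual algorithm; the function names, the queue-length heuristic, and the `max_nodes` parameter (standing in for a fixed pool size, which a pay-as-you-go cloud would not impose) are all assumptions made for illustration.

```python
import math


def desired_nodes(queued_jobs, jobs_per_node, min_nodes=1, max_nodes=None):
    """Nodes needed for the current queue, clamped to the allowed range.

    max_nodes=None models a pay-as-you-go cloud with no fixed pool limit.
    """
    needed = math.ceil(queued_jobs / jobs_per_node)
    needed = max(needed, min_nodes)
    if max_nodes is not None:
        needed = min(needed, max_nodes)
    return needed


def scaling_action(current_nodes, queued_jobs, jobs_per_node,
                   min_nodes=1, max_nodes=None):
    """Return ("acquire", n), ("release", n) or ("hold", 0)."""
    target = desired_nodes(queued_jobs, jobs_per_node, min_nodes, max_nodes)
    if target > current_nodes:
        return ("acquire", target - current_nodes)
    if target < current_nodes:
        # Releasing idle nodes is what saves money under pay-as-you-go.
        return ("release", current_nodes - target)
    return ("hold", 0)
```

Under these assumptions, `scaling_action(2, 45, 10)` asks for three more nodes, while `scaling_action(5, 8, 10)` releases four once the queue has drained.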
33. Microsoft Azure Cloud for e-Science Demo
- Recent experiments with Microsoft Azure Cloud
  - running chemical analyses
  - Silverlight app
- Thanks to
  - Paul Appleby and Team at the Microsoft Technology Centre, Reading
  - MS External Research e-Science Group
34. (No Transcript)
35. Microsoft Azure Cloud Demo
36. When not to use Clouds?
- Large data transfers
  - Time and Cost
- High Performance
  - cpu/io/network bandwidth/low latency
- Predictable Performance
- Confidentiality
- High Availability?
- High Server Utilisation?
  - private clouds better?
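To give a feel for the time and cost of large data transfers, the sketch below combines two figures from earlier slides: the 100TB CARMEN storage requirement and the external data transfer rate of 0.10/GB. The sustained 1 Gbit/s link speed and the helper name `transfer_estimate` are assumptions, and the currency unit is left as on the pricing slide.

```python
def transfer_estimate(data_tb, link_mbps, price_per_gb):
    """Return (days, cost) to move data_tb terabytes over a link.

    Uses decimal units (1 TB = 1000 GB, 1 GB = 8000 megabits) and
    assumes the link sustains link_mbps for the whole transfer.
    """
    data_gb = data_tb * 1000
    data_megabits = data_gb * 8000
    seconds = data_megabits / link_mbps
    days = seconds / 86400
    cost = data_gb * price_per_gb
    return round(days, 1), round(cost, 2)


# 100 TB over an assumed 1 Gbit/s link at 0.10 per GB transferred.
days, cost = transfer_estimate(data_tb=100, link_mbps=1000, price_per_gb=0.10)
print(days, cost)
```

Under these assumptions the upload takes a little over nine days of continuous transfer, and real links rarely sustain their nominal rate, so the practical figure is likely to be higher.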
37. Create Private Cloud
(slides by permission of Arjuna Technologies)
38. Private Cloud Examples
- Eucalyptus
  - Amazon API
- Private Cloud deployments of Microsoft Azure
- Arjuna Agility
39. Federating Private and Public Clouds
(diagram labels: Public Cloud, e.g. Amazon, running App1; Service Agreement; Arjuna Agility; Internal Cloud with Dept A and Dept B, running App1 and App2)
40. (diagram labels: Public Cloud, e.g. Amazon, running App1; Public Cloud, e.g. FlexiScale, running App1; Arjuna Agility; Internal Cloud with Dept A and Dept B, running App1 and App2)
41. Summary
- Cloud computing can revolutionise e-science
  - provide sustainable infrastructure
  - reduce time from idea to realisation
- Don't underestimate complexity
  - building scalable distributed systems is still hard
  - can Science Clouds help by lowering the hurdles?
- e-Science Central
  - Store-Analyse-Automate-Share e-science platform
  - adding content from a range of domains
  - CARMEN is evaluating it for neuroinformatics