Title: appscale: open-source platform-level cloud computing
1. appscale: open-source platform-level cloud computing
- Chandra Krintz
- Computer Science Dept.
- Univ. of California, Santa Barbara
- OpenFabrics Alliance, Sonoma Workshop
- March 15, 2010
2. cloud computing
- Software systems for accessing easily and transparently scalable CPU/storage/network resources via a network connection or web interface, as-a-service
- Culmination of grid/cluster/utility/elastic computing
- Advances in processor, virtualization, systems technology
- Remote access to distributed and shared cluster resources
- Has experienced a rapid uptake in the commercial sector
- Public clouds: your software/apps on others' systems
  - Users rent a small fraction of vast resource pools
  - Advertised service-level agreements (SLAs)
  - Resources are opaque and isolated
  - Offer high availability, fault tolerance, and extreme scale
3. cloud computing
- 3 types as-a-Service (aaS)
- Infrastructure: Amazon Web Services (EC2, S3, EBS)
  - Virtualized, isolated (CPU, Network, Storage) systems on which users execute entire runtime stacks
  - Fully customer self-service
  - Open APIs (IaaS standard), scalable services
- Platform: Google App Engine, Microsoft Azure
  - Scalable program-level abstractions via well-defined interfaces
  - Enable construction of network-accessible applications
  - Process-level (sandbox) isolation, complete software stack
- Software: Salesforce.com
  - Applications provided to thin clients over a network
  - Customizable
4. many public cloud features useful on-premise
- Ease of use of local cluster resources
  - Development, configuration, deployment
  - Broaden participation
  - Scale, performance and resource isolation
- Concerns
  - Vendor lock-in, pay-per-use, and availability reliance
  - Privacy of code and data
  - Resource constraints: storage, CPU/memory, communication
- Potential for hybrid and customized approaches
  - One size may not fit all
6. many public cloud features useful on-premise
- Ease of use of local cluster resources
  - Development, configuration, deployment
  - Broaden participation
- Difficult to test / try things out in public-only setting
  - Developers visualize the system / have an intellectual model
- How to support other application domains: scale/availability
  - HPC, data-intensive
- Clouds are opaque
  - Open APIs, closed implementations
  - What is your application/workload doing?
- If everyone does their own
  - many non-standard APIs!
8. cloud computing fabrics from UCSB
- Goal: Bring popular cloud fabrics to on-premise clusters that are easy to use and are transparent
  - By emulating key cloud layers from the commercial sector
  - Engender user community, access to real applications/users
  - Standard APIs (Amazon AWS, Google App Engine, MapReduce)
  - Leverage extant software technologies
- To facilitate investigation of
  - Next-generation distributed / cloud computing software
  - Services, underlying device technology, support technologies
  - Customization (availability, performance, application behavior)
  - Hybrid cloud solutions (public and on-premise)
- Not a replacement technology for any public cloud service
9. cloud computing fabrics from UCSB
- AppScale: platform-as-a-service (PaaS)
  - Open-source implementation of Google App Engine APIs
  - Pluggable (services), scalable, fault tolerant
  - Runs over virtualization or IaaS layer: AWS, Eucalyptus
  - appscale.cs.ucsb.edu
10. google app engine
[Figure: Google App Engine (GAE) hosting a GAE application (your code here) at MyApp.appspot.com. Labels: private, enterprise data, Google apps; SDC; Protobuf Data APIs; Administrator Console. Services: Users, Images, URL Fetch, Blob store, IM, Cron, Data Store, Tasks, Memcache, Mail.]
11. sandbox restrictions
[Figure: Google App Engine (GAE) hosting a GAE application (your code here) at MyApp.appspot.com.]
- Pure Python or Java only, white list of library calls to framework
- No thread/subprocess spawning, system calls
- No writes to file system, reads only to static files uploaded w/app
- Storage using key-value, schema-free datastore (Bigtable-based)
- HTTP/S communication only, CGI to handle page requests
- Limit on number of datastore elements accessed per request
- Limit on response duration, task frequency, request rate
- Enforced quotas (BW, CPU, requests/s, files, app size, ...)
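The per-request limits listed above can be pictured with a small sketch. This is only an illustration of the idea, not App Engine's implementation: the class name `RequestSandbox`, the exception `QuotaExceeded`, and the default limit values are all hypothetical.

```python
class QuotaExceeded(Exception):
    """Raised when a request goes past one of its sandbox limits."""


class RequestSandbox:
    """Tracks two per-request limits of the kind named on the slide.

    The default limits are made-up placeholders, not real GAE quotas.
    """

    def __init__(self, max_datastore_reads=1000, max_seconds=30.0):
        self.max_datastore_reads = max_datastore_reads
        self.max_seconds = max_seconds
        self.datastore_reads = 0
        self.elapsed = 0.0

    def record_datastore_read(self, count=1):
        # Count datastore elements touched by this request.
        self.datastore_reads += count
        if self.datastore_reads > self.max_datastore_reads:
            raise QuotaExceeded("datastore element limit exceeded")

    def record_time(self, seconds):
        # Accumulate time spent producing the response.
        self.elapsed += seconds
        if self.elapsed > self.max_seconds:
            raise QuotaExceeded("response duration limit exceeded")
```

A request handler would create one `RequestSandbox` per incoming request and call the `record_*` methods as it works; the request is aborted as soon as either limit is crossed.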
12. google app engine: upload to google
[Figure: Google App Engine (GAE) hosting a GAE application (Python, Java) at MyApp.appspot.com. Labels: private, enterprise data, Google apps; SDC; GAE app users via the Internet; Administrator Console.]
- Free w/ quotas
- Pay for additional scale: CPU, BW, emails, data
- BigTable
- Automatic scaling
- High availability
13. from gae to appscale
[Figure: AppScale hosting a GAE application (your code here).]
- Open-source Google App Engine Software Development Kit (SDK)
14. from gae to appscale
- GAE SDK extensions
  - Pluggable using open-source distributed database technologies: HBase, Hypertable, Cassandra, Voldemort, MongoDB, MemcacheDB, MySQL
  - MemcacheD library
  - From console or as background thread (automatically)
  - Interface to Hadoop (MapReduce)
  - Multi-language support: Python, Java, Ruby, Perl, soon X10
  - Translator to Linux Cron job, similar to Tasks
  - Pluggable built-in cloud-wide authentication via Rails, support for Eucalyptus and EC2 credentials
[Figure labels: Data store, Mem Cache, Tasks, Cron, Users.]
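As a rough illustration of the "translator to Linux Cron job" idea, the sketch below turns an App Engine style schedule string such as "every 5 minutes" into a crontab line that fetches the job's URL. It is not AppScale's translator: the function name `to_crontab_line`, the `base` address, and the use of `curl` are assumptions, and only the two simplest schedule forms are handled.

```python
import re

# Only "every N minutes" and "every N hours" are recognised in this sketch.
_SCHEDULE = re.compile(r"^every (\d+) (minutes|hours)$")


def to_crontab_line(schedule, url, base="http://localhost:8080"):
    """Translate a simple schedule string into one crontab entry.

    `base` is a hypothetical address for the app server handling the job URL.
    """
    match = _SCHEDULE.match(schedule.strip())
    if match is None:
        raise ValueError("unsupported schedule: %r" % schedule)
    interval, unit = int(match.group(1)), match.group(2)
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if unit == "minutes":
        fields = "*/%d * * * *" % interval   # minute step, every hour
    else:
        fields = "0 */%d * * *" % interval   # top of every Nth hour
    return "%s curl -s %s%s" % (fields, base, url)
```

Note that a cron step such as `*/N` restarts at each hour (or day) boundary, so the spacing is only perfectly even when N divides 60 (or 24).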
16. appscale's offerings
- Use of the Google APIs as a unifying interface
  - Datastore/database backends, services
  - Can grow this to include other standard APIs
- Single framework that executes extant GAE applications
  - Disparate API implementations: automatically deploy/compare
  - No restrictions on resource usage, library support
  - Multiple high-level language front-ends and components
  - Optimized library support in the back-end
- Open-source international user community
- Potential for integration with infrastructure level (IaaS)
  - Adaptive SLA re-negotiation
  - Growth/shrinkage of the cloud
  - Application or workload specific customization of the platform
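The "unifying interface over disparate backends" idea can be sketched as a small abstract datastore API with interchangeable implementations. This is an assumed shape for illustration only; `DatastoreBackend`, `InMemoryBackend`, `BACKENDS`, and `make_datastore` are hypothetical names, and a real backend would wrap a system such as HBase or Cassandra rather than a Python dict.

```python
from abc import ABC, abstractmethod


class DatastoreBackend(ABC):
    """Minimal key-value interface that every backend must implement."""

    @abstractmethod
    def put(self, key, entity):
        """Store a schema-free entity (a dict) under key."""

    @abstractmethod
    def get(self, key):
        """Return the entity stored under key, or None."""

    @abstractmethod
    def delete(self, key):
        """Remove key if present."""


class InMemoryBackend(DatastoreBackend):
    """Stand-in backend used here so the sketch runs without a database."""

    def __init__(self):
        self._rows = {}

    def put(self, key, entity):
        self._rows[key] = dict(entity)  # copy so callers cannot mutate storage

    def get(self, key):
        row = self._rows.get(key)
        return dict(row) if row is not None else None

    def delete(self, key):
        self._rows.pop(key, None)


# Registry mapping a configuration name to a backend class (hypothetical).
BACKENDS = {"memory": InMemoryBackend}


def make_datastore(name):
    """Instantiate the backend selected by name."""
    try:
        factory = BACKENDS[name]
    except KeyError:
        raise ValueError("unknown datastore backend: %r" % name)
    return factory()
```

Application code talks only to `DatastoreBackend`, so swapping or comparing backends becomes a matter of registering another class and changing the name passed to `make_datastore`.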
17. appscale
- Distributed system with four key components
  - AppLoadBalancer (ALB)
  - Database Master/Peer (DB M/P)
  - AppServer (AS)
  - Database Slave/Peer (DB S/P)
- Automatic deployment over Eucalyptus and EC2
- Released as a single Xen or KVM image
- Adaptive role configuration
[Figure: AppScale Cloud. Labels: GAE App Developer (AppScale Admin), App Controller, ALB, AS, DB M/P, DB S/P, compute tasks (e.g. MapReduce), HTTPS, GAE App Users.]
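To make the role of the AppLoadBalancer concrete, here is a minimal sketch of a balancer spreading requests over a set of AppServers. The slide does not say which policy the ALB uses; round-robin is assumed here purely for illustration, and the class and server names are hypothetical.

```python
import itertools


class AppLoadBalancer:
    """Hands each incoming request to the next AppServer in turn.

    Round-robin is an assumption of this sketch, not a documented policy.
    """

    def __init__(self, app_servers):
        if not app_servers:
            raise ValueError("at least one AppServer is required")
        self._servers = list(app_servers)
        self._cycle = itertools.cycle(self._servers)

    def route(self, request_path):
        # Pick the next server and pair it with the request to forward.
        server = next(self._cycle)
        return server, request_path
```

For example, with `AppLoadBalancer(["as-1", "as-2"])`, three successive calls to `route("/")` go to `as-1`, `as-2`, and `as-1` again.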
18. appscale team foci
- Full system monitoring (low-overhead HPM/OS profiling)
- Feedback driven adaptation, grow/shrink the cloud
- Online workload characterization, behavior pattern recognition
- Identify critical tasks, applications, components
- Shared-memory support for co-located components
- PaaS customization, PaaS-IaaS coordination, adaptation
[Figure: AppScale Cloud architecture, same diagram as on slide 17.]
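One simple way to picture "feedback driven adaptation, grow/shrink the cloud" is a threshold rule over monitored utilization. The sketch below is an assumption for illustration, not the team's method: the function name, the thresholds, and the node bounds are all hypothetical.

```python
def scaling_decision(cpu_samples, current_nodes,
                     high=0.80, low=0.20, min_nodes=1, max_nodes=16):
    """Return the node count to use next, given recent CPU utilization.

    cpu_samples holds utilization fractions in [0, 1] from the monitor.
    Thresholds and bounds are illustrative placeholders.
    """
    if not cpu_samples:
        return current_nodes  # no feedback yet, so change nothing
    average = sum(cpu_samples) / len(cpu_samples)
    if average > high and current_nodes < max_nodes:
        return current_nodes + 1  # grow the cloud by one node
    if average < low and current_nodes > min_nodes:
        return current_nodes - 1  # shrink the cloud by one node
    return current_nodes
```

Changing the size by one node per decision keeps the sketch conservative; a real controller would also need to damp oscillation between growing and shrinking.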
20. customized platform-as-a-service
- Specialized platform access to
  - standard APIs
  - Platforms for other app domains
[Figure: customized platform hosting a GAE, HPC, or data-intensive application at MyApp.your.url.gov. Labels: private, enterprise data, other apps, public cloud fabrics; SDC; Protobuf Data APIs; Administrator Console. Services: Auth, Messaging, MPI, Custom Libs, GPU, Stream support, Map Reduce, Data Store, Tasks, Vector support, Collaboration.]
21. appscale and RDMA
- Shared-memory support for co-located components
- PaaS-IaaS coordination, adaptation, scheduling
- Compute/data intensive workloads (HPC, messages, streams)
- Database system support for different workloads
- Programmatic interface and tools for application developers
- Improve ease of use, access
[Figure: AppScale Cloud architecture, same diagram as on slide 17.]
22. appscale: http://appscale.cs.ucsb.edu
- Thanks!
- Leads: Chris Bunch, Navraj Chohan
- Development and research team: Jovan Chohan, Nupur Garg, Matt Hubert, Jonathan Kupferman, Puneet Lakhina, Yiming Li, Nagy Mostafa, Yoshihide Nomura (Fujitsu), Michal Weigel
- Support
  - Google, IBM Research, National Science Foundation
  - OpenFabrics Alliance