Based on slides developed by

1
ICS 214B: Transaction Processing and Distributed
Data Management
Lecture 17: Providing Database as a Service
Professor Chen Li
  • Based on slides developed by
  • Hakan Hacigumus, Bala Iyer, and Sharad Mehrotra
  • ICDE 2002, San Jose, CA, USA

2
Talk Outline
  • Software as a Service
  • Database as a Service
  • NetDB2 System
  • Challenges for Database as a Service
  • User Interface Issues
  • Performance Issues
  • Data Privacy Issues
  • Data Encryption in DBMSs for Data Privacy
  • Conclusion

3
Software as a Service
  • Get what you need, when you need it
  • Pay for what you use
  • Don't worry about how to deploy, implement,
    maintain, or upgrade

4
Software as a Service
  • Driving forces to paradigm shift
  • Faster, cheaper, more accessible networks
  • Rise of distributed architectures
  • Virtualization in server and storage technologies
  • Established e-business infrastructures
  • Hardware/software is not the largest component of
    total cost of ownership (source: Gartner Group)
  • User operations: 46%
  • Technical support: 24%
  • Capital cost (HW/SW): 21%
  • Hardware, software, and network costs have been
    decreasing more sharply than personnel costs

5
Software as a Service
  • Already in the market as
  • storage services, disaster recovery services,
    e-mail services, rent-a-spreadsheet services etc.
  • Sun ONE, Oracle Online Services, Microsoft .NET
    My Services etc.
  • Why not Database as a Service?

6
Database as a Service - Why?
  • Organizations need data management
  • DBMSs are complex systems to deploy, set up, and
    maintain
  • They require highly skilled people (DBAs, etc.)
    at high cost

7
Database as a Service - Offerings
  • Inherits all advantages of software as a service,
    plus
  • Service provider offers mechanisms to
  • create, store, and access databases
  • DB management is transferred to the service
    provider for
  • backup, administration, restoration, space
    management, upgrades
  • Clients use the service provider's HW, SW, and
    personnel instead of their own

8
NetDB2 - Database Service Provision
  • Developed as a collaboration between the
    University of California, Irvine and IBM
  • Deployed on the Internet over a year ago
  • Used by 15 universities and more than 2500
    students to help teach database classes
  • Currently offered through IBM Scholars Program

9
NetDB2 System Architecture
  • Three-tier architecture
  • Client - as thin as possible - just a browser
  • Java-based implementation
  • Backed by fail-over solutions
  • Allows expansion and user-driven integration for
    application development

10
Database as a Service - Issues
  • Issues to address
  • User Interface
  • Performance
  • Data Privacy

11
User Interface
  • Simple yet powerful
  • supports SQL queries, scripts, UDFs, stored
    procedures, metadata, data upload
  • Consistent
  • Region-based composition
  • Expansion/Integration
  • User defined interfaces

12
Performance
  • Interaction happens over a different medium: the
    network
  • Performance should at least match what we
    already have
  • Experimented with TPC-H database and queries

13
Data Privacy
  • Users give control of their data to service
    provider
  • Attacks on stored data are a well-known problem
  • So, users need data security in place
  • Security of data over the network is well studied
  • SSL, TLS
  • Establish security for stored data
  • Even if the data is stolen, it should not make
    sense to the thief: encryption!

14
Encryption Alternatives
  • Implementation Level
  • Software vs. hardware encryption
  • Granularity of Data
  • Field (Attribute) level
  • Row (Record) level
  • (Disk) Page level

15
Encryption Alternatives (2)
  • Field level encryption
  • Pros
  • Easier to implement and integrate
  • Flexible
  • Allows selective encryption, reduces number of
    bytes to encrypt/decrypt
  • Cons
  • Increases encryption overhead significantly due
    to invocation cost
  • Data size expansion (for block cipher algorithms)
  • Current optimization technologies do not handle
    foreign functions well
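
The data size expansion point can be made concrete with a small sketch that computes ciphertext sizes when each value is padded to a block boundary. It assumes an 8-byte block (Blowfish's block size) and PKCS#7-style padding; the padding scheme, the `padded_size` helper, and the example field widths are assumptions for illustration, not details from the slides.

```python
BLOCK_SIZE = 8  # Blowfish operates on 64-bit (8-byte) blocks


def padded_size(n_bytes: int, block: int = BLOCK_SIZE) -> int:
    """Ciphertext size under PKCS#7-style padding (assumed scheme).

    Padding always adds between 1 and `block` bytes, so the result is
    the next multiple of `block` strictly greater than `n_bytes`.
    """
    return (n_bytes // block + 1) * block


# Hypothetical row made of ten 4-byte fields.
field_widths = [4] * 10

# Field level: every field is padded on its own.
per_field_total = sum(padded_size(w) for w in field_widths)

# Coarser unit: the same 40 plaintext bytes are padded only once.
whole_unit_total = padded_size(sum(field_widths))

print(per_field_total, whole_unit_total)
```

Under these assumptions, the per-field total is larger because every short field is rounded up separately, whereas a coarser unit pays the padding only once.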

16
Encryption Alternatives (3)
  • Row level encryption
  • Pros
  • Reduces the data size expansion problem
  • Reduces invocation cost
  • Better security because of total encryption
  • Cons
  • Does not allow selective encryption, increases
    the number of bytes to encrypt/decrypt
  • Implementation and integration can be hard when
    row functions are not supported

17
Encryption Alternatives (4)
  • Page level encryption
  • Pros
  • Significantly reduces encryption/decryption
    overhead due to reduced invocation cost
  • Eliminates data size expansion problem (for block
    ciphers)
  • Better security because of total encryption
  • Cons
  • Implementation and integration are not
    straightforward
  • Increases the number of bytes to encrypt/decrypt
    each time
  • Higher update/delete cost, requires re-encryption
    of all affected pages

18
Encryption Alternatives: Experiments
  • Experimented with TPC-H database and queries
  • Encryption scheme alternatives (✓ evaluated,
    ✗ not evaluated)

Implementation        Field Level   Row Level   Page Level
Software Encryption   ✓             ✗           ✗
Hardware Encryption   ✗             ✓           ✓

19
Software - Field Level Encryption
  • Block cipher algorithm: Blowfish
  • Implemented as a foreign function (UDF)
  • Sample insert
  • insert into lineitem (discount) values
    (encrypt(10, key))
  • Sample select
  • select decrypt(discount, key) from lineitem where
    custid = 300
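
The UDF approach above can be tried out with a minimal sketch using Python's built-in `sqlite3` module, which lets a client register `encrypt` and `decrypt` as SQL functions. This is an illustration under stated assumptions, not the NetDB2 implementation: the cipher is a toy keystream XOR standing in for Blowfish (it is not secure), SQLite stands in for DB2, and the `custid` column and the key value are hypothetical.

```python
import hashlib
import sqlite3


def _keystream(key: bytes, n: int) -> bytes:
    """Derive n bytes from the key (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def encrypt(value, key):
    """Toy stand-in for the Blowfish UDF: XOR the value's text with a keystream."""
    data = str(value).encode("utf-8")
    ks = _keystream(str(key).encode("utf-8"), len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def decrypt(blob, key):
    """Inverse of encrypt(); returns the original integer."""
    ks = _keystream(str(key).encode("utf-8"), len(blob))
    return int(bytes(a ^ b for a, b in zip(blob, ks)).decode("utf-8"))


conn = sqlite3.connect(":memory:")
conn.create_function("encrypt", 2, encrypt)  # register the UDFs with the engine
conn.create_function("decrypt", 2, decrypt)
conn.execute("create table lineitem (custid integer, discount blob)")

key = "secret"  # hypothetical key, supplied by the client
conn.execute(
    "insert into lineitem (custid, discount) values (?, encrypt(?, ?))",
    (300, 10, key),
)

# The table holds only ciphertext; decryption happens inside the query.
stored = conn.execute("select discount from lineitem").fetchone()[0]
plain = conn.execute(
    "select decrypt(discount, ?) from lineitem where custid = ?",
    (key, 300),
).fetchone()[0]
print(stored, plain)
```

Because the key is passed in with each statement, the stored column by itself is not enough to recover the discount value.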

20
Software - Field Level Encryption (2)
  • The creator supplies the key
  • An unauthorized person cannot get hold of the key
  • Protection even from the service provider at some
    level
  • Users can easily implement a different encryption
    algorithm and check it into the system
  • Different encryption algorithms/keys can be used
    for different fields

21
Software - Field Level Encryption (3)
  • TPC-H queries, except Q1
  • Only one field (l_discount of lineitem table)
    encrypted
  • Introduced very large overhead
  • Q1 excluded

22
TPC-H Query 1
  • Problem: multiple decryptions of the same field
  • select
      l_returnflag, l_linestatus,
      sum(l_quantity) as sum_qty,
      sum(l_extendedprice) as sum_base_price,
      sum(l_extendedprice * (1 - l_discount)) as
        sum_disc_price,
      sum(l_extendedprice * (1 - l_discount) * (1 +
        l_tax)) as sum_charge,
      avg(l_quantity) as avg_qty,
      avg(l_extendedprice) as avg_price,
      avg(l_discount) as avg_disc,
      count(*) as count_order
    from tpcd.lineitem
    where l_shipdate <= date ('1998-12-01') - 90 day
    group by l_returnflag, l_linestatus
    order by l_returnflag, l_linestatus

23
Query Rewrite to Improve Performance
  • Problem: multiple decryptions of the same field
    (e.g., TPC-H Q1)
  • CSE-based (common subexpression elimination)
    algorithm to eliminate redundant decryptions
  • Use a temporary view
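
A rough sketch of the rewrite idea, again using Python's `sqlite3` module in place of DB2: a counting `decrypt` UDF shows how many times decryption is invoked when the same encrypted column appears several times in a query, compared with decrypting it once up front. Everything here is an assumption for illustration: the additive "cipher" is a placeholder, the table is a cut-down `lineitem`, and the sketch materializes the decrypted column into a temporary table rather than a view, since an engine may expand a view inline and re-evaluate the function.

```python
import sqlite3

calls = 0


def decrypt(value, key):
    """Placeholder UDF: counts invocations and undoes a trivial additive mask."""
    global calls
    calls += 1
    return value - key


KEY = 7  # hypothetical key
rows = [("A", 1000, 5), ("A", 2000, 10), ("B", 1500, 0)]  # flag, price, discount (%)

conn = sqlite3.connect(":memory:")
conn.create_function("decrypt", 2, decrypt)
conn.execute(
    "create table lineitem "
    "(l_returnflag text, l_extendedprice integer, l_discount integer)"
)
conn.executemany(
    "insert into lineitem values (?, ?, ?)",
    [(flag, price, disc + KEY) for flag, price, disc in rows],  # store masked discount
)

# Naive form: decrypt() appears twice, so it can run twice per row.
naive_result = conn.execute(
    "select l_returnflag, "
    "       sum(l_extendedprice * (100 - decrypt(l_discount, :k))), "
    "       sum(decrypt(l_discount, :k)) "
    "from lineitem group by l_returnflag order by l_returnflag",
    {"k": KEY},
).fetchall()
naive_calls = calls

# Rewritten form: decrypt each row once, then aggregate over the plain values.
calls = 0
conn.execute(
    "create temp table lineitem_plain "
    "(l_returnflag text, l_extendedprice integer, discount integer)"
)
conn.execute(
    "insert into lineitem_plain "
    "select l_returnflag, l_extendedprice, decrypt(l_discount, ?) from lineitem",
    (KEY,),
)
rewritten_result = conn.execute(
    "select l_returnflag, "
    "       sum(l_extendedprice * (100 - discount)), "
    "       sum(discount) "
    "from lineitem_plain group by l_returnflag order by l_returnflag"
).fetchall()
rewritten_calls = calls

print(naive_calls, rewritten_calls, naive_result == rewritten_result)
```

Both forms return the same aggregates; the rewritten one calls `decrypt` once per row regardless of how many expressions use the decrypted value.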

24
Hardware - Row Level Encryption
  • Specialized hardware: IBM S/390 Cryptographic
    Coprocessor under IBM OS/390
  • Uses the editproc facility
  • editproc is invoked for the whole row
  • On each read/write request, encryption/decryption
    is performed in hardware for the row

25
SW Field Level vs. HW Row Level
  • Experimented on TPC-H Q1
  • Software field level: only one field is encrypted
  • Hardware row level: all fields are encrypted

26
Hardware - Page Level Encryption
  • Page level encryption is simulated
  • It gives significant improvement due to reduction
    in start-up cost
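
One way to see why start-up cost matters is a back-of-the-envelope cost model: each encryption call pays a fixed start-up cost plus a cost proportional to the bytes processed. The constants, row and page sizes, and the `total_cost` helper below are hypothetical values chosen for illustration; they are not measurements from the experiments.

```python
import math

STARTUP_COST = 50.0  # hypothetical fixed cost per encryption call (arbitrary units)
PER_BYTE_COST = 0.1  # hypothetical cost per byte processed


def total_cost(n_calls: int, bytes_per_call: int) -> float:
    """Fixed start-up cost per call plus a cost proportional to bytes processed."""
    return n_calls * (STARTUP_COST + PER_BYTE_COST * bytes_per_call)


ROW_BYTES = 128    # assumed row size
PAGE_BYTES = 4096  # assumed page size
N_ROWS = 10_000    # rows touched by a full scan

rows_per_page = PAGE_BYTES // ROW_BYTES
n_pages = math.ceil(N_ROWS / rows_per_page)

row_level = total_cost(N_ROWS, ROW_BYTES)     # one call per row
page_level = total_cost(n_pages, PAGE_BYTES)  # one call per page
print(n_pages, row_level, page_level)
```

With these made-up numbers, page-level processing touches slightly more bytes but makes far fewer calls, so the fixed start-up cost is amortized over many rows.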

27
Conclusion
  • Database as a Service is a new model that
    alleviates the need to
  • hire professionals
  • purchase expensive hardware/software
  • deal with administrative and maintenance tasks
  • It is a viable model and can emerge as a
    successful offering
  • Encryption is a solution for privacy, the most
    important issue
  • Hardware encryption is clearly superior to
    software encryption
  • Hardware makes encryption practical for databases
  • There are trade-offs for granularity of data