1
Database Systems: Breaking Out of the Box
  • Avi Silberschatz, Bell Laboratories
  • Stan Zdonik, Brown University
  • July 7, 1997

2
The Paper's Theme (Strategic Directions)
  • Database research should be devoted to the
    problems of data management no matter where and
    in what form the data might be found.
  • Database management skills should be applied to
    new data management environments that potentially
    require radically new software architectures.

3
Outline
  • Introduction
  • Background
  • Our Skills
  • Scenarios
  • Barriers
  • Research
  • Conclusions
  • References

4
Introduction
  • The field of database systems research and
    development has been very successful over its 30
    year history.
  • It has led to a $10 billion industry that touches
    virtually every major company in the world.
  • It is unthinkable to manage the large volume of
    valuable information that keeps corporations running
    without support from commercial database
    management systems (DBMS).
  • A DBMS is a very complex system incorporating a
    rich set of technologies.
  • Suited for solving problems of large-scale data
    management in the corporate setting.

5
DBMS
  • DBMS Requirements
  • Execution Overhead.
  • High level of expertise to install and maintain.
  • Only manages data in fairly specific file
    formats.

6
Solution
  • At the same time
  • Data is changing rapidly.
  • Data is stored in different places (e.g. files)
  • Data is obtained in large volumes from external
    sources like sensors.
  • Solution
  • Not full-blown DBMS, a lighter-weight solution
  • Instead of using an existing tool in a new
    application, it is better to embed reusable
    components.
  • Use database system components, techniques and
    experience in new ways.

7
Examples
  • Some examples that could benefit from data
    management techniques but that typically do not
    make heavy use of database products
  • World Wide Web
  • Personal Information Systems (e-mail)
  • News Services
  • Scientific Applications

8
Background
  • Database field born with the release of IMS in the 1960s.
  • IBM product
  • Managed data as hierarchies
  • Data has value; manage it independently of the
    application
  • CODASYL, the best-known successor
  • Based on a graph structure.
  • Ted Codd published a paper in 1970
  • Suggested relational model.

9
Background
  • Object-oriented principles in the 1980s
  • Allow users to create their own
    application-specific types that can be managed by
    the DBMS.
  • Hybrid model in the 1990s
  • Embeds object-oriented features in a relational
    context.

10
Our Skills
  • Database Management Systems have been concerned
    with the following problems
  • High Performance
  • Correctness
  • Maintainability
  • Reliability
  • From the point of view of slow memory devices that
    must be shared by multiple concurrent users
  • This approach leads to a set of skills and
    techniques that can be applied and extended to
    other problems.

11
Skills and Techniques
  • Data Modeling
  • Language for defining structure of database
  • Language for manipulating those structures.
  • Query Languages
  • High-level language to retrieve data from the
    database. (SQL)
  • Query Optimization and evaluation
  • State-based views
  • Restricted and reorganized view of database.
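
As a rough illustration of the items above (data definition, data manipulation, a query language, and views), here is a minimal sketch using Python's built-in sqlite3 module. The employee table, its columns, and the rows are hypothetical and chosen only for the example; they do not come from the presentation.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")

# Data definition: a language for defining the structure of the database.
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")

# Data manipulation: a language for changing the contents of that structure.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "design", 90), ("Bob", "design", 70), ("Cy", "casting", 60)],
)

# View: a restricted and reorganized presentation of the stored data.
# It hides the salary column and exposes only the design department.
conn.execute(
    "CREATE VIEW design_staff AS "
    "SELECT name FROM employee WHERE dept = 'design'"
)

# Query: a declarative request; the system decides how to evaluate it.
design_names = [
    row[0]
    for row in conn.execute("SELECT name FROM design_staff ORDER BY name")
]
print(design_names)
```

The query names what is wanted, not how to find it; choosing an evaluation strategy is the job of the query optimizer.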

12
Skills and Techniques
  • Data Management
  • Automatic maintenance of data structures
  • Efficient Movement of data
  • Transactions
  • A response to correctness problems introduced by
    concurrent access and update
  • Distributed Systems
  • Scalable Systems
  • Database systems have been tuned to efficiently
    and reliably handle data volumes that exceed the
    size of the physical memory by several orders
    of magnitude.
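
The transaction idea can be sketched with Python's built-in sqlite3 module. This is a single-connection example under stated assumptions: the account table, the CHECK constraint, and the transfer helper are hypothetical. It shows only the all-or-nothing property of a transaction, not concurrent access itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit; return True if committed."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit made
        # earlier in the same transaction was rolled back as well.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer(conn, "a", "b", 30)
rejected = transfer(conn, "a", "b", 500)
print(ok, rejected, balances(conn))
```

The second transfer credits account b before the debit fails, yet no partial update survives, which is the kind of correctness guarantee the slide refers to.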

13
Scenarios
  • The way for future data management systems
  • The technology that would support these scenarios
    constitutes a research agenda for the next
    decade.
  • 1) Instant Virtual Enterprise
  • 2) Personal Information Systems

14
Instant Virtual Enterprise
  • An instant virtual enterprise (IVE) is a group
    of companies that do not routinely function as a
    unit.
  • They come together to respond to a customer order or
    request for proposal.
  • Computer integrated manufacturing (CIM) is an
    example of an environment requiring IVE
    cooperation.
  • Engineering side
  • Design, Production, Quality Assurance
  • Administrative side
  • Planning, Production Control, Resource Management

15
Instant Virtual Enterprise
  • Companies in an IVE need to exchange and manage
    large amounts of data
  • Companies will have many heterogeneous databases
  • Sharing and exchanging data with coordinating
    information is critical

16
IVE Scenario
  • Building an oil pipeline
  • Engineering Firm (IVE)
  • License their design
  • Engineering Analysis

17
IVE Scenario
  • Actual Fabrication
  • Casting
  • Design file conversion service
  • Documentation and Archiving

18
IVE Scenario
  • Database Capabilities Needed
  • Executing a query for the design
  • Data translation services for engineering
    analysis
  • Coordination and configuration management
  • Changes to an object in one subsystem require
    changes to one or more related objects in other
    subsystems.
  • Security and access control over the information
  • Archiving of information, even after the IVE
    disbands

19
Personal Information Systems Scenario
  • Provides information to an individual
  • Uses PID (Personal Information Device)
  • PDA
  • Handheld PC
  • Laptop
  • Equipped with wireless network connection
  • Access to the Internet anywhere, anytime.

20
Personal Information Systems Scenario
  • Tightly integrated with the individual's activities.
    From morning to bed time.
  • In the morning
  • Local Weather Report
  • List of Reminders
  • List of Morning Meetings
  • Best Route from home to work
  • Personalized Headlines
  • Personalized Investment Report

21
Personal Information Systems Scenario
  • Throughout the day
  • Tasks for the day
  • List of customers to contact
  • Summary of breaking news
  • Best Driving Routes in the city
  • At the end of the day
  • Next day's activities
  • Appointments

22
Personal Information Systems Scenario
  • The PID must continuously query remote databases and
    monitor broadcast information
  • The PID will magnify today's client-server
    performance, scalability and reliability problems
  • Where should data reside, PID or Server?

23
Barriers
  • DBMS provides a tightly controlled and highly
    uniform environment
  • For the new applications, database functionality
    should be provided outside of the limits of a
    DBMS.
  • For the vision represented in the scenarios, a
    number of technical barriers must be removed.

24
Barriers
  • Overhead
  • System requirements, expertise, planning,
    monetary cost
  • Builders of a personalized newspaper service do not
    use a DBMS because there is no need for many of the
    advanced features.
  • Only a subset of the traditional database services is
    needed by many new applications
  • Scale
  • Greater volume of data (petabytes)
  • Hundreds of servers, client population even larger

25
Barriers
  • Schema Organization
  • Conventionally, a schema is created first to describe
    the structure of the database, and then the database
    is populated
  • Many applications currently create data
    independently of a database system (scientific
    applications, web sites).
  • The schema may be incomplete or inconsistent.
  • Schema management facilities are needed to adapt
    to the dynamic nature of foreign data.
  • Data Quality
  • Information accessed from a WAN may be of varying
    quality.
  • Future information systems must be able to react
    to the quality of the data source.

26
Barriers
  • Heterogeneity
  • Data exists in many forms
  • These dissimilar formats must be integrated to
    allow applications to access data in a high-level
    and uniform way
  • Query Complexity
  • Different characteristics in future environments
  • Conventional: minimize the number of disk accesses
  • Future: minimize the total information bill
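
To make the contrast between the two cost measures concrete, here is a minimal sketch. The Plan fields, the prices, and the two candidate plans are all hypothetical assumptions made for illustration; the presentation does not define a cost formula for the information bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    disk_accesses: int       # conventional cost measure
    bytes_transferred: int   # data shipped over the network
    access_fee: float        # hypothetical per-query charge from the source


def disk_cost(plan):
    """Conventional objective: count disk accesses only."""
    return plan.disk_accesses


def information_bill(plan, price_per_megabyte=0.05):
    """Assumed future objective: source fee plus a per-megabyte transfer charge."""
    return plan.access_fee + price_per_megabyte * plan.bytes_transferred / 1_000_000


plans = [
    Plan("scan local cache", disk_accesses=500,
         bytes_transferred=0, access_fee=0.0),
    Plan("ask remote premium source", disk_accesses=5,
         bytes_transferred=20_000_000, access_fee=2.0),
]

cheapest_by_disk = min(plans, key=disk_cost)
cheapest_by_bill = min(plans, key=information_bill)
print(cheapest_by_disk.name, "|", cheapest_by_bill.name)
```

With these made-up numbers the two objectives pick different plans, which is why an optimizer built around disk accesses may need rethinking in the environments the scenarios describe.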

27
Barriers
  • Ease of Use
  • Highly-trained, full-time staff is assumed to
    manage a DBMS
  • Yet most users have no training in database tech.
  • Simple set of interfaces needed.
  • Security
  • As the amount of shared information grows, the
    need to restrict access to specific users or for
    specific use arises.

28
Barriers
  • Guaranteeing Acceptable Outcomes
  • Transaction management is a barrier to both system
    performance and the ability to specify acceptable
    outcomes
  • New or enhanced transaction technology is needed
  • Making data unavailable is not acceptable
  • Aborting transactions is unacceptable
  • Technology Transfer
  • Barrier between research and industry
  • Insufficient knowledge of each other

29
Research
  • In order to achieve the vision and overcome these
    barriers, a number of central research topics
    must be addressed
  • Extensibility and Componentization
  • Imprecise Results
  • Schemaless Databases
  • Ease of Use
  • New Transaction Models
  • Query Optimization
  • Data Movement
  • Security
  • Database Mining

30
Research
  • Extensibility and Componentization
  • DBMS in a modular way
  • Lighter-weight applications
  • Imprecise Results
  • Web search engines do not provide 100%
    accuracy
  • A general theory of imprecision must be developed
  • Schemaless Databases
  • Able to work with unstructured data
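
One possible reading of working with data that has no declared schema is sketched below. The get_path and select helpers and the sample documents are hypothetical; the presentation does not prescribe any particular design. Records that lack the queried field are skipped instead of causing an error.

```python
def get_path(record, path):
    """Follow a dotted path through nested dicts; return None if a step is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def select(records, path, predicate):
    """Return the records whose value at path exists and satisfies predicate."""
    result = []
    for record in records:
        value = get_path(record, path)
        if value is not None and predicate(value):
            result.append(record)
    return result


# Hypothetical records with no common structure.
documents = [
    {"title": "Pipeline design", "author": {"name": "Lee"}, "pages": 40},
    {"title": "Casting notes", "pages": 12},
    {"headline": "Weather report"},
]

long_docs = select(documents, "pages", lambda n: n > 20)
by_lee = select(documents, "author.name", lambda s: s == "Lee")
print([d["title"] for d in long_docs], len(by_lee))
```

The query is written against whatever structure each record happens to have, so no schema has to be created before the data is loaded.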

31
Research
  • Ease-of-use
  • Better database interfaces are required.
  • New Transaction Models
  • Overcome blocking.
  • Provide correctness.
  • Query Optimization
  • New indexing methods, query processing
    strategies.
  • Cheaper but slower response time.
  • Sensitive to bandwidth and power considerations.

32
Research
  • Data Movement
  • In a distributed environment, the cost of moving
    data can be extremely high
  • Asymmetric communication channels (low-bandwidth
    lines)
  • Security
  • Formulation of an authorization model
  • Interoperability between different security
    policies
  • Database Mining
  • Machine Learning
  • Statistical Analysis
  • Database Technologies

33
Conclusions
  • Database research must be broadly defined.
  • The database community must apply its experience and
    expertise to new areas, and new solution packages
    must be found.
  • The vision is an integration that supports the
    application of database functionality in small
    modules that give just the right capability.
  • These modules should also represent a unified
    theory of information that allows for
    querying information of all types without having
    to switch languages or paradigms.

34
References
  • E. F. Codd, A Relational Model of Data for Large
    Shared Data Banks, Communications of the ACM, 13(6),
    June 1970, pp. 377-387.
  • J. Gray, http://www.cs.washington.edu/homes/lazowska/cra/database.html
  • A. Silberschatz, M. Stonebraker, and J. Ullman,
    Database Systems: Achievements and
    Opportunities, SIGMOD Record, 19(4), pp. 6-22.
  • A. Silberschatz, M. Stonebraker, and J. Ullman,
    Database Systems: Achievements and Opportunities
    Into the 21st Century, http://www.cs.stanford.edu/pub/papers/lagii.ps
  • J. Toole and P. Young, http://www.hpcc.gov/cic/forum/CIC_Cover.html

35
Thanks!
  • Any Questions?