Title: O'Brien MIS, 6th ed.
1Module
3
The concept of data processing Major issues in
database management
2Learning Objectives
- Explain the importance of implementing data
resource management processes and technologies in
an organization. - Understand the advantages of a database
management approach to managing the data
resources of a business.
3Learning Objectives (continued)
- Explain how database management software helps
business professionals and supports the
operations and management of a business. - Illustrate each of the following concepts
- Major types of databases
- Data warehouses and data mining
- Logical data elements
- Fundamental database structures
- Database access methods
- Database development
4Section I
5Data Resource Management
- A managerial activity
- Applies information systems technology to
managing data resources to meet needs of business
stakeholders.
6Foundation Data Concepts
- Levels of data
- Character
- Single alphabetical, numeric, or other symbol
- Field
- Groupings of characters
- Represents an attribute of some entity
7Foundation Data Concepts (continued)
- Records
- Related fields of data
- Collection of attributes that describe an entity
- Fixed-length or variable-length
8Foundation Data Concepts (continued)
- Files (table)
- A group of related records
- Classified by
- Primary use
- Type of data
- permanence
9Foundation Data Concepts (continued)
- Database
- Integrated collection of logically related data
elements - Consolidates records into a common pool of data
elements - Data is independent of the application program
using them and type of storage device
10Foundation Data Concepts (continued)
11Types of Databases
- Operational
- Supports business processes and operations
- Also called subject-area databases, transaction
databases, and production databases
12Types of Databases (continued)
- Distributed
- Replicated and distributed copies or parts of
databases on network servers at a variety of
sites. - Done to improve database performance and security
13Types of Databases (continued)
- External
- Available for a fee from commercial sources or
with or without charge on the Internet or World
Wide Web - Hypermedia
- Hyperlinked pages of multimedia
14Data Warehouses and Data Mining
- Data warehouse
- Stores data extracted from operational, external,
or other databases of an organization - Central source of structured data
- May be subdivided into data marts
15Data Warehouses and Data Mining (continued)
- Data mining
- A major use of data warehouse databases
- Data is analyzed to reveal hidden correlations,
patterns, and trends
16Database Management Approach
- Consolidates data records and objects into
databases that can be accessed by many different
application programs
17Database Management Approach (continued)
- Database Management System
- Software interface between users and databases
- Controls creation, maintenance, and use of the
database
18Database Management Approach (continued)
19Database Management Approach (continued)
- Database Interrogation
- Query
- Supports ad hoc requests
- Tells the software how you want to organize the
data - SQL queries
- Graphical (GUI) natural queries
20Database Management Approach (continued)
- Report Generator
- Turns results of query into a useable report
- Database Maintenance
- Updating and correcting data
21Database Management Approach (continued)
- Application Development
- Data manipulation language
- Data entry screens, forms, reports, or web pages
22Implementing Data Resource Management
- Database Administration
- Develop and maintain the data dictionary
- Design and monitor performance of databases
- Enforce database use and security standards
23Implementing Data Resource Management (continued)
- Data Planning
- Corporate planning and analysis function
- Developing the overall data architecture
24Implementing Data Resource Management (continued)
- Data Administration
- Standardize collection, storage, and
dissemination of data to end users - Focused on supporting business processes and
strategic business objectives - May include developing policy and setting
standards
25Implementing Data Resource Management (continued)
- Challenges
- Technologically complex
- Vast amounts of data
- Vulnerability to fraud, errors, and failures
26Section II
- Technical Foundations of Database Management
27Database Structures
- Hierarchical
- Treelike
- One-to-many relationship
- Used for structured, routine types of transaction
processing
28Database Structures (continued)
- Network
- More complex
- Many-to-many relationship
- More flexible but doesnt support ad hoc requests
well
29Database Structures (continued)
- Relational
- Data elements stored in simple tables
- Can link data elements from various tables
- Very supportive of ad hoc requests but slower at
processing large amounts of data than
hierarchical or network models
30Database Structures (continued)
- Multi-Dimensional
- A variation of the relational model
- Cubes of data and cubes within cubes
- Popular for online analytical processing (OLAP)
applications
31Database Structures (continued)
32Database Structures (continued)
- Object-oriented
- Key technology of multimedia web-based
applications - Good for complex, high-volume applications
33Database Structures (continued)
34Accessing Databases
- Key fields (primary key)
- A field unique to each record so it can be
distinguished from all other records in a table
35Accessing Databases (continued)
- Sequential access
- Data is stored and accessed in a sequence
according to a key field - Good for periodic processing of a large volume of
data, but updating with new transactions can be
troublesome
36Accessing Databases (continued)
- Direct access
- Methods
- Key transformation
- Index
- Indexed sequential access
37Database Development
- Data dictionary
- Directory containing metadata (data about data)
- Structure
- Data elements
- Interrelationships
- Information regarding access and use
- Maintenance security issues
38Database Development (continued)
- Data Planning Database Design
- Planning Design Process
- Enterprise model
- Entity relationship diagrams (ERDs)
- Data modeling
- Develop logical framework for the physical design
39Discussion Questions
- How should an e-business enterprise store,
access, and distribute data information about
their internal operations external environment? - What roles do database management, data
administration, and data planning play in
managing data as a business resource?
40Discussion Questions (continued)
- What are the advantages of a database management
approach to organizing, accessing, and managing
an organizations data resources? - What is the role of a database management system
in an e-business information system?
41Discussion Questions (continued)
- Databases of information about a firms internal
operations were formerly the only databases that
were considered to be important to a business.
What other kinds of databases are important for a
business today? - What are the benefits and limitations of the
relational database model for business
applications?
42Discussion Questions (continued)
- Why is the object-oriented database model gaining
acceptance for developing applications and
managing the hypermedia databases at business
websites? - How have the Internet, intranets, extranets, and
the World Wide Web affected the types and uses of
data resources available to business end users?
43Real World Case 1 IBM versus Oracle
- What key business strategies did Janet Perna
implement to help IBM catch up to Oracle in the
database management software market? - What is the business case for both IBMs and
Oracles product strategy for their database
software?
44Real World Case 1 (continued)
- Which approach would you recommend to a company
seeking a database system today? - What do you see as the key factor to IBMs
success?
45Real World Case 1 (continued)
- The case states that database software has
become more of a commodity. Do you agree?
46Real World Case 2 Experian Automotive
- How do the database software tools discussed in
this case help companies exploit their data
resources? - What is the business value of the automotive
database created by Experian?
47Real World Case 2 (continued)
- What other business opportunities could you
recommend to Experian that would capitalize on
their automotive database? - The case states that Experians automotive
database has raised the hackles of privacy
advocates. What legitimate privacy concerns and
safeguard suggestions might be raised about this
database and its use?
48Real World Case 3 Shell Exploration
- Why do companies still have problems with the
quality of the data resources stored in their
business information systems? - What is a data silo?
49Real World Case 3 (continued)
- How do data warehouse approaches help companies
like Shell and OshKosh meet their data resource
management challenges? - What business benefits can companies derive from
a data warehouse approach?
50Real World Case 4 BlueCross BlueShield Warner
Bros.
- What is a storage area network? Why are so many
companies installing SANs? - What are the reasons for the quick payback on SAN
investments?
51Real World Case 4 (continued)
- What are the challenges and alternatives to SANs
as a data storage technology? - What are some advantages of SANs?
52Real World Case 5 Sherwin-Williams Krispy
Kreme
- Tips for Managing External Data
- Purchase external data from a reliable source
that will do most of the refining for you and
will work with you on contingency plans. - Run a test load first. A load of test data can
pave the way for accurate production loads.
53Real World Case 5 (continued)
- Managing external data (continued)
- Dont collect data until business and IT staff
have agreed on the amount, frequency, format, and
content of the data you need. - Dont acquire more data or use more data sources
than you really need.
54Real World Case 5 (continued)
- Managing external data (continued)
- Dont mingle external and homegrown data without
adding unique identifiers to each record, in case
you need to pull it out. - Dont overestimate the datas integrity. Nothing
beats direct customer contact and tactical
details behind the data.
55Real World Case 5 (continued)
- What challenges in acquiring and using data from
external sources are identified in this case? - Do you prefer the Sherwin-Williams or Krispy
Kreme approach to acquiring external data?
56Real World Case 5 (continued)
- What other sources of external data might a
business use to gain valuable marketing and
competitive intelligence?
57CS 317 - Data Management and Information
Processing
58What Is a Database System?
- Database
a very
large, integrated collection of data. - Models a real-world enterprise
- Entities (e.g., teams, games)
- Relationships
(e.g., The Forty-Niners are playing
in The Superbowl) - More recently, also includes active components ,
often called business logic. (e.g., the BCS
ranking system) - A Database Management System (DBMS) is a software
system designed to store, manage, and facilitate
access to databases.
59Database Systems Then
60Database Systems Today
From Friendster.com on-line tour
61Other Ways Databases Make Life Better?
- Players could finally
sign up for the Star
Wars Galaxies game
last week as Sony
opened up registration
to the public. - Once players got in to
the game they found
that the
game servers
were offline because of database
problems. - Some players spent hours tuning their in-game
characters only to find that crashes deleted all
their hard work. - Source BBC News Online, July 1, 2003.
62Other databases you may use
63 Is the WWW a DBMS?
- Fairly sophisticated search available
- crawler indexes pages on the web
- Keyword-based search for pages
- But, currently
- data is mostly unstructured and untyped
- search only
- cant modify the data
- cant get summaries, complex combinations of data
- few guarantees provided for freshness of data,
consistency across data items, fault tolerance, - Web sites typically have a DBMS in the background
to provide these functions. - The picture is changing
- New standards e.g., XML, Semantic Web can help
data modeling - Research groups (e.g., at Berkeley) are working
on providing some of this functionality across
multiple web sites.
64Search vs. Query
- What if you wanted to find out which actors
donated to John Kerrys presidential campaign? - Try actors donated to john kerry in your
favorite search engine.
65 A Database Query Approach
66Why Study Databases??
?
- Shift from computation to information
- always true for corporate computing
- Web made this point for personal computing
- more and more true for scientific computing
- Need for DBMS has exploded in the last years
- Corporate retail swipe/clickstreams, customer
relationship mgmt, supply chain mgmt, data
warehouses, etc. - Scientific digital libraries, Human Genome
project, NASA Mission to Planet Earth, physical
sensors, grid physics network - DBMS encompasses much of CS in a practical
discipline - OS, languages, theory, AI, multimedia, logic
- Yet traditional focus on real-world apps
67Whats the intellectual content?
- representing information
- data modeling
- languages and systems for querying data
- complex queries with real semantics
- over massive data sets
- concurrency control for data manipulation
- controlling concurrent access
- ensuring transactional semantics
- reliable data storage
- maintain data semantics even if you pull the plug
- semantics the meaning or relationship of
meanings of a sign or set of signs
68Describing Data Data Models
- A data model is a collection of concepts for
describing data. - A schema is a description of a particular
collection of data, using a given data model. - The relational model of data is the most widely
used model today. - Main concept relation, basically a table with
rows and columns. - Every relation has a schema, which describes the
columns, or fields.
69Levels of Abstraction
Users
- Views describe how users see the data.
-
- Conceptual schema defines logical structure
- Physical schema describes the files and indexes
used. - (sometimes called the ANSI/SPARC model)
70Example University Database
- Conceptual schema
- Students(sid string, name string,
login string, age integer, gpareal) - Courses(cid string, cnamestring,
creditsinteger) - Enrolled(sidstring, cidstring,
gradestring) - External Schema (View)
- Course_info(cidstring,enrollmentinteger)
- Physical schema
- Relations stored as unordered files.
- Index on first column of Students.
71Data Independence
- Applications insulated from how data is
structured and stored. - Logical data independence Protection from
changes in logical structure of data. - Physical data independence Protection from
changes in physical structure of data. - Q Why are these particularly important for DBMS?
72Queries, Query Plans, and Operators
SELECT eid, ename, title FROM Emp E WHERE E.sal gt
50K
SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY
E.loc HAVING Count() gt 5
SELECT COUNT DISTINCT (E.eid) FROM Emp E,
Proj P, Asgn A WHERE E.eid A.eid AND P.pid
A.pid AND E.loc ltgt P.loc
- System handles query plan generation
optimization ensures correct execution.
Employees Projects Assignments
- Issues view reconciliation, operator ordering,
physical operator choice, memory management,
access path (index) use,
73Concurrency Control
- Concurrent execution of user programs key to
good DBMS performance. - Disk accesses frequent, pretty slow
- Keep the CPU working on several programs
concurrently. - Interleaving actions of different programs
trouble! - e.g., account-transfer print statement at same
time - DBMS ensures such problems dont arise.
- Users/programmers can pretend they are using a
single-user system. (called Isolation) - Thank goodness! Dont have to program very,
very carefully.
74Transactions ACID Properties
- Key concept is a transaction a sequence of
database actions (reads/writes). - DBMS ensures atomicity (all-or-nothing property)
even if system crashes in the middle of a Xact. - Each transaction, executed completely, must take
the DB between consistent states or must not run
at all. - DBMS ensures that concurrent transactions appear
to run in isolation. - DBMS ensures durability of committed Xacts even
if system crashes. -
- Note can specify simple integrity constraints on
the data. The DBMS enforces these. - Beyond this, the DBMS does not understand the
semantics of the data. - Ensuring that a single transaction (run alone)
preserves consistency is largely the users
responsibility!
75Ensuring Transaction Properites
- DBMS ensures atomicity (all-or-nothing property)
even if system crashes in the middle of a Xact. - DBMS ensures durability of committed Xacts even
if system crashes. - Idea Keep a log (history) of all actions carried
out by the DBMS while executing a set of Xacts - Before a change is made to the database, the
corresponding log entry is forced to a safe
location. - After a crash, the effects of partially executed
transactions are undone using the log. Effects of
committed transactions are redone using the log. - trickier than it sounds!
76The Log
- The following actions are recorded in the log
- Ti writes an object the old value and the new
value. - Log record must go to disk before the changed
page! - Ti commits/aborts a log record indicating this
action. - Log is often duplexed and archived on stable
storage. - All log related activities (and in fact, all
concurrency control related activities such as
lock/unlock, dealing with deadlocks etc.) are
handled transparently by the DBMS.
77Structure of a DBMS
These layers must consider concurrency control
and recovery
- A typical DBMS has a layered architecture.
- The figure does not show the concurrency control
and recovery components. - Each database system has its own variations.
78Advantages of a DBMS
- Data independence
- Efficient data access
- Data integrity security
- Data administration
- Concurrent access, crash recovery
- Reduced application development time
- So why not use them always?
- Expensive/complicated to set up maintain
- This cost complexity must be offset by need
- General-purpose, not suited for special-purpose
tasks (e.g. text search!)
79Databases make these folks happy ...
- DBMS vendors, programmers
- Oracle, IBM, MS, Sybase,
- End users in many fields
- Business, education, science,
- DB application programmers
- Build enterprise applications on top of DBMSs
- Build web services that run off DBMSs
- Database administrators (DBAs)
- Design logical/physical schemas
- Handle security and authorization
- Data availability, crash recovery
- Database tuning as needs evolve
must understand how a DBMS works
80Summary (part 1)
- DBMS used to maintain, query large datasets.
- can manipulate data and exploit semantics
- Other benefits include
- recovery from system crashes,
- concurrent access,
- quick application development,
- data integrity and security.
- Levels of abstraction provide data independence
- Key when dapp/dt ltlt dplatform/dt
81Summary, cont.
- DBAs, DB developers the bedrock of the
informationeconomy
- DBMS RD represents a broad,
- fundamental branch of the science of
computation