Title: Data and Databases
1Data and Databases
2Learning Objectives
- Why organizations need a database system
- Components of a database environment
- How to manage organizational data
- How to represent data requirements
- How to extract information from a database using
SQL - Who's responsible for data management in
organizations - Trends in DB tools and technologies
3Historical Context
- Systems development prior to the 1970s centered
on processes first, and then the data - Problems
- application/data dependence,
- data redundancy,
- data isolation,
- data security,
- Lack of flexibility
4(No Transcript)
5Historical Context, cont
- Centralized Data Management
- Emerged in the 1970s
- Became popular in the 1980s and 1990s
- Helps avoid most of the problems with
process-centered development - Provides integration between applications through
a common database
Inventory
GeneralLedger
Database
Marketing
Personnel
6Database Management Systems
- A database is an organized logical grouping of
related files - One set of software programs (DBMS) provides
access to all of the data - Data redundancy, data isolation, and data
inconsistency are minimized - Leads to application/data independence
7DBMS Benefits
- Improved strategic use of corporate data
- Reduced complexity of the organizations
information systems environment - Reduced data redundancy and inconsistency
- Enhanced data integrity
- Application-data independence
- Improved security
- Reduced application development and maintenance
costs - Improved flexibility of information systems
- Increased access and availability of data and
information
8DBMS Components
- Data dictionary
- Data model
- Data definition language (DDL)
- Language used to specify the physical structure
of the data - Defines the attributes of each record and field,
creates identifiers or keys - Provides means to associate related data
- Data manipulation language (DML)
9Data Dictionary
- Stores definitions of data elements and data
characteristics - Contains descriptive information
- Name of entity(s)
- Identifiers (primary and/or secondary keys)
- Attributes
- Type
- Length
- Optionality
- Defaults
- Form of metadata
- information about information
10Data Dictionary Example
11Data Models Logical vs. Physical
- Logical view of data is how the user sees the
data - Very user-oriented
- Very likely in a format like an ERD
- Physical view of data deals with the actual
physical implementation and location of data on
disk - Database administrators are responsible for these
tasks
12Understanding Data Requirements
- End user involvement is KEY
- Often documented with entity-relationship
diagrams (ER diagrams, or ERDs) - Capture more than just data requirements
captures business rules too
13ERDs Entities
- An entity is something that you wish to track
data about - Person, place, thing, or event
- Attributes are the properties that describe the
characteristics of an entity - E.g., due date, start date, title
- Each entity must have a unique identifier
- Often called the primary key
- E.g., Project ID
Project
14ERDs Relationships
- Entities are usually related to other entities
through relationships - Usually binary relationships, though ternary
relationships can and do exist in the business
world
15ERDs Relationships, cont
- Relationships have names
- User or analyst may name the relationship
- May aid in the understanding of the ERD
- Difficulty naming the relationship (like can
have) may indicate that the relationship isn't
fully understood
Is assigned to complete
Team
Task
Completed by
Each TEAM may be ASSIGNED TO COMPLETE one or more
tasksEach TASK may be COMPLETED BY one team
16ERDs Relationships, cont
- Relationships have cardinalities
- "How many instances of an entity class can
another entity be related to?" - 11
- 1M
- MM
17ERD Example
18ERDs Your Turn
- A local car dealership considers anyone who comes
on their lot to be a customer, whether or not
they actually purchase a car. If they do purchase
one or more cars we certainly want to keep track
of those purchases. In addition, we would like to
keep track of any service that the customer
receives on each car purchased.
19Are ERDs Useful?
20ERDs Things to Consider
- When developing a data model, you must consider
the scope of the system you are modeling - Are you modeling just the system (functional data
model), or the entire organization (enterprise
data model) - When developing a data model, you must understand
the business and its rules - So if you dont know, ASK!
21Evolution of Physical Data Models
22Relational Model
23Relational Model, cont
- Strengths of Relational Model
- Flexibility in ad-hoc reporting
- Power to combine info from different sources
- Simplicity of design
- Ability to add new data without disturbing
existing programs - Weakness
- Not as efficient as other two DBMS models
24Normalization
- Necessary step if using the relational database
model (most systems today do) - Helps eliminate redundant data elements,
many-to-many relationships, and simplifies data
retrieval and improves data integrity
25Steps in Normalization
Table with Repeatinggroups
Remove repeating groups
Firstnormalform
Remove partial dependencies
Secondnormalform
Remove transitive dependencies
Thirdnormalform
26Data Manipulation Language
- Language used to manipulate the data in the
database - Provides users with the ability to retrieve,
sort, display, and delete the contents of a
database - Most common DML for relational databases is the
query language SQL (also can serve as DDL)
27SQL Example
- SELECT first_name, last_name, gender, dept,
net_pay - FROM employee, pos, payroll WHERE
employee.empno pos.empno and employee.empno
payroll.empno and ((pos.end) is null) and
((payroll.month)"2004-01-31")
28Query by Example (QBE)
29Natural Language Example
- User Give me a list of the part numbers,
quantities, and suppliers of Houston parts - System You're requesting, among other things, a
listing of quantities of parts. Do you want - Quantities on hand
- Quantities on order
- Quantities shipped?
30Natural Language Example, cont
- User I want quantities on order
- System Fine! Now, with regard to Houston
parts, do you mean - Parts supplied by Houston suppliers
- Parts supplied to Houston projects
- Parts associated with Houston in some other way?
31Natural Language Example, cont
- User I mean item 2
- System Good! I need one more piece of
information. In the listing of part numbers,
quantity on order, and suppliers, how do you want
suppliers to be identified - By number
- By name
- Some other way?
32Natural Language Example, cont
- User by name and by city
- System O.K. This is what I understand your query
to be Find all combinations of part number,
quantity on order, supplier name, and supplier
city such that the supplier supplied the part to
a project located in Houston.
33Roles in Database Creation/Maintenance
- Data administrator
- Plans organizational data resource
- Information policy
- Data planning
- Maintenance of data dictionaries
- Data quality standards
- Database administrator
- Maintains physical database(s)/DBMS
- Data steward
- Business person with ownership of data resource
34Database Trends
- Centralization/Decentralization
- Geographical Information Database
- Multidimensional Data Analysis
- Data Warehousing
- Data Mining
- Databases and the Web
35Centralization/Decentralization
- A centralized database has all the related files
in one physical location - Personal
- Organizational
- Distributed database has complete copies of a
database (or portions of the database) in more
than one location - Replicated complete copies of the entire
database in multiple locations - Partitioned each location has a portion of the
entire database
36(No Transcript)
37Geographical Information Databases
http//www.esri.com/software/arcgis/concepts/gis-d
ata.html
38(No Transcript)
39Databases and the Web
- Linking databases to the web
- Web user connects to vendor database
- Special software in HTML converted to SQL
- SQL finds data, server converts result to HTML
for display in browser
40Databases and the Web
- Content Management Systems
- firms can maintain web content in databases
- ASP (Active Server Pages)
- code can be used to determine which content to
display - HTML or ASP or others!
- users can update their pages through simple
on-line editor rather than having to learn