Title: Managing Data Resources
1Managing Data Resources
2Learning Objectives
- Explain the problems of the traditional file
environment - Describe how a database management system
organises information - Identify the three principal database models and
some principles of database design - Discuss new database trends
3File Organisation Terms Concepts
- Field
- Individual piece of information
- e.g. student registration number
- Record
- A logically connected set of one or more fields
- e.g. An individual student
- File
- A collection of records which contain logically
related data - e.g. all students taking a course
- Database
- A group of related files
- the actual phrase database' was first used by
the USA military in 1963 during a symposium
titled Development and Management of a
Computer-centered Data Base
4Traditional File Organisation
- Good information systems require well organised
file organisation - Many organisations do not have this due to the ad
hoc way files have been developed over time. - Traditional file system sometimes called a
file-based system - A collection of application programs that perform
services for the end-users - Each program defines and manages its own data
- Uses flat files
- One-dimensional storage system in that it
presents its information from a single point of
view
5File-oriented information system
Source Brookshear, J.G., Computer Science An
Overview, p.399
6Limitations of a File-Based Approach
- Separation and isolation of data
- With data being held in separate files it is more
difficult to get the information you want - Need temporary files and special programs to
extract data - Duplication of data
- With each department maintaining its own data
there is bound to be duplication - This is a problem
- Wasteful of resources storing the same data in
more than one place - Can lead to loss of data integrity, i.e. the data
is not consistent and inaccurate in one or more
applications.
7Limitations of a File-Based Approach
- Data dependence (program-data dependence)
- The physical structure and storage of the data
files and records are defined in the code of each
application - Therefore if you change one file structure all
the programs that access that file need to be
modified to reflect the change - Incompatibility of files
- Just as the structure of files is embedded in the
application programs, the structures are
dependent on the application programming language - Makes processing files generated by different
programming languages difficult
8Limitations of a File-Based Approach
- Fixed queries/proliferation of programs
- File-based systems represented a big improvement
over manual systems so users wanted the increased
information they could provide - The reports to produce the information had to be
programmed - Gave rise to a menu of fixed reports
- Not possible to get ad hoc reports to meet a
specific need - With the creation of lots of files and associated
programs it became very difficult to maintain
them all - Needed something simpler to ease maintenance
problems
9Database Systems
- Tries to overcome the problems of a file-based
approach - Collection of data that is multidimensional
- Contains links between entries so that its
information is accessible from a variety of
perspectives - Definition of a database
- A shared collection of logically related data
(and a description of this data), designed to
meet the information needs of an organisation.
(Connolly et al) - collection of data organised to serve many
applications efficiently by centralising the data
and minimising redundant data. (Laudon and
Laudon) - Increased control over information for the
organisation - Database Administrator
- Responsible for the control of the information in
the database
10Database-oriented Information System
Source Brookshear, J.G., Computer Science An
Overview, p.399
11Database Systems
- Data integration can bring a number of
disadvantages - Access to sensitive data
- To provide for a distinction of access privileges
database systems often use schemas and
subschemas - Schema A description of the entire database
structure that is used by the database software
to maintain the database - Subschema A description of only that portion of
the database pertinent to a particular users
needs - Increased availability of information gives
potential for misinformation and misapplication
of information - Potential violation of rights to privacy
12Conceptual Layers of a Database
User
Application software
Database Management system
Actual database
Source Brookshear, J.G., Computer Science An
Overview, p.401
13Database Management Systems
- Definition
- A software system that enables users to define,
create, and maintain the database and provides
controlled access to this database. (Connolly et
al) - Software that permits an organisation to
centralise data, manage them efficiently, and
provide access to the stored data by application
programs. (Laudon and Laudon) - The DBMS actually manipulates the data in the
database - This has the following benefits
- Makes it easier to develop applications as dont
have to worry about data manipulation - Provides a means for controlling access to the
data - Achieves data independence , i.e. the
organisation of the database can be changed
without affecting the applications - Allows applications to be written in terms of a
simplified, conceptual view of the database
instead of the actual, complex structure
14Components of a DBMS
- Data Definition Language
- Allows users to define the database
- Specifies the data types and structures and the
constraints on the data to be stored in the
database - Used to define a schema or modify an existing one
- Cannot manipulate data
15Components of a DBMS
- Data Manipulation Language
- Allows users to insert, update, delete and
retrieve data from the database - Provides a general enquiry facility to the
database - Used in conjunction with a programming language
to manipulate the data in the database - Most prominent is Structured Query Language, SQL
- Data dictionary
- a collection of descriptions of the data objects
or items in a data modelling for the benefit of
programmers and others who might need to use
them. www.whatis.com - Contains metadata, i.e. data about data
- Sometimes called a system catalogue
- The DMBS consults the data dictionary before
accessing the actual data in the database.
16Database Terminology
- Entity
- A distinct object (a person, place or thing,
concept or event) in the organisation that is to
be represented in the database - Attribute
- A property that describes some aspect of the
object which you wish to record - Relationship
- An association between several entities
17Designing Databases
- Three principal logical database models for
keeping track of entities, attributes and
relationships - First Generation
- Hierarchical
- Roots in 1960s space exploration
- Network
- Sometimes referred to as the CODASYL Data Model
after the Conference on Data Systems Language
that was responsible for its development - First model to allow one to many relationships
- Second Generation
- Relational
- 1970 E.F.Codd from IBM Research Laboratory
18Relational Data Model
- By far the most popular model to-day
- Represents all data in the database as simple
two-dimensional tables called relations - Relations are similar to how you might store data
in a spreadsheet - A row in a relation is called a tuple
- Columns in a relation are called attributes
- These relations can easily be combined and
extracted to access data and produce reports,
provided that they share a common data element
19Creating a Relational Database
- Two parts
- Conceptual design
- Design of the relations making up the database
- Physical design
- How the database will be arranged on the storage
devices - Carried out by database specialists
20Conceptual Design
- Model of the database from a business perspective
- Document model with an Entity-Relationship
diagram - Complex groupings of data must be streamlined to
eliminate redundant data elements and
many-to-many relationships - This is called normalisation and there exists a
hierarchy of relation classes - First Normal Form
- Second Normal Form
- Third Normal Form
21Database Trends
- Distributed Databases
- Data held over more than one machine
- e.g Napster
- Object-Oriented Database
- Useful where application accessing the database
is also object oriented - Data Warehouse
- A means for organisation to turn their archives
of data into sources of knowledge for competitive
advantage