Title: FILES AND DATABASES
1FILES AND DATABASES
2A FILE is a collection of records with similar
characteristics, e.g A Sales Ledger Stock
Records A Price List Customer Records Files
may be manual, or computer.
3A RECORD is a collection of descriptive fields,
e.g. A Customer record Customer
Name Customer Number Customer Address Customer
Credit Limit
4A FIELD is a collection of characters,
e.g. Customer Name Gordon Brittas A KEY field
is a field by which a record can be uniquely
located, e.g. Customer Number
5A CHARACTER (or BYTE) may be alphabetic or
numeric and is made up of two nibbles A nibble
is made up of four bits (binary digits) A bit is
either 0, or 1
6A TRANSACTION file, not surprisingly, holds
transactions e.g. An addition - a new customer A
deletion - removing a customer An amendment -
increasing a customer's credit limit The
contents of these files are constantly
changing These files are mainly used to update
master files
7A MASTER file contains important data that is
relatively static e.g. The Customer File A
REFERENCE file contains reference
information that also seldom changes e.g. Price
List Department Codes A SPOOL file is an
intermediate store for data before it is written
out to a printer or a disc. A WORKFILE or
TEMPORARY file is used as working space during
operations such as sorting.
8- WHY DATABASES?
- More applications redundant data
- e.g. Master File for Sales
- Master File for Accounts
- Master File for Support
- all require Customer details
- 2. Add/Change applications change files
- need to change other applications accessing
those files - (fields may have moved)
- 3. More Management Information required
- data needed from more than one file
9A DATABASE is a collection of data supporting the
operations of an organisation. It is a means of
creating and maintaining data on a computer such
that it is useable for many purposes. In order
to be efficient, it must have the following
characteristics 1. Non-Redundancy 2. Data
Independence 3. Usable by all Programs 4. Support
all necessary linkages 5. Adopt a common approach
across data Details to follow ..
10- Non-Redundancy
- E.g. Customer name and address should appear
once only in the database -
- Possible problems if it appears more than once
are - a. The values of that data item may differ
- b. The waste of storage space
- c. The complexities of updating several
occurrences - However, for reasons of security or speed of
access, - some redundancy may be acceptable
112. Data Independence If you need to change the
program, you don't have to change the data AND If
you need to change the data, you don't have to
change the program
3. Program Usage The database must be usable by
all existing programs AND where possible, all
possible future programs
124. Support all necessary linkages One
application may require employee pension
contribution (one file) Another may require
employee tax payment (different file)
135. Common approach to data e.g. common length
for names common abbreviations Data
Dictionary
14Logical vs. Physical Data LOGICAL data is the
way datais seen by the program PHYSICAL data is
the way the datais stored on computer disc The
physical storage of the data has implications on
the speed of access and on the number of linkages
possible The piece of software that interfaces
between the logical requirements of the program
and the physical layout of the data is called
the DBMS (Database Management System)
15Types of Databases Hierarchical Network Relationa
l
16Hierarchical Top-down or tree arrangement Rapid
access of data PROVIDING it involves descending
the hierarchy If higher level details are not
known, it involves searching every lower level
record e.g. IMS (Information Management System)
from IBM
17Network Similar to tree structure but with
additional links, using pointers Gives faster
access, but occupies more space and requires a
more complex DBMS e.g. IDMS and TOTAL
18Relational Consists of a number of tables or
two-dimensional arrays with links between
entries e.g. dBaseIV, Oracle, Ingres, Access
The flexibility in use requires a large number of
'virtual' tables and hence a large amount of
storage space
19Advantages 1. Avoids duplication of data, i.e.
waste 2. Easier to keep it up-to-date and
consistent 3. Permits multiple usage of data 4.
Centralisation of data causes it to be regarded,
and treated, as a major resource 5.
Independent of programs 6. New applications
easier, programmer not responsible for file
organisations
20Disadvantages 1. All data held together,
therefore possible problems regarding
access and security 2. Only one set of data,
therefore accuracy and integrity essential 3.
Must be available continuously 4. Communications
considerations for distributed systems
21Choosing a database package - Questions to ask
before 1. How much data will be stored 2. How
will the data be accessed - processing
requirements 3. How complex need the database
be 4. Will users be computer literate 5. Will
people who set up the database be computer
literate 6. Does it have to be multi-user 7. Are
future modifications allowed 8. Security
adequate 9. Speed adequate 10. Report writing
facilities adequate 11. Supplier support adequate