Information Retrieval and Use - PowerPoint PPT Presentation

About This Presentation
Title:

Information Retrieval and Use

Description:

Title: Systems Analysis & Design with Databases Author: cmtsh Last modified by: Leese,Geoff Created Date: 12/4/2000 12:38:55 PM Document presentation format – PowerPoint PPT presentation

Number of Views:32
Avg rating:3.0/5.0
Slides: 29
Provided by: cmt73
Category:

less

Transcript and Presenter's Notes

Title: Information Retrieval and Use


1
Information Retrieval and Use
  • Data Analysis Data Modeling, Relational Data
    Analysis and Logical Data Modeling
  • Geoff Leese September 2009

2
Relational Data Analysis
  • Captures the detailed knowledge of the meaning of
    the data.
  • Ensures that the data is logically easy to
    maintain and extend.
  • Data inter-dependencies have been identified
  • Ambiguities have been resolved.
  • Eliminate unnecessary duplication of data.
  • Forms the data into optimum groups.
  • Validates the Logical Data Model (LDM).

3
Logical Data Modelling
  • Basic Rules for converting 3NF to a LDM
  • Create an entity type for each data relation
  • Mark qualifying foreign keys
  • Check compound key relations
  • Make foreign/primary key relations

4
Guidelines for logical modelling
  • Entity type names are singular nouns,
    descriptive, concise and organisation specific.
  • Attribute names are unique descriptive nouns of
    standard format.
  • Relationship names are descriptive, precise verb
    phrases.

5
Simple Master-Detail relationships
  • Where a single foreign key of a relation
    corresponds to the primary key of another
    relation
  • See next slide for example.

6
Simple Master-Detail relationships
Shows SINGLE primary key at MASTER entity
(Organisation) connected to SINGLE foreign key at
DETAIL entity (Contact people)
7
Multiple level Master-Detail Relationships
  • Example five entities

8
Identifying Recursive (Unary) Relationships
  • Is a relation where a foreign key references the
    same relation.
  • Example Employee
  • Employee-number
  • Employee-name
  • Employee-manager-number

Employee
9
Relationships Student/Module
  • At this point we need to identify the data items
    that describe or identify each entity
  • Entity attributes are also known as data items
  • What are the data items associated with the
    following LDS diagram?

10
The Student
Entity Type Attribute Name Attribute
Student Student Name Jones Street
Address Leek Road Town Stoke-on-Trent Po
st Code ST4 2DE Telephone 294303
11
The Module
  • Entity Type Attribute Type Attribute
  • Module Module Number CM5111-1
  • Module Name SSAT
  • Module Leader A Lecturer
  • Level 1
  • Cats Points 10

12
The Data Items
13
Identifying occurrences of entities
  • Each occurrence of an entity must be uniquely
    identified in some way
  • Imagine the British Gas data base that used only
    surnames to identify account holders
  • There would be 100,000 account holders called
    Jones in this country
  • Even if we used the given names there would still
    be considerable duplication
  • It would be impossible to find the right account
    by name alone

14
Adding a Primary Key
15
Relationships Getting it right
Is this right?
The real situation is surely
16
Putting it right Intersection entity
We need a link entity - less ambiguity
17
Normalisation - steps
  • Start with a set of un-normalised tables
  • Entity/attribute list
  • Step 1 - remove ambiguity and repeating data
  • Step 2 - remove shared data

18
Normalisation - step 1
  • Break down ALL attributes into smallest
    meaningful parts
  • EG student name becomes student surname, student
    firstname, student title
  • Remove REPEATED information to form a new table
  • EG a course may be composed of MANY modules (but
    assume that each module is only on one course!) -
    so form a MODULE table

19
Normalisation - step 2
  • Remove SHARED data to form new tables
  • EG modules may share tutors - so form a TUTORS
    table.

20
Normalisation
  • FIRST NORMAL FORM - a relation (table) is in 1NF
    if it contains atomic values and all repeating
    groups have been removed

21
Normalisation
  • SECOND NORMAL FORM - a relation(table) is in 2NF
    if it is in 1NF and every non-key attribute is
    fully dependent on the primary key

22
Normalisation
  • THIRD NORMAL FORM - a relation(table) is in 3NF
    if it is in 2NF and every non-key attribute is
    not dependent on any other non-key attribute

23
Relational Data Analysis Form
  • Validates the LDM against the relations.
  • Consists of
  • Unnormalised Form
  • attributes
  • First Normal Form (1NF)
  • Second Normal Form (2NF)
  • Third Normal Form (3NF)
  • Relations
  • Attributes

24
RDA Form
25
Data Dictionary
  • lists, for every field in every table
  • Tablename
  • Fieldname
  • Field Type
  • Field size (if variable)
  • Decimal places (if applicable)
  • Description (if required)
  • Other significant field properties

26
Data Dictionary example
27
The domain
  • Is the set of items, and the definition thereof
    to which an attribute belongs
  • Define domain once, saves time when defining
    attributes belonging to it.
  • For example - Date of Birth, Course Start Date
    and Enrolment Date all belong to the DATE domain
    - data type is date/time, format dd/mm/yyyy,
    non-unique, non-null.

28
Further reading
  • Rolland chapters 3 and 4
  • Hoffer chapters 10 and 12
  • Kendall Kendall chapter 17
Write a Comment
User Comments (0)
About PowerShow.com