Title: Data Warehousing
1Data Warehousing
Virtual University of Pakistan
Ahsan Abdullah Assoc. Prof. Head Center for
Agro-Informatics Research www.nu.edu.pk/cairindex.
asp FAST National University of Computers
Emerging Sciences, Islamabad
2Multi-Campus University
3Degree Programs
4Disciplines for BS
5Disciplines for MS
6The need
- Head Office wants a central data repository for
decision support i.e. a DWH -
7Students Record Keeping Mgmt.
8Data from Lahore Campus
9Data from Lahore Campus Sample
10Lahore Header of Student Table
11Lahore Header of Student Table
- Gender
- Address
- Date of Birth
- Reg Date
12Lahore Header of Student Table
- Reg Status
- Degree Status
- Last Degree
13Lahore Header of Course Reg. Table
- SID
- Degree
- Semester
- Course
- Marks
- Discipline
14Lahore Facts About Data
15Data from Karachi Campus
16Data from Karachi Campus Sample
17Karachi Header of Student Table
- St_ID
- Name
- Father
- DoB
- M/F
- DoReg
- RStatus
- DStatus
- Address
- Qualification
18Karachi Header of Course Reg. Table
- SID
- Courses
- Score
- Sem
- Disp
Degree (BS/MS) is missing because separate
books are maintained, but the issue is critical
while loading data
19Karachi Facts About Data
20Data from Islamabad Campus
21Data from Islamabad Campus Sample
22Islamabad Header of Student Table
- Roll Num
- Name
- Father
- Reg Date
- Reg Status
- Degree Status
- Date of Birth
- Education
- Gender
- Address
23Islamabad Header of Course Reg. Table
- Roll Num
- Course
- Marks
- Discipline
- Session
Degree (BS/MS) is missing, whereas same table
contains records for both. Only way to
differentiate is through discipline attribute.
24Islamabad Facts About Data
25Exercise
26Problems with Adhoc Approach
27Problem-1 Non-Standard data sources
28Problem-2 Non-standard attributes
29Problem-3 Non Normalized database
30Notepad Issues
31MS-Excel Issues
32MS-Access Issues
33Problem Statement
34Data from Peshawar Campus
- Data at Peshawar campus is stored in Text files
- To store data regarding one complete batch 2 text
files are used - Lhr_Student_batch (Student record)
- Lhr_Detail_batch (Course Reg. record)
- 22 text files for 11 BS batches
- 8 text files for 4 MS batches
35Data from Peshawar Campus Sample
36Peshawar Header of Student Table
- Reg Student identity
- Name Student name
- Father Father name
- Address Permanent address
- Date of Birth Date of Birth
- lastDeg Last degree achieved
- Reg Date Date of Enrollment
- Reg Status Status of Enrollment (A/T)
- Degree Status Status of Degree (C/I)
37Peshawar Header of Course Reg. Table
- Reg
- Courses Course code
- Score Out of 100
- Program CS/TC/SE/CE
- Sem Fall/Spring
- Year YYYY e.g. 1999
- We need to identify semester session (fall04)
through combination of Sem and Year