Title: Building a Data Warehouse
1Building a Data Warehouse
- By Chuck Richardson Kathy Anderson
Learn how Santa Fe Community College selected the
Extract, Transformation and Load (ETL) and
Business Intelligence (BI) tools for the
Colleges new enterprise Decision Support System.
In addition to data selection and modeling for
the Data Warehouse, this presentation will take
you from developing Microsofts SQL Server data
transformations to designing Crystal Decisions
dynamic reports for Web-based distribution.
2Introduction of Presenters
- Chuck Richardson16 Years in the IT field 5
Years DBA experience in Oracle, Sybase and SQL
Server 1 ½ Years at Santa Fe Community College - Kathy Anderson26 Years in the IT field15 Years
at Central Florida Community College11 Years VSE
DB2 experience 1 Year at Santa Fe Community
College
KK
3 Title III Grant
- February 2000 - SFCCs Title III Proposal
Goal Improved AA student performance,
persistence and graduation.
- October 2000 - Title III grant awarded to SFCC
- SFCCs Five-Year Plan to build the Data Warehouse
and Decision Support System is fully supported
by ? The SFCC Administration ? College and
Title III grant funded resources
KC
4Data Warehouse Tool Evaluation
SFCC spent 6 months evaluating the components and
tools needed to build a Data Warehouse.
- Server Operating System
- Database Engine
- Extract, Transform and Load (ETL)
- Business Intelligence (BI)
CC
5Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
Server Operating Systems considered -
- Linux
- Windows NT Server
- Windows 2000 Server
- Windows 2000 Advanced Server
CC
6Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
Database Engines considered -
- Oracle
- IBM DB2
- Microsoft SQL Server
CC
7Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
Extract, Transform and Load (ETL) Definition
- Three separate functions combined into one
development tool - Extract Reads data from a specified source and
extracts a desired subset of data. - Transform Uses rules or lookup tables, or
creating combinations with other data, to
convert source data to the desired state. - Load Writes the resulting data to a target
database.
CC
8Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
Extract, Transform and Load (ETL) Tools
considered
- IBM DB2 Warehouse Manager
- Informatica PowerMart
- Microsoft Data Transformation Services (DTS)
CC
9Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
CK
10Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
- Business Intelligence (BI) Definition -
BI represents a broad category of applications
and technologies for providing access to data to
help enterprise users make better business
decisions. BI includes
- Decision Support Systems Forecasting
- Reporting Data Mining
- Statistical Analysis
- Online Analytical Processing (OLAP)
KK
11Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
- Business Intelligence (BI) Tools considered
- Brio
- Business Objects
- Crystal Decisions
KK
12Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
KK
13Data Warehouse Tool Evaluation
Server Operating System Database Engine Extract
Transform Load (ETL) Business Intelligence (BI)
Selected by SFCC
Windows 2000 Advanced Server
Database Engine
Microsoft SQL Server
ETL
Microsoft Data Transformation Services (DTS)
BI
Crystal Decisions
KK
14Data Warehouse SFCC Sources of Data
College personnel need direct access to decision
support information from a variety of data
sources
- IBM VSE VSAM and DB2 operational data Student
Records Financial Aid Human Resources Finance
- State Database submitted data
- State Common Course Numbering data
- Student assessment data
- Student surveys
- Network Server logs
KK
15SFCC Data Warehouse Data Extract Schedule
Snapshots of student and SDB data are -
- Extracted from the VSE mainframe 500 data
elements from more than 50 sources - Transformed Six target snapshot tables -
All rows contain the same Snapshot Timestamp
Add new and modified data to three history
tables - Loaded into the Data Warehouse
The process takes several hours. It is run
during the Spring, Summer and Fall terms at -
End of Drop/Add
Mid Term
End of Term
KK
16Data WarehouseStudent Snapshot Data
The Data Warehouse snapshot data is designed to
be
- Consistent Captured at comparable times each
term Data will not be updated - Flexible Ability to handle information needs not
yet defined - Easy to access Data is in a single
database Denormalized, redundant data fewer
tables to join - Expandable New data elements may be added in the
future
KC
17Data Warehouse SFCC Servers
- Two Dell PowerEdge 6400 servers 1 server for
Development 1 server for Production
- Both servers currently contain 2 Gig Memory
100 Gig Storage Dual Processors - Future expansion includes Four
Processors Storage Array
CC
18Data Warehouse SFCC Server Components
- IBM HTTP Server
- SQL Server 2000 / DTS / Analysis Services
- DB2 Connect / CM2 / ODBC
- FTP Server
- E-mail notification via Collaborative Data
Objects (CDO)
CC
19CC
20Data Warehouse Data Transformation Services
CC
21Data Warehouse Data Transformation Services
C?
22Data WarehouseBI Tool Crystal Decisions
Report Development -
- Crystal Reports Create compelling views of data
without relying on IT - Crystal Analysis OLAP reporting with
multi-dimensional cubes
Web Delivery -
- Crystal Enterprise Web-based delivery of
information via ePortfolio Full report
drill-down capabilities
?K
23Data WarehouseCrystal Reports
SFCC Crystal Reports Developers -
- 25 Crystal Reports licenses
- 1 developer per Division within the college
- Commitment, Aptitude
- Job Description changed to include report
development - On-site 2-day Crystal Report Design training
classes - Camp Crystal training
- ODBC connection to Data Warehouse
- SFCC Crystal Users Group
- Publish reports to Crystal Enterprise
KK
24Crystal Report
Cross-Tab and Chart
KK
25Crystal Report
Cross-Tab and Chart
KK
26Crystal Report
World Map
KK
27Crystal Report
Report Drill-Down
K?
28Data WarehouseBI Tool Crystal Decisions
Crystal Enterprise includes
- Crystal Management Console User, content and
server management - ePortfolio A web interface for end users to
view, schedule, and monitor published
reports User authentication LDAP, NT or
Enterprise - Report Application Server (RAS) Allows users to
modify reports
?K
29Crystal Enterprise ePortfolio
KK
30SFCC Data Warehouseand Decision Support System
Future Development Plans include
- Automatic scheduling of reports
- Build OLAP cubes with Microsoft Analysis Service
- Develop Crystal Analysis Reports
- Daily ETL of Enrollment and Financial data
- Upgrade to Crystal 9.0
Questions? chuck.richardson_at_sfcc.edu
kathy.anderson_at_sfcc.edu
KK