Title: Agenda
1 Agenda
- Why a data warehouse
- How the parts of a DW fit together
- Products and platform
- Phases in building a DW
- Schema
- Source
- Target
- Exercises
2 Definitions
- Data Warehouse
  - A multi-subject information store
  - Designed specifically for decision support
- Data Mart
  - A subject-specific data warehouse
  - Designed specifically for decision support
3 Characteristics
                 Warehouse        Data Mart
Scope            Corporate        Line-of-Business
Subjects         multi-subject    subject-specific
Data Sources     many             few
Size             100GB - 1TB      < 100GB
Implementation   months - years   months
Platform         Unix             NT and Unix
Warehouse
Data Marts
4 Context for the data warehouse
5 Products - Data Mart
- Database
- Enterprise Manager
- Designer
- Builder
- Discoverer
- Each has its own repository
6 Phases in a DW
Data structures
Base view
Meta view
Relational model
Meta view
DW/DM
OLTP
OLAP
Finance systems
Production systems
Personnel systems
Internet data
7 Schema types
- Star schema
- Snowflake schema
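To make the two schema types concrete, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table and column names (product_dim, time_dim, orders_fact, category, amount) are assumptions chosen for illustration and do not come from the slides.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by denormalized dimension tables."""
    conn.executescript(
        """
        CREATE TABLE product_dim (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category     TEXT   -- kept inline: this is what makes it a star
        );
        CREATE TABLE time_dim (
            time_key      INTEGER PRIMARY KEY,
            calendar_date TEXT,
            year          INTEGER
        );
        CREATE TABLE orders_fact (
            product_key INTEGER REFERENCES product_dim(product_key),
            time_key    INTEGER REFERENCES time_dim(time_key),
            quantity    INTEGER,
            amount      REAL
        );
        """
    )


def sales_by_category(conn):
    """A typical decision-support query: one join per dimension used."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM orders_fact AS f
        JOIN product_dim AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
```

In a snowflake schema the category column would move into its own table referenced from product_dim, which removes the repeated category text at the cost of one more join per query.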
8 Construction Methodology
- Data Modeling
  - Identify data sources
  - Identify source subset
  - Model Star Schema
- Process Modeling
  - Build Plans
  - Dimension tables
  - Time Dimension
  - Fact table
  - Populate database
- Business Modeling
  - Define end-user layer
  - OLAP
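The time dimension named in the list above is usually generated rather than extracted from a source system. The following is a minimal sketch, assuming one row per calendar day; the column names (time_key, calendar_date, quarter, day_of_week) are illustrative and not taken from the slides.

```python
from datetime import date, timedelta


def build_time_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append(
            {
                # Assumed key format: YYYYMMDD as an integer.
                "time_key": int(day.strftime("%Y%m%d")),
                "calendar_date": day.isoformat(),
                "year": day.year,
                "quarter": (day.month - 1) // 3 + 1,
                "month": day.month,
                "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
            }
        )
        day += timedelta(days=1)
    return rows
```

Calling build_time_dimension(date(1999, 1, 1), date(1999, 12, 31)) would yield 365 rows that fact rows can reference by time_key.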
Any Data
Any Source
Any Access
Metadata
Design and Management
9 Data Mart Designer
10 Builder Process Overview
Target
Source
Extraction, Transformation and Transport
11 Data Quality: Anomalies in Source Data
Metadata
Field Level
Data Entry Errors
Anomalies
Account Number Myopia
Free-Form Fields
Legacy Data Surprises
Data Quality requires that you look at the data
that exists, not just the data definitions.
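One way to look at the data that exists is to profile each column before trusting its definition. The sketch below is a hypothetical helper, not part of any product named in this deck; it counts blanks and the most frequent values in a single column.

```python
from collections import Counter


def profile_column(values, top_n=3):
    """Summarize what a column actually contains."""
    # Treat None and whitespace-only strings as blank.
    cleaned = [(value or "").strip() for value in values]
    counts = Counter(value for value in cleaned if value)
    return {
        "rows": len(cleaned),
        "blank": sum(1 for value in cleaned if not value),
        "distinct": len(counts),
        "top": counts.most_common(top_n),
    }
```

A column defined as a mandatory customer type that shows a high blank count or dozens of distinct spellings is a signal that cleansing rules are needed before loading.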
12 Data Quality: Encountering Surprises
CUSNUM    NAME                   ADDRESS                           TYPE
90328574  Digital Equipment      187 N. PARK St. Salem NH 01458    OEM
90328575  DEC                    187 N. Pk. St. Sarem NH 01458     OEM
90238475  Digital                187 N. Park StSalem NH 01458
90233479  Digital Corp           187 N. Park Ave. Salem NH 01458   Comp
90233489  Digital Consulting     15 Main Street Andover MA 02341   Consult
90234889  Digital Info Services  PO Box 9 Boston MA 02210          Mail List
90345672  Digital Integration    Park Blvd. Boston MA 04106        SYS INT
Noise in blank fields
No standardization
Anomalies
No unique key
Spelling
- How do you correctly identify and consolidate
anomalies from millions of records?
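One common starting point for that question is blocking: grouping records on a crude key so that only likely duplicates are compared in detail. The sketch below is an illustration under assumptions, not the method of any tool in this deck. It uses the leading house number and trailing ZIP code as the key, and the records are copied from the table on this slide.

```python
import re
from collections import defaultdict

RECORDS = [
    ("90328574", "Digital Equipment", "187 N. PARK St. Salem NH 01458"),
    ("90328575", "DEC", "187 N. Pk. St. Sarem NH 01458"),
    ("90238475", "Digital", "187 N. Park StSalem NH 01458"),
    ("90233479", "Digital Corp", "187 N. Park Ave. Salem NH 01458"),
    ("90233489", "Digital Consulting", "15 Main Street Andover MA 02341"),
    ("90234889", "Digital Info Services", "PO Box 9 Boston MA 02210"),
    ("90345672", "Digital Integration", "Park Blvd. Boston MA 04106"),
]


def blocking_key(address):
    """Return (house number, ZIP code); either part may be None."""
    house = re.match(r"\s*(\d+)", address)
    zip_code = re.search(r"(\d{5})\s*$", address)
    return (
        house.group(1) if house else None,
        zip_code.group(1) if zip_code else None,
    )


def candidate_groups(records):
    """Group customer numbers that share a complete blocking key."""
    groups = defaultdict(list)
    for cusnum, _name, address in records:
        key = blocking_key(address)
        if None in key:
            continue  # incomplete key: leave for manual review
        groups[key].append(cusnum)
    # Only groups with more than one member are duplicate candidates.
    return {key: members for key, members in groups.items() if len(members) > 1}
```

On the sample rows this puts the four records at 187 ... 01458 into one candidate group. Whether "Park St." and "Park Ave." are really the same site still needs a finer comparison or a human decision.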
13 Oracle Data Mart Builder
Visual Data Flow Technology allows automation of dimension tables
Product - Extract SQL Query
KeyGeneration
Direct Path Loader
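The key generation step in this flow can be pictured as assigning a surrogate key to each distinct source value. This is a minimal sketch in plain Python, not the Data Mart Builder transform itself; the function name and its parameters are hypothetical, and the loading step is left out.

```python
def generate_surrogate_keys(source_rows, natural_key, key_column, first_key=1):
    """Assign consecutive surrogate keys to distinct natural-key values.

    Returns the dimension rows plus a map from natural key to surrogate key,
    which a later fact load can use for lookups.
    """
    key_map = {}
    dimension_rows = []
    for row in source_rows:
        natural = row[natural_key]
        if natural in key_map:
            continue  # keep only the first occurrence of each natural key
        key_map[natural] = first_key + len(key_map)
        dimension_rows.append({key_column: key_map[natural], **row})
    return dimension_rows, key_map
```

The returned key map is the piece the fact-table lookups on the next slide would consult.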
14 Oracle Data Mart Builder
Automating the Fact table
Orders - Extract SQL Query
ProductLookup
MarketLookup
Promotion Lookup
TimeLookup
Direct Path Loader
- Extract, transform and cleanse fact data from source
- Integrity checking of dimension values
- Toolbox of predefined transforms
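The lookup chain and the integrity check can be sketched as follows. This is an illustration in plain Python under assumptions, not the tool's own transforms: the function name, the "amount" measure, and the shape of the lookup maps are all hypothetical.

```python
def build_fact_rows(orders, lookups):
    """Replace dimension values with surrogate keys; reject unknown values.

    lookups maps a source column name to a {source value: surrogate key} dict.
    Returns (fact_rows, rejects), where each reject is (order, missing_columns).
    """
    facts = []
    rejects = []
    for order in orders:
        keys = {}
        missing = []
        for column, key_map in lookups.items():
            surrogate = key_map.get(order[column])
            if surrogate is None:
                missing.append(column)  # integrity check failed for this column
            else:
                keys[column + "_key"] = surrogate
        if missing:
            rejects.append((order, missing))
        else:
            facts.append({**keys, "amount": order["amount"]})
    return facts, rejects
```

Keeping the rejected rows together with the names of the failing columns makes it possible to correct the dimension data and rerun the load instead of silently dropping orders.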
15 Data source
- Relational tables
- ASCII files
- From all sources
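Whatever the source type, the extract step typically ends with rows in one common shape. The sketch below assumes a SQLite table as the relational source and a pipe-delimited text file as the ASCII source; both choices and the function names are illustrative only.

```python
import csv
import io
import sqlite3


def rows_from_table(conn, table):
    """Read every row of a relational table as a list of dicts."""
    # The table name is interpolated directly, so it must come from trusted config.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


def rows_from_ascii(text_file, delimiter="|"):
    """Read a delimited ASCII file with a header line as a list of dicts."""
    return list(csv.DictReader(text_file, delimiter=delimiter))
```

Values read from the ASCII file arrive as strings, so type conversion belongs in the transform step that follows.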
16 Updating the DW - plans
17 Exercises
- The cookbook - work through the examples in it