Dagsorden - PowerPoint PPT Presentation

About This Presentation
Title:

Dagsorden (Agenda)

Description:

Data Quality: Anomalies in Source Data. Data Entry Errors. Account Number Myopia. Anomalies. Free-Form Fields. Legacy Data Surprises. Field Level ...

Slides: 18
Provided by: larsol
Category:
Tags: dagsorden | myopia


Transcript and Presenter's Notes

Title: Dagsorden (Agenda)


1
Agenda
  • Why a data warehouse
  • How a DW fits together
  • Products and platform
  • Phases in building a DW
  • Schema
  • Source
  • Target
  • Exercises

2
Definitions
  • Data Warehouse
    • A multi-subject information store
    • Designed specifically for decision support
  • Data Mart
    • A subject-specific data warehouse
    • Designed specifically for decision support

3
Characteristics
                  Warehouse         Data Mart
Scope             Corporate         Line-of-Business
Subjects          multi-subject     subject-specific
Data Sources      many              few
Size              100GB - 1TB       <100GB
Implementation    months - years    months
Platform          Unix              NT and Unix
Warehouse
Data Marts
4
Context for the data warehouse
5
Products - Data Mart
  • Database
  • Enterprise Manager
  • Designer
  • Builder
  • Discoverer
  • Each has its own repository

6
Phases in a DW
Data structures
Base view
Meta view
Relational model
Meta view
DW/DM
OLTP
OLAP
Financial systems
Production systems
Personnel systems
Internet data
7
Schema types
  • Star schema
  • Snowflake schema
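
To make the two schema types concrete, here is a minimal sketch of a star schema built in an in-memory SQLite database from Python. All table names, column names and sample values (product_dim, time_dim, sales_fact and their contents) are invented for illustration and do not come from the presentation. A snowflake schema would differ by normalizing the dimensions further, for example by moving category out of product_dim into a table of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one denormalized table per subject (star schema).
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT   -- a snowflake schema would split this into category_dim
);
CREATE TABLE time_dim (
    time_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
-- Fact table: measures plus one foreign key per dimension.
CREATE TABLE sales_fact (
    product_key INTEGER REFERENCES product_dim(product_key),
    time_key    INTEGER REFERENCES time_dim(time_key),
    amount      REAL
);
-- Made-up sample rows.
INSERT INTO product_dim VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO time_dim VALUES (1, 1999, 1), (2, 1999, 2);
INSERT INTO sales_fact VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 20.0);
""")

# A typical decision-support query: join the fact table to a dimension
# and aggregate the measure.
totals_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

Every query against a star schema takes this shape: the fact table in the middle, one join per dimension that the question involves.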

8
Construction Methodology
  • Data Modeling
    • Identify data sources
    • Identify source subset
    • Model Star Schema
  • Process Modeling
    • Build Plans
    • Dimension tables
    • Time Dimension
    • Fact table
    • Populate database
  • Business Modeling
    • Define end-user layer
    • OLAP
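
The time dimension named in the methodology is usually generated rather than extracted from a source system. The Python sketch below shows one way that could look; the column names (time_key, quarter and so on) and the chosen date range are assumptions made for illustration, not details from the presentation.

```python
from datetime import date, timedelta

def build_time_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    key = 1
    while day <= end:
        rows.append({
            "time_key": key,                      # generated surrogate key
            "date": day.isoformat(),
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,  # 1..4
            "month": day.month,
            "weekday": day.isoweekday(),          # 1 = Monday ... 7 = Sunday
        })
        day += timedelta(days=1)
        key += 1
    return rows

# Hypothetical range: one full calendar year.
time_dim = build_time_dimension(date(2000, 1, 1), date(2000, 12, 31))
```

Because the rows are derived purely from the calendar, the table can be populated once for the whole reporting period before any fact data is loaded.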

Any Data
Any Source
Any Access
Metadata
Design and Management
9
Datamart Designer
10
Builder Process Overview
Target
Source
Extraction, Transformation and Transport
11
Data Quality: Anomalies in Source Data
Metadata
Field Level
Data Entry Errors
Anomalies
Account Number Myopia
Free-Form Fields
Legacy Data Surprises
Data Quality requires that you look at the data that exists, not just the data definitions.
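
Looking at the data that exists can start with a simple column profile. The Python sketch below counts blank and distinct values in one field. The function name profile_column is hypothetical, and the sample rows reuse a few CUSNUM and TYPE values from the customer example in this presentation.

```python
from collections import Counter

def profile_column(rows, column):
    """Summarize what a column actually contains, whatever its definition says."""
    values = []
    for row in rows:
        raw = row.get(column)
        values.append("" if raw is None else str(raw).strip())
    filled = [v for v in values if v]
    counts = Counter(filled)
    return {
        "rows": len(values),
        "blank": len(values) - len(filled),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }

sample = [
    {"CUSNUM": "90328574", "TYPE": "OEM"},
    {"CUSNUM": "90328575", "TYPE": "OEM"},
    {"CUSNUM": "90238475", "TYPE": ""},
    {"CUSNUM": "90233479", "TYPE": "Comp"},
]
type_profile = profile_column(sample, "TYPE")
```

A profile like this shows blank fields and unexpected code values before any transformation rules are written.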
12
Data Quality: Encountering Surprises
CUSNUM    NAME                   ADDRESS                          TYPE
90328574  Digital Equipment      187 N. PARK St. Salem NH 01458   OEM
90328575  DEC                    187 N. Pk. St. Sarem NH 01458    OEM
90238475  Digital                187 N. Park StSalem NH 01458
90233479  Digital Corp           187 N. Park Ave. Salem NH 01458  Comp
90233489  Digital Consulting     15 Main Street Andover MA 02341  Consult
90234889  Digital Info Services  PO Box 9 Boston MA 02210         Mail List
90345672  Digital Integration    Park Blvd. Boston MA 04106       SYS INT
Noise in blank fields
No standardization
Anomalies
No unique key
Spelling
  • How do you correctly identify and consolidate
    anomalies from millions of records?
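
One common approach to the question above is to normalize each field and then group records whose normalized values are nearly identical. The Python sketch below does this for addresses using the standard library's difflib. It is not the method used by any product named in this presentation; the abbreviation table, the 0.9 threshold and the function names are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"pk": "park", "st": "street", "ave": "avenue"}

def normalize(address):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9]+", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def similar(a, b, threshold=0.9):
    """True when two normalized addresses are nearly identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

def cluster(records, threshold=0.9):
    """Greedy grouping: attach each record to the first cluster it resembles."""
    clusters = []
    for cusnum, address in records:
        for group in clusters:
            # Compare against the first record of the cluster only.
            if similar(group[0][1], address, threshold):
                group.append((cusnum, address))
                break
        else:
            clusters.append([(cusnum, address)])
    return [[cusnum for cusnum, _ in group] for group in clusters]

# Four of the customer records shown on this slide.
records = [
    ("90328574", "187 N. PARK St. Salem NH 01458"),
    ("90328575", "187 N. Pk. St. Sarem NH 01458"),
    ("90233489", "15 Main Street Andover MA 02341"),
    ("90234889", "PO Box 9 Boston MA 02210"),
]
groups = cluster(records)
```

Comparing every record against every cluster does not scale to millions of rows. In practice the candidates would first be narrowed, for example by comparing only records that share a ZIP code.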

13
Oracle Data Mart Builder
Visual Data Flow Technology allows automation of dimension tables
Product - Extract SQL Query
KeyGeneration
Direct Path Loader
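
The flow above (extract query, key generation, load) can be approximated in a few lines of Python with SQLite. This is only a sketch of the idea, not how Oracle Data Mart Builder implements it: the table layout, the function name load_product_dimension and the sample product codes are invented, and the bulk load step is replaced by ordinary INSERT statements.

```python
import sqlite3

def load_product_dimension(source_rows, conn):
    """Assign a surrogate key to each new source product and load it."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS product_dim (
            product_key INTEGER PRIMARY KEY,  -- generated surrogate key
            source_code TEXT UNIQUE,          -- key used by the source system
            name        TEXT
        )
    """)
    # Continue numbering after the highest key already in the table.
    next_key = conn.execute(
        "SELECT COALESCE(MAX(product_key), 0) + 1 FROM product_dim"
    ).fetchone()[0]
    key_map = dict(conn.execute("SELECT source_code, product_key FROM product_dim"))
    for code, name in source_rows:
        if code in key_map:
            continue  # already loaded; keep its existing key
        conn.execute(
            "INSERT INTO product_dim VALUES (?, ?, ?)", (next_key, code, name)
        )
        key_map[code] = next_key
        next_key += 1
    conn.commit()
    return key_map

conn = sqlite3.connect(":memory:")
# Stand-in for the result of the extract query, including a duplicate row.
extracted = [("P-100", "Widget"), ("P-200", "Gadget"), ("P-100", "Widget")]
product_keys = load_product_dimension(extracted, conn)
```

The returned mapping from source code to surrogate key is what a later lookup step needs when it loads the fact table.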
14
Oracle Data Mart Builder
Automating the Fact table
Orders - Extract SQL Query
ProductLookup
MarketLookup
Promotion Lookup
TimeLookup
Direct Path Loader
  • Extract, transform and cleanse fact data from
    source
  • Integrity checking of dimension values
  • Toolbox of predefined transforms
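
The lookup steps and the integrity check on dimension values can be sketched as follows in Python. Each source code in an order row is replaced by the key of the matching dimension row, and rows with an unknown code are set aside instead of being loaded. The function name build_fact_rows, the column names and the sample data are assumptions for illustration only.

```python
def build_fact_rows(orders, lookups, measure="amount"):
    """Swap source codes for dimension keys; reject rows that fail a lookup."""
    facts, rejected = [], []
    for order in orders:
        # One lookup per dimension, in the order the lookups were given.
        keys = [lookups[column].get(order[column]) for column in lookups]
        if any(key is None for key in keys):
            rejected.append(order)  # integrity check failed
            continue
        facts.append(tuple(keys) + (order[measure],))
    return facts, rejected

# Hypothetical code-to-key maps, one per dimension table.
lookups = {
    "product": {"P-100": 1, "P-200": 2},
    "market": {"EAST": 1, "WEST": 2},
    "promotion": {"NONE": 1, "COUPON": 2},
    "date": {"2000-01-01": 1, "2000-01-02": 2},
}
orders = [
    {"product": "P-100", "market": "EAST", "promotion": "NONE",
     "date": "2000-01-01", "amount": 100.0},
    {"product": "P-999", "market": "WEST", "promotion": "COUPON",
     "date": "2000-01-02", "amount": 25.0},  # unknown product code
]
facts, rejected = build_fact_rows(orders, lookups)
```

Keeping the rejected rows rather than dropping them makes it possible to correct the source data or the dimension and then reload.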

15
Data source
  • Relational tables
  • ASCII files
  • From all sources
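
As a small illustration of treating different source types alike, the Python sketch below reads rows from a delimited ASCII source and from a relational table into the same list-of-dicts shape. The pipe delimiter, the products table and the sample rows are assumptions made for this example.

```python
import csv
import io
import sqlite3

def rows_from_ascii(text, delimiter="|"):
    """Read delimited ASCII text whose first line names the columns."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]

def rows_from_table(conn, table):
    """Read every row of a relational table as a dict.

    The table name is interpolated into the SQL, so it must come from
    trusted configuration, never from user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]

# Stand-in for the contents of an ASCII file.
ascii_source = "code|name\nP-100|Widget\nP-200|Gadget\n"

# Stand-in for a relational source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (code TEXT, name TEXT)")
conn.execute("INSERT INTO products VALUES ('P-300', 'Gizmo')")

all_rows = rows_from_ascii(ascii_source) + rows_from_table(conn, "products")
```

Once both sources yield rows in one shape, the same transformation and load steps can be applied regardless of where the data came from.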

16
Updating the DW - plans
17
Exercises
  • The cookbook - work through the examples in it