Copy of 35 - PowerPoint PPT Presentation

About This Presentation
Title:

Copy of 35

Description:

make sure that ETL supports - - record level processing - parallel streaming ... mainframe or server based. transformation - robust transformation - native dbms ... – PowerPoint PPT presentation

Slides: 26
Provided by: billi155

Transcript and Presenter's Notes

1
a presentation by W H Inmon
2
no wholesale movement of data into the data
warehouse reading directly from the
native operational dbms
3
movement of incremental changes of data into
the edw in a snapshot format
4
using the log tape to find delta data then move it
5
the log tape processing can be run on a processor
separate from the oltp processor
6
there is no impact on the online window
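The log-based approach on slides 3 to 6 can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory log of `LogRecord` entries with a sequence number; a real log tape format and the edw load step will differ. The function keeps only the latest change per key after the last checkpoint and emits each one as a snapshot.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical log entry; real dbms log formats differ."""
    seq: int      # position of the entry in the log
    key: str      # primary key of the changed row
    op: str       # "insert", "update", or "delete"
    values: dict  # row image after the change (empty for a delete)


def extract_deltas(log, last_seq):
    """Return one snapshot per key changed after last_seq, plus the new checkpoint."""
    latest = {}
    checkpoint = last_seq
    for rec in log:
        if rec.seq <= last_seq:
            continue  # already moved to the warehouse in an earlier run
        latest[rec.key] = rec  # a later log entry supersedes an earlier one
        checkpoint = max(checkpoint, rec.seq)
    snapshots = [
        {"key": r.key, "op": r.op, "as_of_seq": r.seq, **r.values}
        for r in sorted(latest.values(), key=lambda r: r.seq)
    ]
    return snapshots, checkpoint


log = [
    LogRecord(1, "cust-1", "insert", {"balance": 100}),
    LogRecord(2, "cust-2", "insert", {"balance": 50}),
    LogRecord(3, "cust-1", "update", {"balance": 120}),
    LogRecord(4, "cust-2", "delete", {}),
]

# Only entries after sequence 1 are treated as delta data.
snapshots, checkpoint = extract_deltas(log, last_seq=1)
for snap in snapshots:
    print(snap)
```

Because the function reads only the log, it could run on a processor separate from the oltp processor, which is the point of slides 5 and 6.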
7
another approach is to capture changes as they
occur through the BMC approach
8
another opportunity for a performance gain is
to move the data during physical I/O operations
9
note that etl processing still needs to be done
after physical disk movement
10
make sure that ETL supports:
- record level processing
- parallel streaming
- mainframe or server based transformation
- robust transformation
- native dbms record selection
- incremental selection of oltp data
11
dbms selection throughout the architecture
is very important
12
it is HIGHLY unlikely that any one dbms will
be optimal for all processing
13
project/ad hoc warehouses
project or temporary warehouses can save a lot of
development and analytical effort
14
[diagram labels: exploration, project, sample]
using sampling techniques for initial
analysis can save HUGE resources
15
[diagram labels: iteration 1, iteration 2, iteration 3, final analysis]
doing iterative analysis against samples of data,
then doing the final analysis against the large
database, can save LOTS of resources
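As a rough illustration of the sample-then-final pattern on slides 14 and 15, the sketch below iterates on a 1% random sample and runs the settled analysis once against the full data. The data set, the `mean_amount` analysis, and the thresholds are made up for the example.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is repeatable

# Stand-in for the large database: 100,000 synthetic rows.
full_data = [{"id": i, "amount": rng.uniform(10, 500)} for i in range(100_000)]


def mean_amount(rows, min_amount=0.0):
    """Hypothetical analysis: mean amount of rows at or above a threshold."""
    selected = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return statistics.mean(selected)


# Iterations 1 to 3: refine the question cheaply against a 1% sample.
sample = rng.sample(full_data, 1_000)
for threshold in (0.0, 100.0, 250.0):
    print(threshold, round(mean_amount(sample, threshold), 2))

# Final analysis: run the settled query once against the full data.
final = mean_amount(full_data, 250.0)
print("final", round(final, 2))
```

Each iteration touches 1,000 rows instead of 100,000, so only the last step pays the full scan cost.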
16
end user education can save huge amounts of
resources and is simple to do
17
metadata can help performance
if an analyst knows what has already
been created, there is no need to recreate it
18
but how does an analyst know what has
already been created?
an analyst knows by looking at the metadata.
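One way to picture slides 17 and 18 is a catalog lookup that runs before any build step. This is a minimal sketch; `MetadataCatalog`, `CatalogEntry`, and `ensure_dataset` are hypothetical names, not part of any particular metadata product.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str
    description: str
    location: str  # where the derived data set lives


class MetadataCatalog:
    """Hypothetical in-memory metadata store keyed by data set name."""

    def __init__(self):
        self._entries = {}

    def find(self, name):
        return self._entries.get(name)

    def register(self, entry):
        self._entries[entry.name] = entry


def ensure_dataset(catalog, name, description, build):
    """Reuse a data set if the metadata lists it; otherwise build and register it."""
    existing = catalog.find(name)
    if existing is not None:
        return existing, False  # nothing recreated
    location = build()  # the expensive step
    entry = CatalogEntry(name, description, location)
    catalog.register(entry)
    return entry, True


build_calls = []


def build_monthly_sales():
    build_calls.append(1)  # count how often the expensive build runs
    return "warehouse/summary/monthly_sales"


catalog = MetadataCatalog()
first, created_first = ensure_dataset(
    catalog, "monthly_sales", "sales summed by month", build_monthly_sales)
second, created_second = ensure_dataset(
    catalog, "monthly_sales", "sales summed by month", build_monthly_sales)
print(created_first, created_second, len(build_calls))
```

The second request finds the existing entry, so the build step runs only once.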
19
do not use referential integrity
infrastructure that was designed for the
operational environment
20
if relationships are defined, enforce them
through audit programs
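An audit program of the kind slide 20 mentions might look like the sketch below: instead of the dbms rejecting rows at insert time, a periodic job reports child rows whose parent is missing. The table and column names are illustrative assumptions.

```python
def audit_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {p[pk] for p in parent_rows}
    return [c for c in child_rows if c[fk] not in parent_keys]


# Hypothetical warehouse tables held as lists of dicts.
customers = [{"cust_id": 1}, {"cust_id": 2}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 3},  # no such customer
    {"order_id": 12, "cust_id": 2},
]

orphans = audit_orphans(orders, customers, fk="cust_id", pk="cust_id")
for row in orphans:
    print("orphan order:", row)
```

The check runs after the load, so loads are not slowed by per-row constraint checking, and violations are reported rather than blocked.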
21
[diagram labels: hourly, daily, weekly, monthly, yearly]
a rolling summary structure of data can save huge
resources
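A rolling summary structure can be sketched as a chain of levels in which a full level is summed into a single entry of the next level and then cleared. The capacities below (24 hours per day, 7 days per week, 4 weeks per month, 12 months per year) are simplifying assumptions for the example, and `RollingSummary` is a hypothetical name.

```python
class RollingSummary:
    # (level name, entries that roll up into one entry of the next level)
    LEVELS = [
        ("hourly", 24),
        ("daily", 7),
        ("weekly", 4),    # simplification: a month is treated as 4 weeks
        ("monthly", 12),
        ("yearly", None),  # top level, never rolled up
    ]

    def __init__(self):
        self.slots = {name: [] for name, _ in self.LEVELS}

    def add_hour(self, total):
        """Record one hourly total and roll up any level that becomes full."""
        self._add(0, total)

    def _add(self, level, total):
        name, capacity = self.LEVELS[level]
        self.slots[name].append(total)
        if capacity is not None and len(self.slots[name]) == capacity:
            rolled = sum(self.slots[name])
            self.slots[name].clear()  # detail is discarded once summarized
            self._add(level + 1, rolled)


summary = RollingSummary()
for _ in range(24 * 7 + 5):  # one full week plus five hours
    summary.add_hour(1)

print(summary.slots)
```

After one week and five hours, the structure holds five hourly entries and one weekly total of 168 rather than 173 detail rows, which is where the saving comes from.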
22
condensation of data makes the most of each I/O and
increases buffer hits
23
the physical colocation of data can
optimize performance
24
perform inserts and deletes during off hours
25
using 3rd party utilities for standard database
operations:
- back up
- recovery
- indexing
- relationship management