Title: Data Warehousing
1Data Warehousing
- Joves Luo
- Mohammad S Islam
- Shilpa Nemani
- Sergio Gonzalez
2Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
3Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
4What is Data Warehousing?
- A warehouse is a subject-oriented, integrated,
time-variant and non-volatile collection of data
in support of management's decision making
process. - Subject-oriented
- Integrated
- Time-variant
- Non-volatile
5A data Warehouse is ..
- an environment that provides a vehicle to deliver
meaningful information to the business community
to enable effective operations oversight and
business decision support. - the software that manages data on physical
storage devices. The software provides the
ability to store, access and modify the data.
A DBMS is ..
6 Benefits of a Data Warehouse
- Reliable reporting
- Rapid access to data
- Integrated data
- Flexible presentation of data
- Better decision making
7Why Build a Data Warehouse?
End Result
Disparate Data sources
Lack of data Credibility
Unclean Data
Cost
Different Summarisation
Sales
Cost
Inconsistent Definitions
Sales
8Data Warehousing Tools Market Definition
- Data Warehouse Generation
- Data Warehouse Management
- Data Warehouse Access
-
9Data Warehouse Generation
- ETL Extract, Transform, Load
- Extraction is the process of extracting the data
from different source systems . - Common formats include relational databases ,
flat files or non-relational database structures
such as IMS or other database structures such as
VSAM. - Transformation The transform phase applies a
series of rules or functions to the extracted
data to derive the data to be loaded. - Loading The load phase loads the data into the
data warehouse
10Data Warehouse Management
- RDBMS
- A type of DBMS in which the database is organized
and accessed according to the relationships
between data values. - Relational Model is based on the principles of
relational algebra. - Example RDBMS Systems Oracle, SQL Server, DB2,
Sybase, etc. - http//www.dwinfocenter.org/casefor.html
11Data Warehouse Access
- These tools are used to enable end
- users to access and analyze information stored
in the data warehouse. - Example SQL, Siebel analytics
12Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
13Features-Oracle Warehouse Builder-
- Data profiling and data quality
- Mapping editor
- Pluggable mappings
- Data correction
- Can keep historical data over time
- Experts with graphical editor and scripting
language - ERP integration
- SAP
- PeopleSoft
- Programming API available
14Features-Oracle Database-
- Multiple OS support Windows, Linux, UNIX
- Data compression
- Partitioning
- XML support
- Data encryption
- Unlimited database size
- Basic ETL tools
15Features-Oracle Warehouse Access-
- Data Sources
- SQL, PL/SQL, XML, JDBC, Oracle OLAP, text files
- Wizards
- Output
- File Formats PDF, Word, Excel, web page, XML
- FTP, WebDAV
- Printer
- Email
- Java API
- Customizable reports
16Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
17Marketing Data
- 9.6 billion in revenue (11.3)
- Fragmented market.
- Larger vendors have expanded their product
portfolios. - Specialty vendors must find niches.
- Market is expected to continue growing at a
steady rate.
18Main Players in Data Warehousing
- Oracle
- Dominates the Data Warehouse Management (39.8).
- Is acquiring companies (PeopleSoft, Siebel) to
help gain market share in other areas. - Bundles basic ETL functionality with the RDBMS.
- IBM
- Most of its data warehousing tools revenue is
derived from its RDBMS products. - Acquisition of Ascential instantly made it one of
the top 3 players in the data warehouse
generation market. - SAS Institute
- Revenue is more evenly distributed across the
three segments of the market - Data warehouse generation 19.1
- Data warehouse management 5.9
- Data warehouse access 12.7
- Microsoft
- 22.8 growth rate.
- Increasingly a key player in data warehouse
access through its embedded and standalone BI
tools. - Data warehouse generation.
19Market Share by Market Segment
20Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
21- Oracle Warehouse builder 10gR2
- Oracle 10g database
- Siebel Business Analytics Applications
22Oracle Warehouse builder 10gR2
- Written in Java
- Supports Windows and various flavors of Unix
- Extract data from databases, flat files, SAP.
- Exchange metadata (Oracle Designer, CA ERWin)
- Metadata change management
23Oracle Warehouse builder 10gR2Architecture
- Graphical Object Definition (source modules)
- Source-to-Target Data Mappings
- Automated Code Generation (optimized)
24Demo
- www.oracle.com/technology/products/warehouse/SHORT
_intro_to_owb10gR2/SHORT_intro_to_owb10gR2_viewlet
_swf.html
25Siebel Business Analytics Applications
- Pre-built
- Siebel Sales Analytics
- Siebel Service and Contact Center Analytics
- Siebel Marketing Analytics
- Siebel Supply Chain Analytics
- Siebel Financial Analytics
- Siebel Workforce Analytics
- Ad-hoc
- Siebel Answers
26Pre-built Siebel Financial Analytics
27Ad-hoc Siebel Answers
28Ad-hoc Siebel Answers
29Ad-hoc Siebel Answers
30Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
31Wal-Mart
- Largest retailer in the world with 6,640 retail
stores - Associates 1.8 million- worldwide
- Sales 315.65 billion
- Net Income- 11.23 billion
32Wal-Mart DWH
- DWH HW 32 node NCR WorldMark 5100M
- DB HW 365 AMP NCR 3600 server
- DBMS NCR Teradata
- Size 423 TB
- Customer CRM and Retail Link
- Process 65 million transactions per week
- 18,000 users have access
- Including 5,000 vendor partners
33Benefits to Wal-Mart
- Wal-Mart uses its data to track demand and set
inventory targets for specific products. - Identifies where a new store should be built
based on customer demand - Identifies how stores are performing across the
nation - Reduced excess inventory in individual stores
- Avoided wasted funds in building stores which
would fail
34Outline
- Introduction
- Product categories and features
- Market overview
- Technical details of Oracle DWH tools
- Sample application
- Future outlook
35Evolution
- Speed and Performance
- Retrieve data quicker
- Support more users
- Support more data
- Usage and Architecture
- Live data insertion and retrieval
- Platform independence
36Hardware Advancement
- Processor faster queries and calculations
- Faster processors
- Multi processors
- RAM improves overall performance and speed
- Faster RAM
- Multi channel RAM
- Hard Drive improves data retrieval speed
- Faster Hard Drive (RPM)
- Multi HD implementation
37Parallel Servers
- What is it?
- Single database spread across a computer cluster
- Storage area network (SAN)
- Network-attached storage (NAS)
- Partitioned so that data overlaps across each
computer - Background processes lock, update, and merge data
- Why use it?
- Speed increased transaction response time
- Scale increased transaction volume
- Dependence data redundancy lowers downtime
38Parallel Servers cont.
- Why not use it?
- Data needs to be partitioned correctly
- Constant changes need constant merges
- If data is accessed from many different nodes
- Costs requires special hardware setup
- Computers, Networks, Shared Disk Subsystem
- When is it available?
- Now Oracle Parallel Server (OPS)
- First of its kind
39Active Data Warehouses
- What is it?
- Combines the source data, data transformation,
and data warehouse layers - Real-time reporting of up to the minute data
- Report tools on live database
- Why use it?
- Able to make critical decisions based on most
current data - No more yearly, quarterly, or weekly reports
- Reports are live and fluid
40Active Data Warehouses cont.
- Why not use it?
- Accuracy of data can not be determined
- No middle layer to protect data (insertion and
deletion) - Security is questionable
- Accountability for data is fuzzy
- When is it available?
- Now Teradata Enterprise Data Warehouse
- First of its kind
41Pure Java Data Warehouses
- What is it?
- Developed by Sun Micro Systems (SMS) and partners
- Java application in all four areas data access,
data extraction, data management, and data mining - Acts as both solution and a standard
- Why use it?
- Cost effective no rewriting of code needed
- Ease everything written in one language
- Deploy across all systems and environments
- Compatibility everything works without
translation
42Pure Java Data Warehouses cont.
- Why not use it?
- Limited choices can only use SMS partners
- Performance same limitations as traditional
java programs - Gain in compatibility, loss in optimization
- When is it available?
- Soon to Now SMS and partners are working on the
different tools used in the new standard
43Pure Java Data Warehouses cont.
- D2K The Tapestry data warehouse solution
automates the process of extracting, moving,
transforming, and loading data from heterogeneous
sources into a data warehouse or data mart. - Fiserv Featuring InformEnt software, enabling
everyone within an organization access to
information necessary to perform more informed
and better business decisions. - Informix Dynamic Server with Universal Data and
Relational OLAP options demonstrates an on-line
analytical processing (OLAP) application
accessing object relational database, focusing on
portfolio management and analysis. - InterNetivity, Inc. dbProbe 3.0 combines
award-winning data analysis capabilities with a
sleek new look and powerful features such as a
direct manipulation user interface, support for
OLAP sources such as Microsoft OLE DB for OLAP
and Informix Metacube, and a complete graphical
cube builder.
44Pure Java Data Warehouses cont.
- Seagate Software Seagate Crystal Info is an
enterprise-scale reporting and data analysis
system that lets users schedule and analyze
reports on their desktop or via their browser
using Java technology. - SRA International, Inc. SRA's KDD Explorer
toolset, enables users to discover significant
patterns and indicators of fraudulent activity
and analyze the results through an interface
certified by Sun as 100 Pure Java. - SQRIBE Technologies Displaying ReportMart 1.0,
an integrated framework for the creation,
management, and secure distribution of all
enterprise information, and PowerSQRIBE 1.0,
which provides thin-client interactive analysis
for massive enterprise deployment. - Telran Technologies Telran System is an
adaptive data warehouse management system that
delivers performance, manageability and
scalability to data support, data warehouses, and
mixed-use (OLTP/DSS) ERP applications.
45Thank you!