Data Warehousing - PowerPoint PPT Presentation

1 / 45
About This Presentation
Title:

Data Warehousing

Description:

Example: SQL, Siebel analytics. Outline. Introduction. Product ... Siebel Business Analytics Applications. Oracle Warehouse builder 10gR2. Written in Java ... – PowerPoint PPT presentation

Number of Views:129
Avg rating:3.0/5.0
Slides: 46
Provided by: saif9
Category:

less

Transcript and Presenter's Notes

Title: Data Warehousing


1
Data Warehousing
  • Joves Luo
  • Mohammad S Islam
  • Shilpa Nemani
  • Sergio Gonzalez

2
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

3
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

4
What is Data Warehousing?
  • A warehouse is a subject-oriented, integrated,
    time-variant and non-volatile collection of data
    in support of management's decision making
    process.
  • Subject-oriented
  • Integrated
  • Time-variant
  • Non-volatile

5
A data Warehouse is ..
  • an environment that provides a vehicle to deliver
    meaningful information to the business community
    to enable effective operations oversight and
    business decision support.
  • the software that manages data on physical
    storage devices. The software provides the
    ability to store, access and modify the data.

A DBMS is ..
6
Benefits of a Data Warehouse
  • Reliable reporting
  • Rapid access to data
  • Integrated data
  • Flexible presentation of data
  • Better decision making

7
Why Build a Data Warehouse?
End Result
Disparate Data sources
Lack of data Credibility
Unclean Data
Cost
Different Summarisation
Sales
Cost
Inconsistent Definitions
Sales
8
Data Warehousing Tools Market Definition
  • Data Warehouse Generation
  • Data Warehouse Management
  • Data Warehouse Access

9
Data Warehouse Generation
  • ETL Extract, Transform, Load
  • Extraction is the process of extracting the data
    from different source systems .
  • Common formats include relational databases ,
    flat files or non-relational database structures
    such as IMS or other database structures such as
    VSAM.
  • Transformation The transform phase applies a
    series of rules or functions to the extracted
    data to derive the data to be loaded.
  • Loading The load phase loads the data into the
    data warehouse

10
Data Warehouse Management
  • RDBMS
  • A type of DBMS in which the database is organized
    and accessed according to the relationships
    between data values.
  • Relational Model is based on the principles of
    relational algebra.
  • Example RDBMS Systems Oracle, SQL Server, DB2,
    Sybase, etc.
  • http//www.dwinfocenter.org/casefor.html

11
Data Warehouse Access
  • These tools are used to enable end
  • users to access and analyze information stored
    in the data warehouse.
  • Example SQL, Siebel analytics

12
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

13
Features-Oracle Warehouse Builder-
  • Data profiling and data quality
  • Mapping editor
  • Pluggable mappings
  • Data correction
  • Can keep historical data over time
  • Experts with graphical editor and scripting
    language
  • ERP integration
  • SAP
  • PeopleSoft
  • Programming API available

14
Features-Oracle Database-
  • Multiple OS support Windows, Linux, UNIX
  • Data compression
  • Partitioning
  • XML support
  • Data encryption
  • Unlimited database size
  • Basic ETL tools

15
Features-Oracle Warehouse Access-
  • Data Sources
  • SQL, PL/SQL, XML, JDBC, Oracle OLAP, text files
  • Wizards
  • Output
  • File Formats PDF, Word, Excel, web page, XML
  • FTP, WebDAV
  • Printer
  • Email
  • Java API
  • Customizable reports

16
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

17
Marketing Data
  • 9.6 billion in revenue (11.3)
  • Fragmented market.
  • Larger vendors have expanded their product
    portfolios.
  • Specialty vendors must find niches.
  • Market is expected to continue growing at a
    steady rate.

18
Main Players in Data Warehousing
  • Oracle
  • Dominates the Data Warehouse Management (39.8).
  • Is acquiring companies (PeopleSoft, Siebel) to
    help gain market share in other areas.
  • Bundles basic ETL functionality with the RDBMS.
  • IBM
  • Most of its data warehousing tools revenue is
    derived from its RDBMS products.
  • Acquisition of Ascential instantly made it one of
    the top 3 players in the data warehouse
    generation market.
  • SAS Institute
  • Revenue is more evenly distributed across the
    three segments of the market
  • Data warehouse generation 19.1
  • Data warehouse management 5.9
  • Data warehouse access 12.7
  • Microsoft
  • 22.8 growth rate.
  • Increasingly a key player in data warehouse
    access through its embedded and standalone BI
    tools.
  • Data warehouse generation.

19
Market Share by Market Segment
20
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

21
  • Oracle Warehouse builder 10gR2
  • Oracle 10g database
  • Siebel Business Analytics Applications

22
Oracle Warehouse builder 10gR2
  • Written in Java
  • Supports Windows and various flavors of Unix
  • Extract data from databases, flat files, SAP.
  • Exchange metadata (Oracle Designer, CA ERWin)
  • Metadata change management

23
Oracle Warehouse builder 10gR2Architecture
  • Graphical Object Definition (source modules)
  • Source-to-Target Data Mappings
  • Automated Code Generation (optimized)

24
Demo
  • www.oracle.com/technology/products/warehouse/SHORT
    _intro_to_owb10gR2/SHORT_intro_to_owb10gR2_viewlet
    _swf.html

25
Siebel Business Analytics Applications
  • Pre-built
  • Siebel Sales Analytics
  • Siebel Service and Contact Center Analytics
  • Siebel Marketing Analytics
  • Siebel Supply Chain Analytics
  • Siebel Financial Analytics
  • Siebel Workforce Analytics
  • Ad-hoc
  • Siebel Answers

26
Pre-built Siebel Financial Analytics
27
Ad-hoc Siebel Answers
28
Ad-hoc Siebel Answers
29
Ad-hoc Siebel Answers
30
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

31
Wal-Mart
  • Largest retailer in the world with 6,640 retail
    stores
  • Associates 1.8 million- worldwide
  • Sales 315.65 billion
  • Net Income- 11.23 billion

32
Wal-Mart DWH
  • DWH HW 32 node NCR WorldMark 5100M
  • DB HW 365 AMP NCR 3600 server
  • DBMS NCR Teradata
  • Size 423 TB
  • Customer CRM and Retail Link
  • Process 65 million transactions per week
  • 18,000 users have access
  • Including 5,000 vendor partners

33
Benefits to Wal-Mart
  • Wal-Mart uses its data to track demand and set
    inventory targets for specific products.
  • Identifies where a new store should be built
    based on customer demand
  • Identifies how stores are performing across the
    nation
  • Reduced excess inventory in individual stores
  • Avoided wasted funds in building stores which
    would fail

34
Outline
  • Introduction
  • Product categories and features
  • Market overview
  • Technical details of Oracle DWH tools
  • Sample application
  • Future outlook

35
Evolution
  • Speed and Performance
  • Retrieve data quicker
  • Support more users
  • Support more data
  • Usage and Architecture
  • Live data insertion and retrieval
  • Platform independence

36
Hardware Advancement
  • Processor faster queries and calculations
  • Faster processors
  • Multi processors
  • RAM improves overall performance and speed
  • Faster RAM
  • Multi channel RAM
  • Hard Drive improves data retrieval speed
  • Faster Hard Drive (RPM)
  • Multi HD implementation

37
Parallel Servers
  • What is it?
  • Single database spread across a computer cluster
  • Storage area network (SAN)
  • Network-attached storage (NAS)
  • Partitioned so that data overlaps across each
    computer
  • Background processes lock, update, and merge data
  • Why use it?
  • Speed increased transaction response time
  • Scale increased transaction volume
  • Dependence data redundancy lowers downtime

38
Parallel Servers cont.
  • Why not use it?
  • Data needs to be partitioned correctly
  • Constant changes need constant merges
  • If data is accessed from many different nodes
  • Costs requires special hardware setup
  • Computers, Networks, Shared Disk Subsystem
  • When is it available?
  • Now Oracle Parallel Server (OPS)
  • First of its kind

39
Active Data Warehouses
  • What is it?
  • Combines the source data, data transformation,
    and data warehouse layers
  • Real-time reporting of up to the minute data
  • Report tools on live database
  • Why use it?
  • Able to make critical decisions based on most
    current data
  • No more yearly, quarterly, or weekly reports
  • Reports are live and fluid

40
Active Data Warehouses cont.
  • Why not use it?
  • Accuracy of data can not be determined
  • No middle layer to protect data (insertion and
    deletion)
  • Security is questionable
  • Accountability for data is fuzzy
  • When is it available?
  • Now Teradata Enterprise Data Warehouse
  • First of its kind

41
Pure Java Data Warehouses
  • What is it?
  • Developed by Sun Micro Systems (SMS) and partners
  • Java application in all four areas data access,
    data extraction, data management, and data mining
  • Acts as both solution and a standard
  • Why use it?
  • Cost effective no rewriting of code needed
  • Ease everything written in one language
  • Deploy across all systems and environments
  • Compatibility everything works without
    translation

42
Pure Java Data Warehouses cont.
  • Why not use it?
  • Limited choices can only use SMS partners
  • Performance same limitations as traditional
    java programs
  • Gain in compatibility, loss in optimization
  • When is it available?
  • Soon to Now SMS and partners are working on the
    different tools used in the new standard

43
Pure Java Data Warehouses cont.
  • D2K The Tapestry data warehouse solution
    automates the process of extracting, moving,
    transforming, and loading data from heterogeneous
    sources into a data warehouse or data mart.
  • Fiserv Featuring InformEnt software, enabling
    everyone within an organization access to
    information necessary to perform more informed
    and better business decisions.
  • Informix Dynamic Server with Universal Data and
    Relational OLAP options demonstrates an on-line
    analytical processing (OLAP) application
    accessing object relational database, focusing on
    portfolio management and analysis.
  • InterNetivity, Inc. dbProbe 3.0 combines
    award-winning data analysis capabilities with a
    sleek new look and powerful features such as a
    direct manipulation user interface, support for
    OLAP sources such as Microsoft OLE DB for OLAP
    and Informix Metacube, and a complete graphical
    cube builder.

44
Pure Java Data Warehouses cont.
  • Seagate Software Seagate Crystal Info is an
    enterprise-scale reporting and data analysis
    system that lets users schedule and analyze
    reports on their desktop or via their browser
    using Java technology.
  • SRA International, Inc. SRA's KDD Explorer
    toolset, enables users to discover significant
    patterns and indicators of fraudulent activity
    and analyze the results through an interface
    certified by Sun as 100 Pure Java.
  • SQRIBE Technologies Displaying ReportMart 1.0,
    an integrated framework for the creation,
    management, and secure distribution of all
    enterprise information, and PowerSQRIBE 1.0,
    which provides thin-client interactive analysis
    for massive enterprise deployment.
  • Telran Technologies Telran System is an
    adaptive data warehouse management system that
    delivers performance, manageability and
    scalability to data support, data warehouses, and
    mixed-use (OLTP/DSS) ERP applications.

45
Thank you!
Write a Comment
User Comments (0)
About PowerShow.com