An overview of Data Warehousing and OLAP Technology - PowerPoint PPT Presentation

1 / 17
About This Presentation
Title:

An overview of Data Warehousing and OLAP Technology

Description:

The data should be stored in single globally acceptable fashion. 6. Explanation (continues... Industry of product. Year of introduction. Front end tools ... – PowerPoint PPT presentation

Number of Views:196
Avg rating:3.0/5.0
Slides: 18
Provided by: mbd3
Category:

less

Transcript and Presenter's Notes

Title: An overview of Data Warehousing and OLAP Technology


1
An overview of Data Warehousing and OLAP
Technology
  • Presented By
  • Manish Desai

2
  • Introduction
  • What is data warehouse ?
  • Explanation of definition
  • Data warehouse Vs. Operational Database
  • Data warehouse architecture
  • Back end tools
  • Conceptual model
  • Database design
  • Warehouse servers
  • Index structures
  • Meta data
  • Conclusion
  • References

3
Introduction
  • Essential elements of decision support
  • Enables The Knowledge Worker to make better and
    faster decisions
  • Used in many industries like
  • Manufacturing (for order shipment)
  • Retail (for inventory management)
  • Financial Services (claims and risk analysis)
  • Every major database vendor offers product in
    this area

4
What is Data Warehouse ?
  • A data warehouse is a subject-oriented,
    integrated, time-varying, non-volatile collection
    of data that is used primarily in organizational
    decision making
  • Typically maintained separately from operational
    databases

5
Explanation of definition
  • Subject-Oriented
  • Designed around subject such as customer, vendor,
    product and activity
  • Does not includes data that are not needed for
    Decision support system (DSS)
  • Integrated
  • Most important feature
  • Consistent naming convention, measurement of
    variables and so forth
  • The data should be stored in single globally
    acceptable fashion

6
Explanation (continues)
  • Time Varying
  • All data in the warehouse should be accurate as
    of some moment in time
  • Data stored over a long time horizon (5 10
    years)
  • Key structure contains element of time
    (implicitly or explicitly)
  • Data once correctly recorded cant be updated
  • Non Volatile
  • No Update of data allowed
  • only loading and access of data operations

7
Data Warehouse Vs. Operational Database
8
Architecture
  • Data sourcing,migration,cleanup tools
  • Meta data repository
  • Data marts
  • Data query, reporting, analysis and mining tools
  • Data warehouse administration and management

9
Architecture (continues)
  • Distributed Data warehouse
  • Load balancing, scalability,higher availability
  • Meta data replicated and centrally administrated
  • Too expansive
  • Data marts
  • Departmental subset focused on selected subjects
  • example marketing department includes customer,
    sales and product tabels
  • Has own repository and administration
  • May lead to complex integration problems if not
    designed properly

10
Back end tools and Utilities
  • Data cleaning, loading, refreshing tools
  • Cleaning
  • Multiple source, possibility of errors
  • Example replace string sex by gender
  • Loading
  • Building indices, sorting and making access paths
  • Large amount of data
  • Incremental loading
  • Only updated tuples are inserted ,Process hard to
    manage
  • Refresh
  • Propagating updates
  • When to refresh ?
  • Set by administrator depending on user needs and
    traffic

11
Conceptual Model and front end tools
  • Multi dimensional view
  • Dimensions together uniquely determine the
    measure
  • Example Sales can be represented as
    city,product, data
  • Each dimension is described by set of attribute
  • Example product consist of
  • Category of product
  • Industry of product
  • Year of introduction
  • Front end tools
  • Multi dimensional spreadsheet
  • Supports Pivoting-reorientation
  • Roll_up - summarized data
  • Drill_down - go from high level to low level
    summary

12
Database design
  • Two ways to represent Multi dimensional model
  • Star schema
  • Database consist of single fact table and single
    table for each dimension
  • Each tuples in fact table consist of pointer to
    each of dimension
  • Snowflake schema
  • Refinement over star schema
  • Dimensional hierarchy is explicitly represented
    by normalizing dimension tables

13
Warehouse Servers
  • Specialized SQL servers
  • Provides advanced query language and query
    processing support for SQL queries over star and
    snowflake schemas
  • Example Redbrick
  • ROLAP
  • Between relational back end and client front end
    tools
  • Extend traditional relational servers to support
    multidimensional queries
  • Example Microstratergy
  • MOLAP
  • Multidimensional storage engine
  • Direct mapping
  • Example Essbase from Arbor Inc.

14
Index structures
  • Bit map indices
  • Use single bit to indicate specific value of
    attribute
  • Example
  • instead of storing eight characters to record
    engineer as skill of employee use single bit
  • id Name Skill
  • 1000 John 1
  • Join indices
  • Maintains the relationship between foreign key
    with its matching primary keys

15
Meta data and warehouse management
  • Its data about data
  • Used for building, maintain, managing and using
    data warehouse
  • Administrative meta data
  • Information about setting up and using warehouse
  • Business meta data
  • Business terms and definition
  • Operational meta data
  • Information collected during operation of
    warehouse

16
Conclusion
  • Data warehouse is the technology for the future.
  • data warehouse enables knowledge worker to make
    faster and better decisions

17
References
  • Inmon W. H.,Building the data warehouse
  • www.olapcouncil.org
  • www.pwp.starnetinc.com
  • www.arborsoft.com
  • Kimball, R. The data warehouse toolkit.
Write a Comment
User Comments (0)
About PowerShow.com