Kumar Neti - PowerPoint PPT Presentation

Title: Kumar Neti


1
Group 4
  • Kumar Neti
  • Amresh Mohanlal
  • Angela Daniels
  • Susan Shanlever

MIS 6443 Database Concepts Data Warehousing
Dr. Richard Segall
December 8, 2003
2
Exploration Warehousing
  • Presented by
  • Kumar Neti

3
What is Exploration Warehousing?
  • The "Exploration Warehouse" is an exciting new
    data warehouse construct.
  • An exploration warehouse is a structure devoted
    solely to data exploration and data mining.
  • It is a DSS architectural structure whose purpose
    is to provide a safe haven for exploratory and
    very ad hoc processes away from the primary
    enterprise data warehouse.

4
Why is it Needed?
  • The typical business analysts are explorers and,
    in most cases, are the brightest and most
    motivated employees in a corporation.
  • The business perspective of these corporate
    explorers holds the promise of huge savings and
    profit potential.
  • Not providing these business explorers a place to
    test their theories, a proving ground, has limited
    the intellectual growth of many corporations.
  • Exploration warehousing gives those explorers
    that proving ground, where they can use their
    entrepreneurial skills.

5
Exploration Warehousing- Structure
  • The needs of an analytical approach are very
    unstructured, which calls for a separate
    architectural structure.
  • Its criticality to the mission of the business
    justifies the need for a permanent entity.
  • The data inside the exploration warehouse is very
    historical, granular, and integrated.
  • The size of the exploration warehouse
    accommodates the many analytical cases that will
    be analyzed.

6
Structure (contd.)
  • The database structure commonly deployed in an
    exploration warehouse is the normalized
    structure.
  • The normalized structure is optimal because the
    exploration warehouse is servicing people who do
    not know what they want.
  • A financial explorer can build one exploration
    warehouse and a marketing explorer can build
    another exploration warehouse. As long as the
    different exploration warehouses have the same
    foundation, the enterprise data warehouse, there
    is always a single point of reconciliation.

7
Kinds of Queries
8
How is it useful?
  • The exploration warehouse provides the ability:
  • To quickly load data and structure it in a
    flexible form, because the analyst often has no
    preconceived idea of how the data inside the
    exploration warehouse will be accessed.
  • To access and analyze very large amounts of data,
    because the analyst needs to work with detail and
    history.
  • To execute queries in a heuristic mode, with no
    preconceived idea at the outset of what may be
    found.
  • To handle many iterations of analysis quickly, so
    that the analyst can make subtle refinements to
    queries, pursue different trains of thought, and
    look for useful associations and patterns among
    types of data.

9
Positioning of the exploration warehouse
10
Exploration warehouse Stand Alone
  • Also called the prototype mode, the exploration
    warehouse can pull data directly from operational
    sources, data repositories or external sources.
    Its ability to structure the data "on-the-fly"
    means that the exploration warehouse does not
    necessarily have to physically store the data in
    the form of the model itself.
  • It often makes sense to store the data in the
    exploration warehouse at a very low level of
    granularity, then allow the exploration warehouse
    to recreate the data in the form desired by the
    end user.
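
The idea of storing detail at low granularity and recreating it in the form the end user wants can be sketched in a few lines of Python. This is only an illustration: the `SALES` rows, the `reshape` helper, and the field names are hypothetical and do not come from the presentation.

```python
from collections import defaultdict

# Hypothetical granular records: one row per individual sale.
SALES = [
    {"date": "2003-11-03", "region": "East", "product": "A", "amount": 120.0},
    {"date": "2003-11-03", "region": "West", "product": "A", "amount": 80.0},
    {"date": "2003-11-17", "region": "East", "product": "B", "amount": 50.0},
    {"date": "2003-12-01", "region": "East", "product": "A", "amount": 200.0},
]


def reshape(rows, key_fn, measure="amount"):
    """Summarize granular rows by whatever key the analyst supplies."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_fn(row)] += row[measure]
    return dict(totals)


# Two different "shapes" built on the fly from the same stored detail.
by_region = reshape(SALES, lambda r: r["region"])
by_month_product = reshape(SALES, lambda r: (r["date"][:7], r["product"]))
```

Because only the detail rows are stored, a new grouping needs a new key function rather than a reload of the data.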

11
Stand Alone (contd.)
  • When the exploration warehouse is used in a
    prototype mode, it serves as a "trial balloon"
    for the testing of the initial design of the data
    warehouse.
  • The nature of the prototype warehouse in this
    mode is one that allows an enterprise warehouse
    to be constructed and reconstructed quickly when
    the designer finds data that is not quite right
    or relationships that just don't add up.
  • In many cases, the same technology deployed in
    the prototype warehouse will serve to house the
    enterprise warehouse provided appropriate
    scalability exists. This can further shorten the
    cycle of enterprise warehouse creation.

12
Exploration and Data mining
  • Data mining is the exploration and analysis, by
    automatic means, of large volumes of data in
    order to discover meaningful patterns and rules.
  • Data mining algorithms are computationally
    intensive, and require multiple passes over huge
    quantities of data. An exploration warehouse that
    supports full volume analysis without the need
    for sampling or extensive data manipulation is
    preferable.
  • So, exploration warehousing acts as a bridge to
    Data mining.

13
Need For A Separate Warehouse
  • Performance in the enterprise data warehouse is
    not affected when the explorer builds an
    exploration warehouse and does the exploration
    against it.
  • Explorers might run an unlimited number of
    processes against an unlimited amount of data in
    an unpredictable manner. In that case, the
    enterprise data warehouse does not serve as a
    viable foundation for corporate exploration. It
    is in these circumstances that the exploration
    warehouse performs exceedingly well.

14
Housing the Exploration Warehouse
  • The exploration and the prototype warehouses can
    be housed in standard DBMS technology, but a much
    better alternative is for the exploration and the
    prototype warehouses to be housed in token
    database technology.
  • Token database technology differs radically from
    standard database technology. Because data is
    greatly condensed in a token-based database,
    entire databases can be placed in memory, which
    enhances processing speed.
  • The possibility of indexing all attributes exists
    in a token-based database. Once all attributes
    are indexed, heuristic analysis is unlimited.
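
The slides do not describe how a token database works internally. The sketch below assumes it resembles dictionary encoding: each distinct value is stored once and rows hold small integer tokens, with an index kept on every column. All names (`tokenize_column`, `build_index`, `rows_where`, the sample rows) are hypothetical.

```python
from collections import defaultdict


def tokenize_column(values):
    """Replace each value with a small integer token; keep each distinct value once."""
    dictionary = {}  # value -> token
    tokens = []
    for value in values:
        # A new value receives the next unused token number.
        token = dictionary.setdefault(value, len(dictionary))
        tokens.append(token)
    return dictionary, tokens


def build_index(tokens):
    """Inverted index: token -> list of row numbers holding that token."""
    index = defaultdict(list)
    for row_number, token in enumerate(tokens):
        index[token].append(row_number)
    return dict(index)


# Hypothetical sample data.
ROWS = [
    ("East", "Gold", "Active"),
    ("West", "Gold", "Closed"),
    ("East", "Silver", "Active"),
    ("East", "Gold", "Active"),
]
COLUMNS = ("region", "tier", "status")

# Encode and index every column, not just a chosen few.
encoded = {}
for position, name in enumerate(COLUMNS):
    dictionary, tokens = tokenize_column([row[position] for row in ROWS])
    encoded[name] = {
        "dictionary": dictionary,
        "tokens": tokens,
        "index": build_index(tokens),
    }


def rows_where(column, value):
    """Return row numbers where the column equals the value, via the index."""
    col = encoded[column]
    token = col["dictionary"].get(value)
    if token is None:
        return []
    return col["index"].get(token, [])


# Any attribute can serve as a search key because every column is indexed.
east_gold = sorted(set(rows_where("region", "East")) & set(rows_where("tier", "Gold")))
```

Under this assumption, repeated values cost one small integer each, which is what makes holding the whole data set in memory plausible.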

15
Relationship with Metadata
  • Metadata plays an important role in all parts of
    the DSS environment, and the exploration and
    prototype warehouse environments are no exception.
  • Because explorers and designers are looking at
    the exploration and prototype warehouses in many
    ways, some of which have never been examined
    before, metadata plays an especially important
    role.
  • There needs to be an effective metadata layer at
    the enterprise data warehouse. That layer needs
    to be able to be transported to the exploration
    and prototype warehouse environment every time
    there is a reconstruction of the exploration or
    prototype warehouse.

16
Some Essential Characteristics
  • Must have the ability to store and manage data in
    a manner that is optimal for the access and
    analysis of details of data. The cost of storage
    must be able to accommodate many details of data
    and do so at a sensible price.
  • It must allow the analyst to be able to easily
    change the content and the structure of the data
    in the warehouse
  • Must have the ability to accommodate a wide
    variety of analytical interfaces. The analyst
    needs an elegant and robust set of analytical
    capabilities.

17
Essential Characteristics (contd.)
  • Should provide a substantial reduction of the
    technical "bits and grits" that go with database
    and data warehouse optimization and maintenance.
    The introduction of the exploration warehouse
    must allow experts to focus on business
    satisfaction.
  • The explorer should be able to select and reshape
    data into and out of the exploration warehouse at
    will, changing the structure and the content of
    the data as the requirements become more focused
    during the exploration process

18
How is it useful to business?
  • Exploration warehousing brings new meaning and
    new capabilities to business empowerment. Free
    from the IT constraints of first generation OLTP
    database systems, this new breed of data
    warehousing provides the freedom for business
    analysts to create and analyze very large
    databases in real time without imposing the
    burdens of complex mathematical or statistical
    theory.
  • It frees the business analysts from the
    limitations imposed by transaction and
    operational processing.
  • It gives information processing to an entirely
    new audience of information consumers who have,
    to date, not been able to actively participate in
    the decision-making process.

19
Kumar's Question
  • What are some of the characteristics that are
    essential in an Exploration Warehouse?

20
  • Thank you for your attention

21
THE GLOBAL DATA WAREHOUSE
  • Presented By
  • Amresh Mohanlal

22
The Global Data Warehouse
  • The global Data Warehouse is one that is
    geographically distributed, usually over multiple
    countries and multiple time zones
  • Global Data Warehouse will also bring about
    profound changes in Information Logistics as the
    Internet dramatically reduces global boundaries
  • Global Data Warehouse is a warehouse in all
    respects and has a Centralized Data Warehouse
  • A global Data warehouse integrates data from
    multiple distributed heterogeneous databases and
    other information sources

23
Uses of Global Data Warehouse
  • Global risk management
  • Global consolidated financial reporting
  • Global customer aggregation

24
Global Data Warehouse
  • The global Data Warehouse will be housed at a
    site designated as the central site and will be
    fed from distant sites designated as outlying
    sites

25
Technological Heterogeneity
  • One of the basic assumptions made about a Global
    Data Warehouse and its outlying sites that
    contribute data to the warehouse is that the
    environment is technologically heterogeneous
  • The technological heterogeneity across the
    central site and the outlying sites extends to
    the hardware platform and the software i.e. the
    DBMS and the Operating Systems

26
Transfer Of Data
  • Data is transported from the outlying site to the
    headquarters site on a regular basis in order to
    refresh the Global Data Warehouse with the
    relevant activities

27
Issues To Be Resolved While The Data Is Transferred
  • The speed of transfer
  • The volume of data that is transferred
  • The reliability of transfer
  • The protocol of transfer
  • The timing of transfer
  • The cost of transfer
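
Speed, volume, and timing are linked by simple arithmetic, which a short Python sketch can make concrete. The volume, link speed, window length, and the `efficiency` factor below are illustrative assumptions, not values from the presentation.

```python
def transfer_hours(volume_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to move volume_gb over a link of link_mbps.

    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    megabits = volume_gb * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_window(volume_gb, link_mbps, window_hours, efficiency=0.8):
    """Check whether a refresh fits inside the available transfer window."""
    return transfer_hours(volume_gb, link_mbps, efficiency) <= window_hours


# Hypothetical nightly refresh from one outlying site.
nightly = transfer_hours(volume_gb=50, link_mbps=10)
```

A check such as `fits_window(50, 10, 8)` shows whether the assumed refresh fits an eight-hour window, or whether the volume, the link, or the timing has to change.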

28
Levels of Granularity
  • The data that flows through the Global Data
    Warehouse environment has different levels of
    granularity
  • The least granular data is found at the global
    Data Warehouse

29
Drill Down Processing
  • Drill-down processing is one of the essential
    techniques for DSS processing
  • In drill-down processing, the analyst begins at
    the highest level of summarization and works to
    successively lower levels of detail until the
    analyst discovers the data of interest
  • The steps followed in a drill-down process are:
  • Find the path of summarization
  • Discover the algorithm used for the calculation
  • Identify the data that was included in the
    calculation
  • Determine whether a yet lower level of drill-down
    needs to be done

30
Drill Down Processing (contd.)
  • Drill down starts at the global Data Warehouse
    and goes to the outlying sites
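
Drill-down from the global data warehouse to an outlying site can be sketched as follows. The site names, the `exposure` figures, and the helper functions are hypothetical and serve only to show the movement from summary to detail.

```python
# Hypothetical detail held at two outlying sites.
OUTLYING_SITES = {
    "Tokyo": [
        {"account": "T-1", "exposure": 400},
        {"account": "T-2", "exposure": 100},
    ],
    "London": [
        {"account": "L-1", "exposure": 250},
    ],
}


def site_summary(site):
    """Summary figure that an outlying site contributes to the central site."""
    return sum(row["exposure"] for row in OUTLYING_SITES[site])


def global_summary():
    """Least granular figure, as held at the global data warehouse."""
    return sum(site_summary(site) for site in OUTLYING_SITES)


def drill_down(site=None):
    """Move one level closer to the detail on each call."""
    if site is None:
        # Global level -> one summary per outlying site.
        return {name: site_summary(name) for name in OUTLYING_SITES}
    # Site level -> the detail rows held at that site.
    return OUTLYING_SITES[site]


total = global_summary()
per_site = drill_down()
largest = max(per_site, key=per_site.get)
detail = drill_down(largest)
```

The analyst starts from `total`, looks at `per_site` to see which site drives it, and only then fetches the detail rows from that site.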

31
Documentation Of Processing
  • Central to the process of drill down is the
    documentation of the processing that occurs as
    the data moves from the outlying site to the
    global Data Warehouse
  • This documentation is captured in the metadata
    that describes the global Data Warehouse
  • Metadata is the glue that holds the global data
    environment together
  • Distributed metadata is required across the globe
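
One possible shape for metadata that documents this processing is a lineage record per summary element. The sketch below is an assumption about what such a record might hold; `LineageRecord`, `CATALOG`, `trace`, and every entry are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageRecord:
    target: str        # summary element in the global data warehouse
    source_site: str   # outlying site that supplied the data
    source_table: str  # table or file at that site
    algorithm: str     # how the summary value was calculated


# Hypothetical metadata catalog describing the global data warehouse.
CATALOG = [
    LineageRecord("global.total_exposure", "Tokyo", "positions", "SUM(exposure)"),
    LineageRecord("global.total_exposure", "London", "positions", "SUM(exposure)"),
    LineageRecord("global.customer_count", "London", "customers",
                  "COUNT(DISTINCT customer_id)"),
]


def trace(target):
    """Return the documented sources and algorithms behind a summary element."""
    return [record for record in CATALOG if record.target == target]


sources = trace("global.total_exposure")
```

With records like these, an analyst can find the path of summarization and the calculation algorithm before contacting an outlying site.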

32
Supporting More Than One Global Data Warehouse
  • The outlying sites can support more than one
    global Data Warehouse

33
Diversity Of Outlying Sites
  • Global Data Warehouse will draw its source data
    from data whose sites are very diverse

34
Local Warehouses
  • Each of the outlying sites can have its own local
    Data Warehouse
  • The local Data Warehouse has only a coincidental
    relationship with the global Data Warehouse
  • The local Data Warehouse may or may not serve as
    part or all of the system of record of the global
    Data Warehouse
  • The local Data Warehouses at the outlying site
    can feed the global Data Warehouse

35
Amresh's Question
  • What are the Issues that are to be resolved while
    the data is Transferred from the outlying sites
    to the global data warehouse?

36
Managing The Data Warehouse
  • Angela Daniels

37
Why Manage the Data Warehouse?
  • The faster the data warehouse grows, the more
    data becomes dormant
  • To keep the cost of the data warehouse at an
    acceptable level.
  • To be able to take a proactive approach instead
    of a reactive one

38
Why is there so much data in the data warehouse?
  • The data warehouse:
  • Contains a robust amount of history
  • Contains summary data as well as detailed data
  • Holds detailed data that is the most atomic data
    the corporation has

39
Data Warehouse Monitoring Makes Management Easier
  • What data in the data warehouse is being used
  • Who is doing the activity
  • What kinds of activities are being submitted
  • When the activity is occurring
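
These four monitoring questions can each be answered by counting over an activity log. The log layout, table names, and users in this Python sketch are hypothetical; a real monitor would read the DBMS's own activity records.

```python
from collections import Counter

# Hypothetical activity log: (user, table, kind of activity, hour of day).
ACTIVITY_LOG = [
    ("ana", "sales_detail", "query", 9),
    ("ana", "sales_detail", "query", 10),
    ("raj", "sales_summary", "report", 10),
    ("ana", "sales_summary", "query", 14),
]

# Hypothetical list of every table in the warehouse.
ALL_TABLES = {"sales_detail", "sales_summary", "sales_1998_archive"}

usage_by_table = Counter(table for _, table, _, _ in ACTIVITY_LOG)  # what data
usage_by_user = Counter(user for user, _, _, _ in ACTIVITY_LOG)     # who
usage_by_kind = Counter(kind for _, _, kind, _ in ACTIVITY_LOG)     # what kind
usage_by_hour = Counter(hour for _, _, _, hour in ACTIVITY_LOG)     # when

# Tables that never appear in the log are candidates for dormant data.
dormant_tables = sorted(ALL_TABLES - set(usage_by_table))
```

Comparing the tables that appear in the log with the full table list is one simple way to spot dormant data.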

40
Tuning the data warehouse
  • Once the data warehouse administrator understands
    what data is being used, he or she can take
    measures to make the data warehouse perform
    better.
  • These measures consist of:
  • Creating extra indexes
  • Summarizing data
  • Separating data so that it can be accessed easily
    and efficiently
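
The three tuning measures can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch on an in-memory database; the table names, column names, and rows are hypothetical, and a production warehouse DBMS would offer its own equivalents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [
        ("2003-11-03", "East", 120.0),
        ("2003-11-03", "West", 80.0),
        ("2003-12-01", "East", 200.0),
    ],
)

# Measure 1: an extra index on a column that monitoring shows is often filtered.
conn.execute("CREATE INDEX idx_sales_region ON sales_detail (region)")

# Measure 2: a summary table so frequent totals need not rescan the detail.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales_detail GROUP BY region"
)

# Measure 3: separate a frequently used slice of the data into its own table.
conn.execute(
    "CREATE TABLE sales_2003_12 AS "
    "SELECT * FROM sales_detail WHERE sale_date >= '2003-12-01'"
)

summary = dict(conn.execute("SELECT region, total FROM sales_by_region"))
index_names = [row[1] for row in conn.execute("PRAGMA index_list('sales_detail')")]
```

Which columns to index and which slices to separate would follow from what monitoring shows is actually being used.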

41
How does the data warehouse administrator use
monitoring to tune the data warehouse?
42
Implementing a Data Warehouse
Presentation by Susan Shanlever
December 8, 2003
MIS 6443, Dr. Richard Segall
43
Overview
  • Summary of Important Topics in Term Paper
  • Relation to Class Subjects
  • Reason for this Paper
  • Question

44
Important Topics
  • Critical Success Factors for Implementing a Data
    Warehouse
  • Operational
  • Technical
  • Compare SDLC with Proposed Three-Phase
    Implementation (Proposal by Mukherjee and D'Souza)

45
Systems Development Life Cycle
Project Identification and Selection
Project Initiation and Planning
Analysis
Logical Design
Physical Design
Implementation
Maintenance
Source: Hoffer, Prescott, McFadden. Modern
Database Management. Prentice Hall.
46
Phased Logic
Proposed Phased Logic of DW Implementations
Pre-Implementation Phase
Implementation Phase
Post-Implementation Phase
Source: Mukherjee, Debasish, and D'Souza, Derrick.
"Think Phased Implementation for Successful Data
Warehousing." Information Systems Management
(Spring 2003): 82-90.
47
Comparison
Applies to
Project Identification and Selection
Pre-Implementation Phase
Project Initiation and Planning
Analysis
Implementation Phase
Logical Design
Physical Design
Post-Implementation Phase
Implementation
Maintenance
48
Relation to Class Topics
  • System Development Life Cycle (Rob and Coronel)
  • Data as an Asset within a Company (Rob and
    Coronel)
  • The Development Life Cycle (Inmon Chapter 1)

49
Reason for Topic
  • This class taught us how to set up a data
    warehouse. As managers, we also need an
    understanding of the complete picture.

50
Susan's Question
  • What are the critical success factors for
    implementation of a data warehouse? Give
    examples of each.