Title: Kumar Neti
1Group 4
- Kumar Neti
- Amresh Mohanlal
- Angela Daniels
- Susan Shanlever
MIS 6443 Database Concepts Data Warehousing
Dr. Richard Segall
December 8, 2003
2Exploration Warehousing
3What is Exploration Warehousing?
- An exploration warehouse is an exciting new
data warehouse construct.
- An exploration warehouse is a structure devoted
solely to data exploration and data mining.
- It is a DSS architectural structure whose purpose
is to provide a safe haven for exploratory and
very ad hoc processes, away from the primary
enterprise data warehouse.
4Why is it Needed?
- Typical business analysts are explorers and,
in most cases, are the brightest and most
motivated employees in a corporation.
- The business perspective of these corporate
explorers holds the promise of huge savings and
profit potential.
- Not providing these business explorers a place to
test their theories, a proving ground, has limited
the intellectual growth of many corporations.
- Exploration warehousing provides that proving
ground, where explorers can use their
entrepreneurial skills.
5Exploration Warehousing- Structure
- The needs of an analytical approach are very
unstructured, which calls for a separate
architectural structure.
- Criticality to the mission of the business
justifies the need for a permanent entity.
- The data inside the exploration warehouse is very
historical, granular, and integrated.
- The size of the exploration warehouse
accommodates the many analytical cases that will
be analyzed.
6Structure contd.
- The database structure commonly deployed in an
exploration warehouse is the normalized
structure.
- The normalized structure is optimal because the
exploration warehouse is servicing people who do
not know what they want.
- A financial explorer can build one exploration
warehouse and a marketing explorer can build
another exploration warehouse. As long as the
different exploration warehouses have the same
foundation, the enterprise data warehouse, there
is always a single point of reconciliation.
7Kinds of Queries
8How is it useful?
- The exploration warehouse provides the ability:
- To quickly load data and structure it in a
flexible form, because the analyst often has no
preconceived idea how the data inside the
exploration warehouse will be accessed.
- To access and analyze very large amounts of data,
because the analyst needs to work with detail and
history.
- To execute queries in a heuristic mode, with no
preconceived idea of what may be found at the
outset.
- To handle many iterations of analysis quickly, so
that the analyst can make subtle refinements to
queries that help shape understanding and allow
the pursuit of different trains of thought.
- To look for associations between types of data
and for patterns that are useful.
9Positioning of the exploration warehouse
10Exploration warehouse Stand Alone
- Also called the prototype mode, the exploration
warehouse can pull data directly from operational
sources, data repositories or external sources.
Its ability to structure the data "on-the-fly"
means that the exploration warehouse does not
necessarily have to physically store the data in
the form of the model itself.
- It often makes sense to store the data in the
exploration warehouse at a very low level of
granularity, then allow the exploration warehouse
to recreate the data in the form desired by the
end user.
11Stand Alone contd.
- When the exploration warehouse is used in
prototype mode, it serves as a "trial balloon"
for testing the initial design of the data
warehouse.
- The nature of the prototype warehouse in this
mode is one that allows an enterprise warehouse
to be constructed and reconstructed quickly when
the designer finds data that is not quite right
or relationships that just don't add up.
- In many cases, the same technology deployed in
the prototype warehouse will serve to house the
enterprise warehouse, provided appropriate
scalability exists. This can further shorten the
cycle of enterprise warehouse creation.
12Exploration and Data mining
- Data mining is the exploration and analysis, by
automatic means, of large volumes of data in
order to discover meaningful patterns and rules.
- Data mining algorithms are computationally
intensive and require multiple passes over huge
quantities of data. An exploration warehouse that
supports full-volume analysis without the need
for sampling or extensive data manipulation is
preferable.
- So, exploration warehousing acts as a bridge to
data mining.
13Need For A Separate Warehouse
- Performance in the enterprise data warehouse is
not affected when the explorer builds an
exploration warehouse and does the exploration
against it.
- Explorers might run an unlimited number of
processes against an unlimited amount of data in
an unpredictable manner. In that case, the
enterprise data warehouse does not serve as a
viable foundation for corporate exploration. It
is in these circumstances that the exploration
warehouse performs exceedingly well.
14Housing the Exploration Warehouse
- The exploration and the prototype warehouses can
be housed in standard DBMS technology, but a much
better alternative is for the exploration and the
prototype warehouses to be housed in token
database technology.
- Token database technology differs radically from
standard database technology. Because data is
greatly condensed in a token-based database,
entire databases can be placed in memory, which
enhances processing speed.
- The possibility of indexing all attributes exists
in a token-based database. Once all attributes
are indexed, heuristic analysis is unlimited.
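The slides do not say how a token database works internally. One plausible reading, offered here only as a sketch, is dictionary encoding: each distinct value is stored once and rows hold small integer tokens, with an index kept on every attribute. The class and function names below (`TokenColumn`, `build_table`) and the sample records are hypothetical.

```python
from collections import defaultdict


class TokenColumn:
    """One attribute stored as distinct values plus integer tokens (assumed design)."""

    def __init__(self):
        self.values = []                 # token -> distinct value, stored once
        self.token_of = {}               # distinct value -> token
        self.rows = []                   # row id -> token
        self.index = defaultdict(list)   # token -> row ids holding that value

    def append(self, value):
        token = self.token_of.get(value)
        if token is None:
            token = len(self.values)
            self.token_of[value] = token
            self.values.append(value)
        self.index[token].append(len(self.rows))
        self.rows.append(token)

    def rows_equal(self, value):
        """Return the row ids whose value equals `value`, using the index."""
        token = self.token_of.get(value)
        return [] if token is None else list(self.index[token])


def build_table(records, attributes):
    """Tokenize and index every attribute of every record."""
    table = {name: TokenColumn() for name in attributes}
    for record in records:
        for name in attributes:
            table[name].append(record[name])
    return table


# Hypothetical sample data.
records = [
    {"region": "East", "product": "Widget"},
    {"region": "West", "product": "Widget"},
    {"region": "East", "product": "Gadget"},
    {"region": "East", "product": "Widget"},
]
table = build_table(records, ["region", "product"])

# An ad hoc question answered purely from the per-attribute indexes.
east = set(table["region"].rows_equal("East"))
widget = set(table["product"].rows_equal("Widget"))
matching_rows = sorted(east & widget)
```

Because every attribute is indexed, any combination of attributes can be queried without planning the access path in advance, which is the property the slide ties to heuristic analysis.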
15Relationship with Metadata
- Metadata plays an important role in all parts of
the DSS environment, and the exploration and
prototype warehouse environments are no exception.
- Because explorers and designers are looking at
the exploration and prototype warehouses in many
ways, some of which have never been examined
before, metadata plays an especially important
role.
- There needs to be an effective metadata layer at
the enterprise data warehouse. That layer must be
transportable to the exploration and prototype
warehouse environment every time the exploration
or prototype warehouse is reconstructed.
16Some Essential Characteristics
- Must have the ability to store and manage data in
a manner that is optimal for the access and
analysis of detailed data. Storage must be able
to accommodate a large amount of detail at a
sensible price.
- Must allow the analyst to easily change the
content and the structure of the data in the
warehouse.
- Must have the ability to accommodate a wide
variety of analytical interfaces. The analyst
needs an elegant and robust set of analytical
capabilities.
17Essential Characteristics contd..
- Should provide a substantial reduction of the
technical "bits and grits" that go with database
and data warehouse optimization and maintenance.
The introduction of the exploration warehouse
must allow experts to focus on business
satisfaction.
- The explorer should be able to select and reshape
data into and out of the exploration warehouse at
will, changing the structure and the content of
the data as the requirements become more focused
during the exploration process.
18How is it useful to business?
- Exploration warehousing brings new meaning and
new capabilities to business empowerment. Free
from the IT constraints of first-generation OLTP
database systems, this new breed of data
warehousing gives business analysts the freedom
to create and analyze very large databases in
real time without imposing the burdens of complex
mathematical or statistical theory.
- It frees business analysts from the limitations
imposed by transaction and operational
processing.
- It opens information processing to an entirely
new audience of information consumers who have
not, to date, been able to participate actively
in the decision-making process.
19Kumar's Question
- What are some of the characteristics that are
essential in an Exploration Warehouse?
20Thank you
- For your attention
21THE GLOBAL DATA WAREHOUSE
- Presented By
- Amresh Mohanlal
22The Global Data Warehouse
- The global Data Warehouse is one that is
geographically distributed, usually over multiple
countries and multiple time zones
- The global Data Warehouse will also bring about
profound changes in information logistics as the
Internet dramatically reduces global boundaries
- The global Data Warehouse is a warehouse in all
respects and has a centralized Data Warehouse
- A global Data Warehouse integrates data from
multiple distributed heterogeneous databases and
other information sources
23Uses of Global Data Warehouse
- Global risk management
- Global consolidated financial reporting
- Global customer aggregation
24Global Data Warehouse
- The global Data Warehouse will be housed at a
site designated as the central site and will be
fed from distant sites designated as outlying
sites
25Technological Heterogeneity
- One of the basic assumptions made about a global
Data Warehouse and the outlying sites that
contribute data to it is that the environment is
technologically heterogeneous
- The technological heterogeneity across the
central site and the outlying sites extends to
the hardware platform and the software, i.e., the
DBMS and the operating systems
26Transfer Of Data
- Data is transported from the outlying site to the
headquarters site on a regular basis in order to
refresh the Global Data Warehouse with the
relevant activities
27Issues To Be Resolved While The Data Is Transferred
- The speed of transfer
- The volume of data that is transferred
- The reliability of transfer
- The protocol of transfer
- The timing of transfer
- The cost of transfer
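Speed, volume, and timing interact: a refresh is only workable if the volume can cross the link inside the time window available. The back-of-the-envelope sketch below makes that check explicit. The function names, the `efficiency` factor (the usable fraction of nominal line speed), and all figures are hypothetical assumptions, not values from the presentation.

```python
def transfer_hours(volume_gb, link_mbps, efficiency=0.7):
    """Estimate hours needed to move volume_gb over a link of link_mbps.

    efficiency is the assumed usable fraction of the nominal line speed.
    Uses decimal units: 1 GB = 8,000 megabits.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = volume_gb * 8 * 1000
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_window(volume_gb, link_mbps, window_hours, efficiency=0.7):
    """True if the estimated transfer completes within the refresh window."""
    return transfer_hours(volume_gb, link_mbps, efficiency) <= window_hours


# Hypothetical outlying site: 45 GB per refresh over a 10 Mbps line at 50% efficiency.
hours_needed = transfer_hours(45, 10, efficiency=0.5)
nightly_ok = fits_window(45, 10, window_hours=8, efficiency=0.5)
```

If the estimate exceeds the window, the options are the ones the list implies: send less data, send it more often, or pay for a faster or more reliable link.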
28Levels of Granularity
- The data that flows through the global Data
Warehouse environment has different levels of
granularity
- The least granular (most summarized) data is
found at the global Data Warehouse
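As an illustration of the levels of granularity, the sketch below rolls hypothetical detail rows held at outlying sites up to a per-site monthly summary, and then to the single monthly figure a global warehouse might keep. The site names, amounts, and function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical detail rows held at outlying sites: (site, date, amount).
site_detail = [
    ("Paris", "2003-11-03", 120.0),
    ("Paris", "2003-11-17", 80.0),
    ("Tokyo", "2003-11-05", 200.0),
    ("Tokyo", "2003-12-01", 50.0),
]


def roll_up_by_site_month(rows):
    """Summarize transaction detail to one total per (site, month)."""
    totals = defaultdict(float)
    for site, date, amount in rows:
        totals[(site, date[:7])] += amount   # date[:7] is the "YYYY-MM" prefix
    return dict(totals)


def roll_up_global_month(site_month_totals):
    """Summarize per-site monthly totals to one total per month."""
    totals = defaultdict(float)
    for (_site, month), amount in site_month_totals.items():
        totals[month] += amount
    return dict(totals)


site_month = roll_up_by_site_month(site_detail)
global_month = roll_up_global_month(site_month)
```

Each roll-up discards detail: the global figure for a month no longer says which site or transaction contributed to it.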
29Drill Down Processing
- One of the essential techniques for DSS
processing
- In drill down processing the analyst begins at
the highest level of summarization and works to
successively lower levels of detail, until the
analyst discovers what data is of interest
- The steps that are followed in a drill down
process are
- Finding the path of summary
- Discovering the algorithm used for calculation
- Identifying the data that has been included in
the calculation
- Determining whether a yet lower level of
drill down needs to be done
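The four steps can be sketched as a lookup against a record of how each summary figure was produced. The `lineage` and `values` structures, the key naming scheme, and the figures below are hypothetical; the presentation does not prescribe a format for this information.

```python
# Hypothetical record of how each summary figure was computed and from what.
lineage = {
    "global.revenue.2003-11": {
        "algorithm": "sum",
        "sources": ["paris.revenue.2003-11", "tokyo.revenue.2003-11"],
    },
    "paris.revenue.2003-11": {
        "algorithm": "sum",
        "sources": ["paris.txn.1001", "paris.txn.1002"],
    },
    "tokyo.revenue.2003-11": {
        "algorithm": "sum",
        "sources": ["tokyo.txn.2001"],
    },
}

# Hypothetical detail values held at the outlying sites.
values = {
    "paris.txn.1001": 120.0,
    "paris.txn.1002": 80.0,
    "tokyo.txn.2001": 200.0,
}


def value_of(key):
    """Recompute a figure from its sources; detail rows are looked up directly."""
    entry = lineage.get(key)
    if entry is None:
        return values[key]
    if entry["algorithm"] != "sum":
        raise ValueError("only 'sum' is implemented in this sketch")
    return sum(value_of(source) for source in entry["sources"])


def drill_down(key):
    """One drill-down step for the summary figure `key`.

    Returns the algorithm used, the data included in the calculation,
    and the inputs that can themselves be drilled into further.
    """
    entry = lineage[key]                                   # path of summary
    algorithm = entry["algorithm"]                         # algorithm used
    inputs = {s: value_of(s) for s in entry["sources"]}    # data included
    lower = [s for s in entry["sources"] if s in lineage]  # lower levels left?
    return algorithm, inputs, lower


algorithm, inputs, lower = drill_down("global.revenue.2003-11")
```

The analyst repeats `drill_down` on any key in `lower` until the returned list is empty, which means the inputs are already at the detail level.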
30Drill Down Processing Cont..
- Drill down starts at the global Data Warehouse
and goes to the outlying sites
31Documentation Of Processing
- Central to the process of drill down is the
documentation of the processing that occurs as
the data moves from the outlying site to the
global Data Warehouse
- This documentation is captured in the metadata
that describes the global Data Warehouse
- Metadata is the glue that holds the global data
environment together
- Distributed metadata is required across the globe
32Supporting More Than One Global Data Warehouse
- The outlying sites can support more than one
global Data Warehouse
33Diversity Of Outlying Sites
- The global Data Warehouse will draw its source
data from sites that are very diverse
34Local Warehouses
- Each of the outlying sites can have its own local
Data Warehouse
- The local Data Warehouse has only a coincidental
relationship with the global Data Warehouse
- The local Data Warehouse may or may not serve as
part or all of the system of record of the global
Data Warehouse
- The local Data Warehouses at the outlying sites
can feed the global Data Warehouse
35Amresh's Question
- What are the issues that must be resolved while
the data is transferred from the outlying sites
to the global data warehouse?
36Managing The Data Warehouse
37Why Manage the Data Warehouse?
- The faster the data warehouse grows, the more
data becomes dormant
- To keep the cost of the data warehouse at an
acceptable level
- To be able to take a proactive approach instead
of a reactive one
38Why is there so much data in the data warehouse?
- The data warehouse
- Contains a robust amount of history
- Contains summary data as well as detailed data
- Detailed data is the most atomic data the
corporation has
39Data Warehouse Monitoring Makes Management Easier
- What data in the data warehouse is being used
- Who is doing the activity
- What kinds of activity are being submitted
- When the activity is occurring
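The four monitoring questions map naturally onto counts over a query log. The sketch below assumes a hypothetical log of `(user, table, activity, hour)` entries and a known set of warehouse tables; it also flags tables that no query touched as candidates for dormant data. None of the names or figures come from the presentation.

```python
from collections import Counter

# Hypothetical query log entries: (user, table, activity, hour_of_day).
query_log = [
    ("asmith", "sales_detail", "ad hoc query", 9),
    ("asmith", "sales_summary", "report", 9),
    ("bjones", "sales_summary", "report", 14),
    ("bjones", "sales_summary", "report", 15),
]

# Hypothetical list of every table held in the warehouse.
warehouse_tables = {"sales_detail", "sales_summary", "sales_1998_archive"}


def summarize_usage(log, tables):
    """Count activity by data, user, kind, and time, and list untouched tables."""
    what = Counter(table for _user, table, _kind, _hour in log)
    who = Counter(user for user, _table, _kind, _hour in log)
    kind = Counter(kind for _user, _table, kind, _hour in log)
    when = Counter(hour for _user, _table, _kind, hour in log)
    dormant = sorted(tables - set(what))   # tables with no recorded access
    return {"what": what, "who": who, "kind": kind, "when": when, "dormant": dormant}


usage = summarize_usage(query_log, warehouse_tables)
```

A table appearing in `usage["dormant"]` over a long observation period is a candidate for cheaper storage, which ties monitoring back to keeping the cost of the warehouse acceptable.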
40Tuning the data warehouse
- Once the data warehouse administrator understands
what data is being used, he or she can take
measures to make the data warehouse perform
better.
- These measures consist of
- Creating extra indexes
- Summarizing data
- Separating data so that it can be accessed
easily and efficiently
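The three measures can be tried out on a small in-memory SQLite database using Python's standard `sqlite3` module. The table names, columns, and rows are hypothetical, and reading "separating data" as moving older rows into their own table is an assumption about what the slide intends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_detail ("
    "sale_id INTEGER PRIMARY KEY, region TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_detail (region, sale_date, amount) VALUES (?, ?, ?)",
    [
        ("East", "2002-12-30", 10.0),
        ("East", "2003-11-03", 120.0),
        ("East", "2003-11-17", 80.0),
        ("West", "2003-11-05", 200.0),
    ],
)

# Measure 1: an extra index on a column that monitoring shows is filtered often.
conn.execute("CREATE INDEX idx_sales_region ON sales_detail (region)")

# Measure 2: a summary table so frequent totals need not rescan the detail.
conn.execute(
    "CREATE TABLE sales_by_region_month AS "
    "SELECT region, substr(sale_date, 1, 7) AS month, SUM(amount) AS total "
    "FROM sales_detail GROUP BY region, substr(sale_date, 1, 7)"
)

# Measure 3 (assumed interpretation): move older rows into a separate table.
conn.execute(
    "CREATE TABLE sales_detail_archive AS "
    "SELECT * FROM sales_detail WHERE sale_date < '2003-01-01'"
)
conn.execute("DELETE FROM sales_detail WHERE sale_date < '2003-01-01'")
conn.commit()

summary = conn.execute(
    "SELECT region, month, total FROM sales_by_region_month ORDER BY region, month"
).fetchall()
active_rows = conn.execute("SELECT COUNT(*) FROM sales_detail").fetchone()[0]
archived_rows = conn.execute("SELECT COUNT(*) FROM sales_detail_archive").fetchone()[0]
```

Which columns to index, which totals to precompute, and where to draw the archive cutoff are exactly the decisions that usage monitoring is meant to inform.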
41How does the data warehouse administrator use monitoring to tune the data warehouse?
42Implementing a Data Warehouse
Presentation by Susan Shanlever
December 8, 2003
MIS 6443, Dr. Richard Segall
43Overview
- Summary of Important Topics in Term Paper
- Relation to Class Subjects
- Reason for this Paper
- Question
44Important Topics
- Critical Success Factors for Implementing a Data
Warehouse
- Operational
- Technical
- Compare SDLC with Proposed Three-Phase
Implementation (Proposal by Mukherjee and D'Souza)
45Systems Development Life Cycle
Project Identification and Selection
Project Initiation and Planning
Analysis
Logical Design
Physical Design
Implementation
Maintenance
Source: Hoffer, Prescott, McFadden. Modern
Database Management. Prentice Hall.
46Phased Logic
Proposed Phased Logic of DW Implementations
Pre-Implementation Phase
Implementation Phase
Post-Implementation Phase
Source: Mukherjee, Debasish, and D'Souza, Derrick.
Think Phased Implementation for Successful Data
Warehousing. Information Systems Management.
(Spring 2003) 82-90.
47Comparison
Applies to
Project Identification and Selection
Pre-Implementation Phase
Project Initiation and Planning
Analysis
Implementation Phase
Logical Design
Physical Design
Post-Implementation Phase
Implementation
Maintenance
48Relation to Class Topics
- System Development Life Cycle (Rob and Coronel)
- Data as an Asset within a Company (Rob and
Coronel)
- The Development Life Cycle (Inmon Chapter 1)
49Reason for Topic
- This class taught us how to set up a data
warehouse. As managers, we also need an
understanding of the complete picture.
50Susan's Question
- What are the critical success factors for
implementation of a data warehouse? Give
examples of each.