Creating the Warehouse Environment - PowerPoint PPT Presentation

1 / 56
About This Presentation
Title:

Creating the Warehouse Environment

Description:

Platform Considerations The Sterling Airlines case study uses a three-level directory structure. DW root level Support level Data Warehouse Environment level. – PowerPoint PPT presentation

Number of Views:138
Avg rating:3.0/5.0
Slides: 57
Provided by: BillFe3
Category:

less

Transcript and Presenter's Notes

Title: Creating the Warehouse Environment


1
Chapter 2
  • Creating the Warehouse Environment

2
Section 2.1
  • Supporting the Warehouse Environment

3
Objectives
  • Peruse the physical directory structures that
    support the Warehouse Environment.
  • Define Warehouse Environment metadata items.

4
Business Scenario
  • The structure of the metadata reflects the
    business process(es) being modeled.

5
Platform Considerations
  • SAS/Warehouse Administrator software executes on
    Windows or UNIX platforms.
  • Data structures are hierarchical and may require
    multiple levels in directory structures.

6
Platform Considerations
  • The Sterling Airlines case study uses a
    three-level directory structure.
  • DW root level
  • Support level
  • Data Warehouse Environment level.

7
Portability Considerations
  • The multi-level directory structure aides in
    porting the data warehouse from development to
    production.
  • Relative path specifications reduce editing of
    the metadata when switching from development to
    production.

8
Infrastructure of a Data Warehouse
  • Directory structures are used to store
  • metadata
  • data tables
  • specialty data stores
  • source code libraries.
  • Most directories contain SAS files and are
    accessed as SAS libraries.

9
Metadata Repository
Data Warehouse Metadata
Warehouse Environment Metadata
10
Data Tables
DW Foundation Layer Data Tables
Source Data Tables
11
Specialty Data Stores
Information Mart Items and Files
Summary Tables and MDDBs
12
Addin Tools, Scripts, and Process Entries
Control Scripts
Source Code Library
13
Navigating the Data Warehouse Environment
  • SAS/Warehouse Administrator has two primary user
    interfaces
  • Data Warehouse Environment (referred to as WA
    Explorer window)- display and edit metadata for
    warehouse elements
  • Process Editor- diagram and edit metadata for
    warehouse processes.

14
Navigating the Data Warehouse Environment
  • Start an interactive SAS session.
  • Invoke the Data Warehouse application.
  • Open the Sterling Airline Warehouse Environment.

15
Creating a Warehouse Environment
16
Creating the Environment Metadata Repository
  • The metadata that you enter in the Environment
    Property window is used to
  • 1. generate a LIBNAME statement for the
    environment
  • 2. assign a name and description to the
    environment.
  • After entering metadata in this window, verify
    your field entries before you select OK. After
    selecting OK, you CANNOT change these values,
    except for the description.

17
Getting Started
  • Step 1 Create a subdirectory structure.
  • Step 2 Start a SAS interactive session.
  • Step 3 Invoke the SAS/WA software application.
  • Step 4 Create a Warehouse Environment object on
    the SAS/WA desktop.

18
Creating a Warehouse Environment
  • Start an interactive SAS session.
  • Invoke the Data Warehouse application.
  • Define a new Data Warehouse Environment.

19
Section 2.2
  • Globally Defined Items

20
Environment Global Metadata
  • There are five types of global metadata
  • SAS libraries
  • hosts
  • DBMS connections
  • contacts
  • scheduling servers.

21
Global Metadata in a Data Warehouse Environment
  • Global information is shared among metadata
    objects in the warehouse environment.
  • Data object locations are defined with hosts, SAS
    libraries, and DBMS connections
  • All objects have contacts, that is, an owner and
    an administrator.
  • Jobs execute on hosts and may be executed by
    scheduling servers.

22
Global Items for the Case Study
  • 1. On which hosts
  • does the source data reside?
  • will the warehouse tables be stored?
  • Are connections to a DBMS needed?
  • Are SAS librefs needed to reference the
  • source data?
  • warehouse tables?
  • source code?
  • metadata?

continued...
23
Global Items for the Case Study
  • 4. Who are the contacts (owner and administrator)
    associated with the various data stores and
    warehouse processes?
  • 5. What scheduling servers will be used to
    schedule Warehouse Administrator jobs? (For the
    case study, this item is deferred to a later
    time.)

24
Define Items Used Globally Window
  • To open this window, select File ? Setup from the
    pull-down menu.
  • Select the type of item to be defined.
  • Select Add or Edit to define or modify a
    definition. Select Remove to delete a definition.
  • Global metadata can also be updated by accessing
    its property sheet from any attribute selector
    that requires the metadata.

25
Add Global Metadata
  • This demonstration illustrates the use of the
    Define Items Used Globally window and its
    property sheets.
  • It also explains the attributes of each type of
    global item.

26
Exercises
These exercises reinforce the concepts discussed
previously.
27
The Next Step
  • The environment with global metadata is now
    ready to supporta data warehouse.
  • Next, you add the operational data definitions
    (source data).

Environment
Global Metadata
28
Section 2.3
  • Defining theOperational Data Sources

29
Objectives
  • Define Operational Data Definition Groups to
    organize Operational Data Definitions.
  • Define Operational Data Definitions.
  • Use SAS/Warehouse Administrator software to add
    and modify metadata attributes.

30
ODD Groups

Environment
  • Create the grouping elements to contain the
    Operational Data Definitions.

Global metadata

ODD Group

ODD Group

ODD Group

ODD Group
31
ODD Groups
  • One or more Operational Data Definition (ODD)
    Groups are contained within a Warehouse
    Environment.
  • An Operational Data Definition Group organizes
    Operational Data Definitions (data sources).
  • ODD Groups may also contain Information Marts.

32
The Process
  • Step 1 Add an Operational Data Definition Group
    to the warehouse environment.
  • STEP 2 Complete the ODD Group Properties window.

33

Creating an ODD Group
  • Add an initial ODD group for data sources
    maintained by the Operations Department.
  • In the process, work with two interfaces
  • the Metadata Explorer
  • the property sheet editor

34
Current Status

Environment
  • The Environment contains
  • global metadata
  • an ODD group.

global metadata
ODD group
35
Exercises
  • Add three additional ODD group
  • Customer Relations
  • IT Technology
  • Reservations Department

36
Current Status

Environment
  • The environment contains
  • global metadata
  • four ODD groups.
  • Next, define the source data.

Global metadata

ODD Group

ODD Group

ODD Group

ODD Group
37
Data Sources for Data Warehouses within the
Warehouse Environment
  • Operations ODD Group
  • Flight Information ODD (Oracle)
  • Actual Flight Times ODD (OSHost)
  • Customer Relations ODD Group
  • Passenger Delay Compensation ODD (OSHost)
  • IT ODD Group
  • Mileage ODD (Local Host)
  • Reservations ODD Group
  • Holiday Calendar ODD (Local Host)
  • Flight Schedule ODD (OSHost)

38
Operational Data Definition (ODD)
  • An Operational Data Definition identifies a data
    source.
  • a SAS data file, or
  • one of several SAS views.
  • The first step is to locate or create the SAS
    data file or SAS view.

A DBMS table that is accessed via a SAS/ACCESS
libname engine is considered a SAS data file.
39
Adding an ODD Metadata Object
  • In the upcoming demo, we
  • add an ODD to an ODD Group
  • complete the properties of the ODD.

40
ODD Properties
41
Importing Column Metadata
  • Column attributes can be imported
  • existing table or view
  • DDL file created with data modeling software.
  • Cobol file description
  • PROC CONTENTS output data set
  • and they can be
  • specified manually.

42
Add an Operational Data Definition
  • Create an ODD for the actual flight times data
    source owned by the Operations Department.
  • Import column metadata

43
Current Status

Environment
  • The first Operational Data Definition has been
    added.
  • One data source is now fully defined in the
    metadata.

Global metadata
Operations ODD Group
Actual Flight Times
44
Exercises
Add four additional ODDs.
  • Operations ODD Group
  • Flight Information ODD
  • Actual Flight Times ODD (Added in Demo)
  • Customer Relations ODD Group
  • Passenger Delay Compensation ODD
  • IT ODD Group
  • Mileage ODD
  • Reservations ODD Group
  • Flight Schedule ODD

45
Current Status

Environment
  • Four ODD groups are defined.
  • Five ODDs each reference a data source.
  • One additional data source is an external file.

Global metadata

ODD Group

ODD Group

ODD Group

ODD Group
46
Section 2.4
  • Input Sources for ODDs

47
Objectives
  • Name examples of ODD input sources.
  • Define an external file as an input source.

48
Introduction
  • Input sources can be
  • one or more external files
  • one or more data files (SAS data files)
  • a combination of data and external files.

49
ODD Input Source
  • When an ODD is
  • a DATA step view with a fileref on the INFILE
    statement, or
  • an SQL view with a libref on the FROM clause
  • the fileref or libref must be defined before the
    view can be used.
  • This does not include pass-through SQL views or
    SQL views with an embedded libname.

50
External File Input Source
  • data cargo/viewcargo infile freight
    input _at_1 id 6. _at_9 weight 9.2 run
  • In this example, the cargo DATA step view
    represents an ODD. Metadata must be defined for
    the Warehouse Administrator to assign the freight
    fileref.

51
SAS File Input Source
  • proc sql create view planes as
    select manuf, name, model from
    sterling.aircraft quit
  • In this example, the planes SQL view represents
    an ODD. Metadata must be defined for the
    Warehouse Administrator to assign the sterling
    libref.

52
Add an ODD to Extract Data from an External File
  • 1. Analyze the columns in the input data to be
    included.
  • 2. Determine the host and library for the ODD
    and the user-written code to generate the Data
    step view.
  • 3. Create the code to generate the SAS view or
    table. Methods included the DATA step, SQL
    procedure, Query window, External File
    Interface, SAS/ASSIST, SAS/TOOLKIT software.

53
The External File DateCodes.dat
  • Record Layout
  • columns 1-10 Date
  • column 12 Holiday (Y/N)
  • column 13 Hflag
  • HHoliday
  • WWeekend
  • Aday after holiday/weekend
  • column 14 PriceWar (Y/N)
  • column 15 BlackOut (Y/N)

54
DATA Step to Create a SAS View
Fileref DATES
Input 3 fields
55

Add an ODD to Extract Data from an External File
  • Add the Holiday Calendar ODD to the Reservations
    ODD group.
  • Use the Process Editor to create a view of the
    external file.

56
Current Status

Environment
  • Four ODD Groups have been added.
  • Six data sources are now defined as ODDs in the
    metadata.

Global metadata

ODD Group

ODD Group

ODD Group

ODD Group
Write a Comment
User Comments (0)
About PowerShow.com