Title: Michael Watson
1Team 3
- Michael Watson
- Alicia Stevenson
- Tamya Stallings
- Steve Gibson
2Topics of Discussion
- Summary Data: Michael Watson
- Security: Alicia Stevenson
- Encryption: Tamya Stallings
- Budget and Cost Justification: Steve Gibson
3Summary Data in the Data Warehouse/ODS Environment
- There is a split between the operational and informational environments
- Operational: transaction and day-to-day processing
- Informational: analytical data
4Information Systems Architecture
5Detailed/Summary Data
- Detailed data has existed for a long time and forms the backbone of operational processing
- Summary data has come about in recent years with the implementation of data warehouses
6Framework of Summary Data
- Whether summary data is dynamic or static,
- Whether the usage of summary data is predictable or unpredictable, and
- Whether the summarization results in a low level or a high level of summarization.
7Framework of Summary Data
8Implementation Dimensions
- Whether the data is permanently stored or stored on a temporary basis, and
- Whether a single summary snapshot of data is produced or whether a series of time-sequenced summary snapshots is produced.
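The difference between a single summary snapshot and a series of time-sequenced snapshots can be sketched in a few lines of Python. This is only an illustration: the `DETAIL` records, their field layout, and both function names are assumptions made up for the example, not part of the presentation.

```python
from collections import defaultdict

# Hypothetical detail records: (month, region, sale_amount)
DETAIL = [
    ("2024-01", "east", 100.0),
    ("2024-01", "west", 50.0),
    ("2024-02", "east", 70.0),
    ("2024-02", "west", 30.0),
]

def single_snapshot(rows):
    """One summary snapshot: total per region over all detail rows."""
    totals = defaultdict(float)
    for _month, region, amount in rows:
        totals[region] += amount
    return dict(totals)

def time_sequenced_snapshots(rows):
    """A series of snapshots: one per-region summary for each month."""
    series = defaultdict(lambda: defaultdict(float))
    for month, region, amount in rows:
        series[month][region] += amount
    # Order the snapshots by month so the series reads chronologically.
    return {month: dict(regions) for month, regions in sorted(series.items())}
```

Whether either result is kept permanently or discarded after use is the other implementation dimension; in this sketch that would simply be a choice about where the returned dictionaries are written.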
9Elements of Summarization
10Monitoring/Sweeping Summary Data
11Platforms the Summary Data Resides On
12Security
13Security
- Gartner study
- 80% not secure by 2005
- Security is often ignored
- Security example: Teksouth
- Access and analysis system with 128-bit encryption
14Corporate Information Factory (CIF) Security
15Two Threats to Security
- External
- Those outside the enterprise who would misuse information found inside the CIF
- Through the Internet
- Internal
- Someone inside the organization who is familiar with the structure and the workings of the CIF and who makes unauthorized access
- Seldom comes over the Internet
16Corporate Information Factory
- Components
- Data marts
- DSS applications
- Alternative storage
- Data mining warehouses
- ODS
17Data Marts
- The data warehouse is the center of the CIF
- Lower level of security
- Large central facility belonging to the central IT organization
- Protects data through standard system facilities like network monitors, firewalls, and standard operating system facilities
- The data marts are the departmental components of the CIF
- Higher level of security
- Owned and managed by departments themselves
- The department can use firewalls, network security, DBMS security, and operating system security.
18Data Mining
- Least amount of security
- Derived information
- Token based technology
19ODS
- Need for security and protection
- ODS data is used for online decision making
- Highest level of security throughout the entire CIF
- Central IT organization
20Encryption
- Definition: scrambling a message so no one can read it except the person for whom it is intended
- Plaintext vs. ciphertext
- More profound level of security
- Very time consuming to manage
21Data Warehouse Encryption
- Data can't be encrypted in variable lengths
- Must be decrypted in the same format
- Domain of data must be preserved
- Example: if the data contains a number, then it must be encrypted as a number
- Vital to ensuring the system doesn't collapse
- Protects data from dumps
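To make the "same length, same domain" requirement concrete, here is a toy sketch in Python: a numeric string is encrypted into another numeric string of the same length by shifting each digit with a keyed stream. The function names, the per-column keystream, and the sample key are all assumptions for illustration. This is not a secure cipher; a production system would use a vetted format-preserving encryption scheme.

```python
import hashlib
import hmac

def _digit_shifts(key, column, length):
    """Derive `length` shifts in 0-9 from the key and column name (toy keystream)."""
    shifts = []
    counter = 0
    while len(shifts) < length:
        block = hmac.new(key, f"{column}:{counter}".encode(), hashlib.sha256).digest()
        shifts.extend(b % 10 for b in block)
        counter += 1
    return shifts[:length]

def encrypt_digits(key, column, plaintext):
    """Encrypt a string of ASCII digits into a string of digits of equal length."""
    if not (plaintext.isascii() and plaintext.isdigit()):
        raise ValueError("this sketch only handles ASCII digit strings")
    shifts = _digit_shifts(key, column, len(plaintext))
    return "".join(str((int(d) + s) % 10) for d, s in zip(plaintext, shifts))

def decrypt_digits(key, column, ciphertext):
    """Reverse encrypt_digits by subtracting the same shifts."""
    shifts = _digit_shifts(key, column, len(ciphertext))
    return "".join(str((int(d) - s) % 10) for d, s in zip(ciphertext, shifts))
```

Because the ciphertext is still a fixed-length number, it fits the original column definition, which is the point the slide makes about not breaking the system.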
22Unique requirements
- Ability to encrypt some columns and not others
- Part of the data could be encrypted one way with the other part of the data encrypted a different way (interleaving effect)
23Encryption restrictions
- Keys are not encrypted
- Columns used in a WHERE clause are not encrypted
- Foreign key relationships aren't encrypted
- Data used in standard database management system processing is not encrypted
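The four restrictions above amount to a filter over a table's columns. A minimal sketch in Python, assuming a hypothetical `Column` description and an invented employee table (none of these names come from the presentation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Column:
    name: str
    is_key: bool = False           # primary key
    is_foreign_key: bool = False   # participates in a foreign key relationship
    used_in_where: bool = False    # appears in WHERE clauses
    used_by_dbms: bool = False     # needed for standard DBMS processing
    sensitive: bool = False        # business wants it protected

def encryptable(col):
    """A column may be encrypted only if none of the four restrictions apply."""
    return not (col.is_key or col.is_foreign_key
                or col.used_in_where or col.used_by_dbms)

def columns_to_encrypt(columns):
    """Select the sensitive columns that the restrictions allow us to encrypt."""
    return [c.name for c in columns if c.sensitive and encryptable(c)]

# Hypothetical employee table used only for illustration.
EMPLOYEE = [
    Column("emp_id", is_key=True),
    Column("dept_id", is_foreign_key=True),
    Column("hire_date", used_in_where=True, sensitive=True),
    Column("salary", sensitive=True),
    Column("ssn", sensitive=True),
]
```

In this sketch `hire_date` is sensitive but stays in the clear because queries filter on it, which shows how the restrictions can leave some sensitive data unprotected by encryption alone.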
24Encryption abuses
- Example: employee, manager, and data clerk all need the same data
- Encryption doesn't distinguish between users
- Degrades performance and lessens the data availability to users
- Requires complicated key administration
- Performance degradation
- One or two elements may not cause degradation
- For an entire table of elements, performance degradation is considerable
25Encryption abuses (cont.)
- Good practice
- Laptops
- Encrypt all sensitive data
- Protects data if the laptop is stolen
- Main priority
- Selective encryption
- Various organizational steps for protecting the physical site and media
26Budget and Cost Justification
- Budget is important because it shows where the true priorities are.
- It shows where the money is being spent in the data warehouse
- Two types of cost
- 1) one-time
- 2) continuous
27One time costs
- Hardware
- Disk storage
- Software
- Integration and transformation interface
- Processor
- Network communication
- System management tools
- Metadata creation and population
28Continuous Costs
- Refreshment of data
- End user training
- Maintenance and update of data
- Administration costs
- Periodic verification of the conformance to the enterprise data model
- Servicing data mart requests
- Reorganization and restructuring of data
- Archiving of data
29These items cost differently depending on many factors
- Size of the organization
- Amount of historical data
- Level of detail
- Sophistication of the end user
- Whether the company is in a competitive market
- How fast the data warehouse grows
- Number of data marts stemming from it
- Centralized or distributed
- Constructed manually or automated
30One time costs
- Hardware
- Disk storage: 30%
- Processor costs: 20%
- Network communication costs: 10%
- Software
- DBMS: 10%
- Access/analysis tools: 6%
- Systems management tools
- Activity monitor: 2%
- Data monitor: 2%
- Integration and transformation interface creation: 15%
- Metadata creation and population: 5%
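The figures on this slide sum to 100, so they read as percentage shares of the one-time budget. Under that assumption, a short Python sketch can turn them into amounts for a given total. The `allocate` helper and the 1,000,000 budget in the usage note are hypothetical and not from the presentation.

```python
# One-time cost shares from the slide, read as percentages of the total.
ONE_TIME_SHARES = {
    "disk storage": 30,
    "processor": 20,
    "network communication": 10,
    "DBMS": 10,
    "access/analysis tools": 6,
    "activity monitor": 2,
    "data monitor": 2,
    "integration and transformation interface": 15,
    "metadata creation and population": 5,
}

def allocate(total_budget, shares):
    """Split total_budget across items according to percentage shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {item: total_budget * pct / 100 for item, pct in shares.items()}
```

For example, `allocate(1_000_000, ONE_TIME_SHARES)` would assign 300,000 to disk storage and 150,000 to the integration and transformation interface, with hardware as a whole (disk, processor, network) taking 60% of the total.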
31Continuous costs
- Refreshment of the data warehouse data: 55%
- Maintenance of infrastructure: 3%
- End user training: 6%
- Data warehouse administration
- Conformance to the enterprise data model: 2%
- Servicing data mart requests for data: 21%
- Capacity planning: 1%
- Monitoring of activity and data: 7%
- Occasional reorganization/restructuring of data: 1%
- Archiving of data: 1%
- Summary table usage analysis: 2%
- Security administration: 1%
32Hard to justify cost
- The benefits cannot be seen until after the data warehouse is built
- Many factors contribute to the benefits, not just the data warehouse
33Compare data warehouse to legacy: 1 report costs 10 in data warehouse, 1000 in legacy environment
- Computer resources from the legacy environment cost more than the warehouse
- Legacy environment is sensitive to response time, unlike the warehouse
- Much data in the legacy environment can get in the way
- Internal structure of data warehouse was built for reports, legacy was not
- Technology in data warehouse was optimized for reports, legacy was not
- Data in the warehouse is integrated, legacy's data is not
34Compare continued
- Data warehouse has metadata to locate data quickly
- Metadata lets the DSS analyst know if a report was already made
- More historical data in data warehouse to use
- More summary data in data warehouse
- Quicker to create a report in the data warehouse
- Data warehouse is ideal for discovery mode and changing a report is easy
35Controlling costs
- Use an automation tool for creation and maintenance of the code used to move data from legacy to data warehouse
- Store the warehouse on a combination of disk storage and near-line storage
- Build the data warehouse iteratively; the spiral method is best
- Use log journals and tapes as a source of refreshment
36Controlling costs (cont.)
- Use a consultant
- Use a data model as the basis of the design
- Do capacity analysis at the beginning of development
- Research which vendors to buy software and hardware from
- Purge data occasionally
- Never build a data mart first
37Mike's Question
- What are the three important dimensions of summary data?
38Alicia's Question
- What are the different components of the Corporate Information Factory?
39Tamya's Question
- What are the four restrictions for encrypting data in a data warehouse?
40Steve's Question
- What are the two types of cost? Give at least two examples of each.