Title: Informatica overview
1Informatica Overview
2 Contents
- Introduction
- Clients
- Server(s)
- Source, Target, Repository
- Connectivity
3What is Informatica?
- Allows you to load data into a centralized
location, such as a datamart or data warehouse. - ETL Tool
- Extract data from multiple sources
- Transform the data according to business logic
and needs - Load the transformed data into file and
relational targets
4Example
EMPLOYEE
EMP_DETAILS
EMP_ID EMP_NAME EMP_CITY EMP_STATE
EMP_COUNTRY EMP_DATE_OF_JOINING
EMP_ID EMP_NAME EMP_CITY EMP_STATE EMP_COUNTRY
YRS_OF_SRV
Transform Date of Joining to Yrs of Service
5Data Warehousing
Developer
End User
Metadata
Extract Transform Load
Data Warehouse
Operational Sources
6- Informatica Architecture
- Design Process
- Client Tool Review
- Repository Manager
- Designer
- Server Manager
7Informaticas Architecture
Data Models
Designer
Server Manager
Repository Manager
Targets 1-n
Sources 1-n
PowerPlugs
Repository
Server
8Informatica Design Process
2.
1.
3.
Source Def
5.
Mapping
Sessions
Target Def
4.
1. Create Repository 2. Import Source
Definitions 3. Create Target Schema 4. Create
Mappings 5. Load Data
9Informatica Client
- Repository Manager Can view much of the
metadata in the Repository through the Repository
Manager. - Designer Create Source-to-Target mappings that
contain transformation instructions for the
Informatica Server. - Server Manager Create, schedule, and monitor
sessions. You create a session based on a
transformation and schedule it to run on the
Informatica Server.
10Informatica Client
11Metadata Repository
- Information about the data mart system
- Catalogs the repository
- Directs the server
- Contains record of user access
- Can be shared
- Can be searched and reported
- Bridged through Metadata Exchange
12Repository Manager
13Folder Attributes
- FOLDER OWNER - user who serves as focal point for
folder permissions - PERMISSIONS - rights to read, write, and/or
execute objects in a folder - SHARED - property that allows you to make
shortcuts to objects in a folder - SHORTCUT - a dynamic link to an object stored in
a shared folder - VERSIONS - folder iterations that indicate
development stages
14Informatica Client
15Designer Workspace
Open Folder List
Navigator
Workspace
Workbook Tabs
Output Window
Status Bar
16Designer Options
Tables columns viewed column size object
size object colors
Format workspace colors import keys automatic
Source Qualifier creation
General workspace options reload objects on
open group source definitions
17Informatica Client
18Server Manager
19Designer
- Source Analyzer
- Warehouse Designer
- Transformation Developer
- Mapplet Designer
- Mapping Designer
20Source Analyzer
- Identify the sources used to build the warehouse.
- Create repository definitions for these sources
21Analyzing Sources
- Relational Oracle, Sybase, Informix, IBM DB2,
Microsoft SQL Server, and Teradata - File Fixed and delimited flat file, COBOL file,
and XML - Other Microsoft Excel, Microsoft Access
- Extended PeopleSoft, SAP R/3, Sieble, and IBM
MQSeries (need to purchase additional products
for these sources) - Mainframe Need to purchase additional products.
22Warehouse Designer
- Create relational tables in Target database
- Edit target definitions
- Preview relational target data
23Targets
- Relational Oracle, Sybase, Sybase IQ, Informix,
IBM DB2, Microsoft SQL Server, and Teradata - File Fixed and delimited flat files and XML
- Extended SAP BW, IBM MQ Series (need to
purchase additional products for these targets) - Other - Microsoft Access
24Mixing Sources and Targets
- You can combine data from different platforms and
source types.
Oracle
Sybase
Flat File
25Transformation Developer
- Generates ,modifies, passes data through ports
- 12 objects for transforming data
26 Transformations Types
- Source Qualifier represents all data queried from
the source - Normalizer normalizes records from VSAM or
relational sources - Expression performs simple calculations
- Filter serves as a conditional filter
- Aggregator performs aggregate calculations
- Rank limits records to top or bottom range
27Transformations Types contd
- Update Strategy allows for logic to insert,
update, delete, or reject data - Lookup looks up values and passes to other
objects - Stored Procedure calls a stored procedure and
captures return values - External Procedure calls a procedure defined in a
shared library - Sequence Generator generates unique ID values
- Joiner allows for heterogeneous joins
28SourceQualifier Transformation
- Represents records that Informatica server reads
when it runs a session - Automatically attached when a Source is added to
a mapping
29Use a Source Qualifier to
- Filter Records when the Informatica Server reads
source data - Specify sorted ports
- Order by clause
- Select only distinct values from a source
- Create a custom query for the Informatica Server
to read source data
30Expression Transformation
- Calculate values in a single row
- Adjust employee salaries, concatenate first and
last names, convert string to number - Perform Any Non-Aggregate Calculations
- Test conditional statements before you output to
target
31Example
EMPLOYEE
EMP_SALARY
EMP_ID EMP_NAME ROLE_CODE BASIC_SALARY
EMP_ID EMP_NAME ROLE_CODE GROSS_SALARY
Gross Salary Basic Salary 3.5
32Aggregator Transformation
- Allows you to perform aggregate calculations,
such as averages and sums - While the Expression is on a row-by-row basis,
the aggregator can perform calculations on groups
33Example
REVENUE
PU_REVNUE
PU_CODE PROJECT_CODE REVENUE
PU_CODE MAX_REVENUE MIN_REVENUE AVG_REVENUE
Aggregator Transformation
Max Revenue Max (Revenue) Min Revenue
Min(Revenue) Avg Revenue Avg (Revenue)
34Filter Transformation
- Provides the means for filtering rows in a
mapping - Employees who are currently working in the
project NML of WENA as SE - Only rows that meet the condition pass through
the mapping.
35Filter Transformation
- All ports are input/output
- Returns TRUE or FALSE for each row passed through
the mapping based on the condition - Discarded rows do not appear in the session log
or reject files - The input ports must only come from one
transformation
36Filter vs Source Qualifier (SQ)
- SQ provides better performance
- SQ only lets you filter rows from relational
sources, Filter Transformation filters rows from
any source - SQ only uses standard SQL, Filter can use any
statement or function that returns True/False
37Example
EMPLOYEE
NM_EMP_DETAILS
EMP_ID EMP_NAME PROJECT_CODE PU_CODE ROLE_CODE
EMP_ID EMP_NAME PROJECT_CODE PU_CODE ROLE_CODE
Filter Transformation
Where Project NML and PU WENA and Role
SE
38Router Transformation
- Groups data into many groups
- Routes rows of data that do not meet any
condition to a default group - Can enter any expression that returns a single
value - Condition returns True or False for each row
- If the condition NULL, the row is assumed as
FALSE
39Router Transformation
- One Group can be connected to One transformation
or target - One Output Port can be connected to multiple
transformations or targets - Multiple Output ports in One Group can be
connected to multiple transformations or targets - CANNOT Connect more than One Group to One
Transformation or Target
40Lookup Transformation
- Looks up data in a relational table
- Can be the Source, Target, or any database that
the Informatica Client and Server can connect to - Lookup table can be a single table or can join
multiple tables - Lookups can
- Get a related value (your source include
Employee_ID and you want Employee_Name), - Perform a calculation
- Update a slowly changing dimension table (check
if records exist on a target)
41Lookup Transformation
- For each input row, the Informatica Server
queries the lookup table based on the lookup
ports and the condition in the transformation - The Informatica Server can return values from
that lookup (static cache) - The Informatica Server inserts a row into the
cache to flag rows as new or existing (dynamic
cache)
OR
42Connected and UnconnectedLookup Transformations
43Example
EMPLOYEE_PROJECT
NM_EMP_DETAILS
EMP_ID EMP_NAME PROJECT_CODE PROJECT_DESC
EMP_ID EMP_NAME PROJECT_CODE
LOOK UP Transformation
PROJECT
Get PROJECT.PROJECT_DESC Where PROJECT.PROJECT_COD
E NM_EMP_DETAILS.PROJECT_CODE
PROJECT_CODE PROJECT_DESC
44Update Strategy
- Two Ways Of doing
- Within a Session
- Within a Mapping
45Update Strategy
- Within a Session
- Instruct the Informatica Server how to treat the
rows when the session is configured - Within a Mapping
- Use the update strategy transformation to flag
records for insert, delete, update, or reject.
46Constraint for each Database Operation
Operation Constant Numeric Value
Insert DD_INSERT 0
Update DD_UPDATE 1
Delete DD_DELETE 2
Reject DD_REJECT 3
47Joiner Transformation
- Active Transformation
- Join two flat files
- Join two tables from different databases
- Join a flat file with a relational table
48Transformation Overview
- Three views
- Iconized View -- shows transformation in relation
to mapping - Normal View -- shows data flow through
transformation - Edit View -- shows transformation properties and
allows for editing -
49Transformation Overview
Normal view shows data flow through the
transformation
Data passes through I/O ports unchanged
- DATE_ENTERED passes into transformation through
an input port. - It is used in MTH port to extract month.
- Month is passed through MTH output port to
another transformation.
50Transformation Overview
Edit view provides flexibility in defining
transformation rules
51Transformations and Expressions
- Calculation or conditional statement
- Used in Expression, Aggregator, Rank, Filter,
Update Strategy - Performs calculation based on ports, functions,
operators, variables, literals, constants, and
return values from other transformations
52Mapplets
- Reusable Object
- Include multiple transformations
- Include Source definitions
- Multiple groups of output ports
53Mapping
- Move and transform data from sources to targets
- Includes
- source definitions
- target definitions
- transformations.
-
Source
Target
Transformations
Mapping
54Mapping Designer
Transformation Toolbar
Mapping List
Iconized
Status Bar
55Validation
- Three different levels of validation
- Connection validation
- Expression validation
- Mapping validation
56Connection Validation
- Connecting ports with mismatched datatypes
- Connecting output ports to a source
- Connecting a source to anything but a Source
Qualifier or Normalizer - Connecting an output to a output, or an input to
a input - Connecting more than one active transformation to
another transformation - Copying columns to a target definition
57Expression Validation
- Parse the current expression, with remote port
searching (references to a port in another
transformation are resolved) - Parse expression attributes such as filter
condition, lookup condition, SQL Query, etc. - Parse default values
58Mapping Validation
- Mapping validation will take place with menu
commands - Mapping Validate
- Repository Save
- Mapping validation will
- Perform connection validation
- Perform expression validation
- Check the mapping flow validation
- Data from Source Qualifier mapped to a target
- Targets are connected to transformations
59Informatica Server
- Reads information from the Repository
- Extracts data from the Sources and stores the
data in memory while it applies the
transformation rules you created - Loads the transformed data into the mapping
targets
60Transformation Process
Repository
Session Metadata
Source Def
source information target information mapping sche
duling error handling pre- / post-session
scripts tuning parameters output log
information transformation overrides
Mapping
Target Def
Session
Server Manager
Server
Targets
Sources
61Definitions
- Session - A set of instructions that tells the
Informatica Server how and when to move data from
sources to targets - Batch - A group of Sessions which are to be run
together
62Server Manager
63Process
- Configure server
- Create session
- Run session
- Monitor session
- View logs
- Tune session
64Server Configuration
- Server Variables
- Establish default directories for files and
caches. - Variables are server specific.
- Allows for easy deployment.
- Can be overridden at session level.
- Changing the variable updates sessions.
- Directories must exist prior to session launch.
65Server Output
? .dat
? Control
? .perf
? .idx
? E-mail
Target Data
? .bad
Event Log
? .log
66Source Settings - Session Wizard
- Source
- Select source type
- File, Relational, Heterogeneous
- Treat rows as
- Source Type Relational, File
- Tells server how to treat source rows
- Insert, update, delete, data driven
- Works in conjunction with Target Options
- Source Options...
- Source Type Relational
- Database name
- Source Type File
- fixed / delimited properties
- file list
- FTP properties
- Source Database
- Source Type Relational
- Database connection
67Target Settings - Session Wizard
- Target
- Select target type
- File, Relational
- Flat File Options
- File properties
- FTP properties
- Loader properties
- Target Options
- Target type Relational
- Tell server how to load target
- Insert, Update Delete
- Truncate target
- Bulk, Normal, Test
- Target Database
- Target Type Relational
- Database connection
68Launching Sessions
- Manual start - Manually launch a session from
within the Server Manager - Session Schedule - Schedule a session using
business cycle start, stop, and repeat intervals - Batching - Use batches to run session
concurrently or sequentially - Event based - Configure a session to launch based
upon the appearance of an indicator file - Command Line - Launch a session from the command
line prompt
69After the Session Launches...
- Poll/Refresh Session Status
- View Session Details
- View Performance Monitor
- View Logs
- Tune the Session
- Correct Session Problems
70Monitor Session
- Select Server Requests Session Details
- Number of rows loaded/failed
- Read/Write throughput
- Most current Server message
- Audit trail in repository or session log
71Monitor Session
- Performance Monitor
- Select Server Requests Session Performance
Details or - open file ltsessionnamegt.perf
- Need to configure session properties to save the
performance detail counters - Help determine where session performance can be
improved
72Log Files - Session Wizard
- Log Files...
- Log file path and name
- Reject file path and name
- Session log archive options
73Connectivity Overview