AGGREGATOR TRANSFORMATION - PowerPoint PPT Presentation

1 / 24
About This Presentation
Title:

AGGREGATOR TRANSFORMATION

Description:

... incremental aggregation, it passes new source data through the mapping ... If data is not in strict ascending or descending order based on the session sort ... – PowerPoint PPT presentation

Number of Views:220
Avg rating:3.0/5.0
Slides: 25
Provided by: BSN4
Category:

less

Transcript and Presenter's Notes

Title: AGGREGATOR TRANSFORMATION


1
AGGREGATOR TRANSFORMATION
  • Basics and Advanced Concepts

2
Transformations
  • Transformations help to transform the source data
    according to the requirements of target system
    and it ensures the quality of the data being
    loaded into target.
  • Transformations are of two types
  • 1) Active
  • 2) Passive

3
Overview
  • Aggregator transformation allows you to perform
    aggregate functions, such as averages and sums.
  • The aggregator transformation is unlike the
    Expression transformation, We can use aggregator
    transformation to perform calculations on groups.
  • Expression transformation permits you to perform
    calculations on a row-by-row basis only

4
Aggregator Transformation
  • We can use conditional clauses to filter rows,
    providing more flexibility
  • After we create a session that includes an
    aggregator transformation ,we can enable the
    session option, Incremental aggregation. While
    performing incremental aggregation, it passes new
    source data through the mapping and uses
    historical cache data to perform new aggregation
    calculations incrementally.

5
Creating an Aggregator Transformation
  • To use an Aggregator transformation in a mapping,
    you add the Aggregator transformation to the
    mapping, then configure the transformation with
    an aggregate expression and group by ports, if
    desired.
  • To create an Aggregator transformation
  • In the Mapping Designer, choose
    Transformation-Create. Select the Aggregator
    transformation.
  • Enter a name for the Aggregator, click Create.
    Then click Done. The Designer creates the
    Aggregator transformation.
  • Drag the desired ports to the Aggregator
    transformation.
  • The Designer creates input/output ports for each
    port you include.

6
Creating an Aggregator Transformation
  • Double-click the title bar of the transformation
    to open the Edit Transformations dialog box.
  • Select the Ports tab.
  • Click the group by option for each column you
    want the Aggregator to use in creating groups.
  • You can optionally enter a default value to
    replace null groups.
  • Click Add and enter a name and data type for the
    aggregate expression port. Make the port an
    output port by clearing Input (I). Click in the
    right corner of the Expression field to open the
    Expression Editor. Enter the aggregate
    expression, click Validate, then click OK.
  • Make sure the expression validates before closing
    the Expression Editor.

7
Continuation
  • Select the Properties tab.

8
(No Transcript)
9
Continuation
  • Click OK.
  • Choose Repository-Save to save changes to the
    mapping

10
Components of the Aggregator Transformation
  • The Aggregator is an active transformation,
    changing the number of rows in the pipeline. The
    Aggregator transformation has the following
    components and options
  • Aggregate expression. Entered in an output port.
    Can include non-aggregate expressions and
    conditional clauses.
  • Group by port. Indicates how to create groups.
    The port can be any input, input/output, output,
    or variable port. When grouping data, the
    Aggregator transformation outputs the last row of
    each group unless otherwise specified.
  • Sorted input. Use to improve session performance.
    To use sorted input, you must pass data to the
    Aggregator transformation sorted by group by
    port, in ascending or descending order.
  • Aggregate cache. The PowerCenter Server stores
    data in the aggregate cache until it completes
    aggregate calculations. It stores group values in
    an index cache and row data in the data cache.
  • 6

11
Aggregate Caches
  • When you run a session that uses an Aggregator
    transformation, the PowerCenter Server creates
    index and data caches in memory to process the
    transformation. If the PowerCenter Server
    requires more space, it stores overflow values in
    cache files.
  • You can configure the index and data caches in
    the Aggregator transformation or in the session
    properties.
  • No need to configure cache memory for Aggregator
    transformations that use sorted ports

12
Aggregate Functions
  • You can use the following aggregate functions
    within an Aggregator transformation. You can nest
    one aggregate function within another aggregate
    function.
  • The transformation language includes the
    following aggregate functions
  • AVG
  • FIRST
  • COUNT
  • LAST
  • MAX
  • PERCENTILE
  • STDEV
  • SUM
  • VARIANCE
  • ?When you use any of these functions, you must
    use them in an expression within an Aggregator
    transformation.

13
Group By Ports
  • The Aggregator transformation allows you to
    define groups for aggregations, rather than
    performing the aggregation across all input data.
    For example, rather than finding the total
    company sales, you can find the total sales
    grouped by region.
  • To define a group for the aggregate expression,
    select the appropriate input, input/output,
    output, and variable ports in the Aggregator
    transformation. You can select multiple group by
    ports, creating a new group for each unique
    combination of groups.
  • To define a group for the aggregate expression,
    select the appropriate input, input/output,
    output, and variable ports in the Aggregator
    transformation. You can select multiple group by
    ports, creating a new group for each unique
    combination of groups.

14
Group By Ports
  • The following Aggregator transformation groups
    first by STORE_ID and then by ITEM

15
Continuation
  • If you send the following data through this
    Aggregator transformation

16
Continuation..
  • The PowerCenter Server performs the aggregate
    calculation on the following unique groups

17
Continuation..
  • The PowerCenter Server then passes the last
    row received, along with the results of the
    aggregation, as follows

18
Using Sorted Input
  • ?You can improve Aggregator transformation
    performance by using the sorted input option.
    When you use sorted input, the PowerCenter Server
    assumes all data is sorted by group. As the
    PowerCenter Server reads rows for a group, it
    performs aggregate calculations. When necessary,
    it stores group information in memory. To use the
    Sorted Input option, you must pass sorted data to
    the Aggregator transformation.
  • ?When you do not use sorted input, the
    PowerCenter Server performs aggregate
    calculations as it reads. However, since data is
    not sorted, the PowerCenter Server stores data
    for each group until it reads the entire source
    to ensure all aggregate calculations are accurate

19
Sorted Input Conditions
  • Do not use sorted input if either of the
    following conditions are true
  • The aggregate expression uses nested aggregate
    functions.
  • The session uses incremental aggregation.
  • If you use sorted input and do not sort data
    correctly, the session fails.
  • Pre-Sorting Data
  • To use sorted input, you pass sorted data through
    the Aggregator.
  • Data must be sorted as follows
  • By the Aggregator group by ports, in the order
    they appear in the Aggregator transformation.
  • Using the same sort order configured for the
    session. If data is not in strict ascending or
    descending order based on the session sort order,
    the PowerCenter Server fails the session

20
Mapping with a Sorter transformation configured
to sort the source data in descending order by
ITEM_NAME
21
Optimizing Aggregator Transformations
  • Aggregator transformations often slow performance
    because they must group data before processing
    it. Aggregator transformations need additional
    memory to hold intermediate group results.
  • You can optimize Aggregator transformations by
    performing the following tasks
  • Group by simple columns.
  • Use sorted input.
  • Use incremental aggregation.

22
Performance Tuning Tips
  • Group By Simple Columns
  • You can optimize Aggregator transformations when
    you group by simple columns. When possible, use
    numbers instead of string and dates in the
    columns used for the GROUP BY. You should also
    avoid complex expressions in the Aggregator
    expressions.
  • Use Sorted Input
  • You can increase session performance by sorting
    data and using the Aggregator Sorted Input
    option.
  • The Sorted Input decreases the use of aggregate
    caches. When you use the Sorted Input option, the
    PowerCenter Server assumes all data is sorted by
    group. As the PowerCenter Server reads rows for a
    group, it performs aggregate calculations. When
    necessary, it stores group information in memory.
  • The Sorted Input option reduces the amount of
    data cached during the session and improves
    performance. Use this option with the Source
    Qualifier Number of Sorted Ports option to pass
    sorted data to the Aggregator transformation.

23
Performance tuning tips
  • Use Incremental Aggregation
  • If you can capture changes from the source that
    changes less than half the target, you can use
    Incremental Aggregation to optimize the
    performance of Aggregator transformations.
  • When using incremental aggregation, you apply
    captured changes in the source to aggregate
    calculations in a session. The PowerCenter Server
    updates your target incrementally, rather than
    processing the entire source and recalculate the
    same calculations every time you run the session.
  • Filter before aggregating
  • If you use a filter transformation in the
    mapping,place the transformation before the
    aggregator transformation to reduce unnecessary
    aggregation.

24
  • Thank You
Write a Comment
User Comments (0)
About PowerShow.com