Monday, August 29, 2011

Joining Data from a single source with itself

A lot of times we have to join the data set from an aggregator with the source data itself. One way to achieve this is to put in the SQL over-ride, and push the grouping and the joining to the DB level. i.e

SELECT emp.emp_name, emp.emp_sal
FROM emp,
(SELECT MAX (sal), dep
FROM emp
GROUP BY dep) emp2
WHERE emp.dep = emp2.dep;


Or we could use Oracle analytic funtions. i.e

SELECT MAX (sal) OVER (PARTITION BY loc) max_sal, empno, sal, loc
FROM emp_test
ORDER BY loc, sal


But, if you truely wanted an informatica solution, the following is how you would achieve this.

1 - Create a sorter transformation, and order by columns you want to join
2 - From the sorter, send the data to an Aggregator, and also an Expression.
3 - Create a joiner transformation, and check the "Sorted Input" option
4 - Join the columns from the Expression and the Aggregator

If you dont put the sorter before the Joiner, and select the Sorted input option, informatica will not allow you to join the two data streams.

2 comments:

  1. Hi,
    This article was very helpful.But just wondering how the sorted input option solves the problem.In short,what happens behind the scenes?
    Thanks,
    Roshni

    ReplyDelete