This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these
GROUP_KEY VAL ---------- ---------- Group 3 a Group 3 b Group 3 c
and yields a string like this.
GROUP_KEY STRING ---------- --------------- Group 3 a,b,c
The solution to follow, which we will refer to as MODEL Method 2, is a good choice when the GROUP_KEY values we wish to report on come from one table and the string values come from another. Note that it performs and scales better than the solution in MODEL Method 1, though the results differ slightly with respect to null values.
Test Data
desc t Name Null? Type ----------------------------- -------- -------------------- GROUP_KEY VARCHAR2(10) VAL VARCHAR2(10) set null "(null)" break on group_key skip 1 duplicates select * from t order by group_key, val ;
GROUP_KEY VAL ---------- ---------- Group 1 a Group 2 a Group 2 b Group 3 a Group 3 b Group 3 c Group 4 a Group 4 a Group 4 b Group 4 b Group 5 a Group 5 b Group 5 d Group 5 e Group 5 (null) Group 6 (null) Group 7 (null) Group 7 (null) Group 8 a Group 8 a Group 8 b Group 8 b Group 8 (null) Group 8 (null) (null) (null)
clear breaks desc t_group_keys Name Null? Type ----------------------------------------- -------- ---------------------------- GROUP_KEY VARCHAR2(10) select * from t_group_keys order by 1 ;
GROUP_KEY ---------- Group 1 Group 2 Group 3 Group 4 Group 5 Group 6 Group 7 Group 8 (null)
Solution
This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with the MODEL clause you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.
select group_key, string from t_group_keys model reference r on ( select group_key , row_number() over ( partition by group_key order by val ) - 1 as position , val from t ) dimension by ( group_key, position ) measures ( val ) main m dimension by ( group_key ) measures ( cast( null as varchar2(65) ) as string ) -- note 1 rules update iterate( 6 ) -- note 2 ( string[ any ] = string[cv()] || case presentv( r.val[cv(group_key),iteration_number],1,0) when 0 then null else case iteration_number when 0 then null else ',' end || r.val[cv(group_key),iteration_number] end ) order by group_key ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e, Group 6 (null) Group 7 , Group 8 a,a,b,b,, (null) (null)
Notes:
- 65 = v * n + (n-1)
- 6 = n
- v = maximum length of VAL values = 10
- n = maximum number of rows with the same GROUP_KEY = 6
To control the sort order of the delimited values within the STRING we can change the ORDER BY clause in the expression for the POSITION column.
select group_key, string from t_group_keys model reference r on ( select group_key , row_number() over ( partition by group_key ORDER BY VAL DESC ) - 1 as position , val from t ) dimension by ( group_key, position ) measures ( val ) main m dimension by ( group_key ) measures ( cast( null as varchar2(65) ) as string ) -- note 1 rules update iterate( 6 ) -- note 2 ( string[ any ] = string[cv()] || case presentv( r.val[cv(group_key),iteration_number],1,0) when 0 then null else case iteration_number when 0 then null else ',' end || r.val[cv(group_key),iteration_number] end ) order by group_key ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 b,a Group 3 c,b,a Group 4 b,b,a,a Group 5 ,e,d,b,a Group 6 (null) Group 7 , Group 8 ,,b,b,a,a (null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8, from the end result we change the definition of the LAST_POSITION column and add an inline view to the reference query.
select group_key, string from t_group_keys model reference r on ( select group_key , row_number() over ( partition by group_key order by val ) - 1 as position , val from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) TA ) dimension by ( group_key, position ) measures ( val ) main m dimension by ( group_key ) measures ( cast( null as varchar2(65) ) as string ) -- note 1 rules update iterate( 6 ) -- note 2 ( string[ any ] = string[cv()] || case presentv( r.val[cv(group_key),iteration_number],1,0) when 0 then null else case iteration_number when 0 then null else ',' end || r.val[cv(group_key),iteration_number] end ) order by group_key ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,b Group 5 a,b,d,e, Group 6 (null) Group 7 (null) Group 8 a,b, (null) (null)
To filter null VAL values, like the ones in Groups 5 and 8 from the end result we add
WHERE VAL IS NOT NULL
to the two inline views.
select group_key, string from t_group_keys model reference r on ( select group_key , row_number() over ( partition by group_key order by val ) - 1 as position , val from t WHERE VAL IS NOT NULL ) dimension by ( group_key, position ) measures ( val ) main m dimension by ( group_key ) measures ( cast( null as varchar2(65) ) as string ) -- note 1 rules update iterate( 6 ) -- note 2 ( string[ any ] = string[cv()] || case presentv( r.val[cv(group_key),iteration_number],1,0) when 0 then null else case iteration_number when 0 then null else ',' end || r.val[cv(group_key),iteration_number] end ) order by group_key ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e Group 6 (null) Group 7 (null) Group 8 a,a,b,b (null) (null)
Unlike the solution in MODEL Method 1, here we see that rows are returned for groups 6, 7, and "null". The solution's design effectively performs a left outer join between T_GROUP_KEYS and T.
In the next tutorial, MODEL Method 2 - Explained we look at how this technique works.