This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes a table that looks like this
GROUP_KEY VAL ---------- ---------- Group 3 a Group 3 b Group 3 c
and yields a set of strings that look like this.
GROUP_KEY STRING ---------- --------------- Group 3 a,b,c
The following solution, which we refer to as MODEL Method 3, is a good choice when GROUP_KEY and VAL values are only available in the same table. (If GROUP_KEY and VAL are available in separate tables see MODEL Method 2).
Test Data
desc t Name Null? Type ----------------------------- -------- -------------------- GROUP_KEY VARCHAR2(10) VAL VARCHAR2(10) set null "(null)" break on group_key skip 1 duplicates select * from t order by group_key, val ;
GROUP_KEY VAL ---------- ---------- Group 1 a Group 2 a Group 2 b Group 3 a Group 3 b Group 3 c Group 4 a Group 4 a Group 4 b Group 4 b Group 5 a Group 5 b Group 5 d Group 5 e Group 5 (null) Group 6 (null) Group 7 (null) Group 7 (null) Group 8 a Group 8 a Group 8 b Group 8 b Group 8 (null) Group 8 (null) (null) (null)
Solution
This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with the MODEL clause you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.
clear breaks select group_key, substr( string, 2 ) as string from t where 1=2 model reference r on ( select row_number() over (order by val) as val_number , group_key , val from t ) dimension by( val_number ) measures ( group_key, val ) main t_main dimension by( cast(null as integer) as i, group_key ) measures ( cast(null as varchar2(65)) as string ) -- note 1 rules upsert iterate( 1000000 ) -- note 2 until ( presentv(r.val[iteration_number+1],1,0) = 0 ) ( string [ for i from 1 to sign(iteration_number) increment 1, r.group_key[iteration_number] ] = string[cv(),cv()] || ',' || r.val[iteration_number] ) order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e, Group 6 (null) Group 7 , Group 8 a,a,b,b,, (null) (null)
Notes:
- 65 = v * ng + (ng-1) where
- v = maximum length of VAL values, i.e. 10
- ng = maximum number of VALs with the same GROUP_KEY, i.e. 6
- nt = total number of rows in T
- use any number >= nt + 1
To control the order of the delimited values, change the ORDER BY argument in the ROW_NUMBER() function.
select group_key, substr( string, 2 ) as string from t where 1=2 model reference r on ( select row_number() over (ORDER BY VAL DESC) as val_number , group_key , val from t ) dimension by( val_number ) measures ( group_key, val ) main t_main dimension by( cast(null as integer) as i, group_key ) measures ( cast(null as varchar2(65)) as string ) -- note 1 rules upsert iterate( 1000000 ) -- note 2 until ( presentv(r.val[iteration_number+1],1,0) = 0 ) ( string [ for i from 1 to sign(iteration_number) increment 1, r.group_key[iteration_number] ] = string[cv(),cv()] || ',' || r.val[iteration_number] ) order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 b,a Group 3 c,b,a Group 4 b,b,a,a Group 5 ,e,d,b,a Group 6 (null) Group 7 , Group 8 ,,b,b,a,a (null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8, from the end result replace the base table with an inline view in the reference model query.
select group_key, substr( string, 2 ) as string from t where 1=2 model reference r on ( select row_number() over (order by val) as val_number , group_key , val from (SELECT DISTINCT GROUP_KEY, VAL FROM T) ) dimension by( val_number ) measures ( group_key, val ) main t_main dimension by( cast(null as integer) as i, group_key ) measures ( cast(null as varchar2(65)) as string ) -- note 1 rules upsert iterate( 1000000 ) -- note 2 until ( presentv(r.val[iteration_number+1],1,0) = 0 ) ( string [ for i from 1 to sign(iteration_number) increment 1, r.group_key[iteration_number] ] = string[cv(),cv()] || ',' || r.val[iteration_number] ) order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,b Group 5 a,b,d,e, Group 6 (null) Group 7 (null) Group 8 a,b, (null) (null)
To filter null VAL values, like the ones in Groups 5 and higher, from the end result add a WHERE clause to the reference model query.
select group_key, substr( string, 2 ) as string from t where 1=2 model reference r on ( select row_number() over (order by val) as val_number , group_key , val from t WHERE VAL IS NOT NULL ) dimension by( val_number ) measures ( group_key, val ) main t_main dimension by( cast(null as integer) as i, group_key ) measures ( cast(null as varchar2(65)) as string ) -- note 1 rules upsert iterate( 1000000 ) -- note 2 until ( presentv(r.val[iteration_number+1],1,0) = 0 ) ( string [ for i from 1 to sign(iteration_number) increment 1, r.group_key[iteration_number] ] = string[cv(),cv()] || ',' || r.val[iteration_number] ) order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e Group 8 a,a,b,b
In the next tutorial, MODEL Method 3 - Explained, we look at how this technique works.