This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes a table that looks like this
GROUP_KEY VAL ---------- ---------- Group 3 a Group 3 b Group 3 c
and yields a set of strings that look like this.
GROUP_KEY STRING ---------- --------------- Group 3 a,b,c
The following solution, which we refer to as MODEL Method 3, is a good choice when GROUP_KEY and VAL values are only available in the same table. (If GROUP_KEY and VAL are available in separate tables see MODEL Method 2).
Test Data
desc t Name Null? Type ----------------------------- -------- -------------------- GROUP_KEY VARCHAR2(10) VAL VARCHAR2(10) set null "(null)" break on group_key skip 1 duplicates select * from t order by group_key, val ;
GROUP_KEY VAL ---------- ---------- Group 1 a Group 2 a Group 2 b Group 3 a Group 3 b Group 3 c Group 4 a Group 4 a Group 4 b Group 4 b Group 5 a Group 5 b Group 5 d Group 5 e Group 5 (null) Group 6 (null) Group 7 (null) Group 7 (null) Group 8 a Group 8 a Group 8 b Group 8 b Group 8 (null) Group 8 (null) (null) (null)
Solution
This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with the MODEL clause you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.
clear breaks
select group_key, substr( string, 2 ) as string
from t
where 1=2
model
reference r
on
(
select
row_number() over (order by val) as val_number ,
group_key ,
val
from t
)
dimension by( val_number )
measures ( group_key, val )
main t_main
dimension by( cast(null as integer) as i, group_key )
measures ( cast(null as varchar2(65)) as string ) -- note 1
rules
upsert
iterate( 1000000 ) -- note 2
until ( presentv(r.val[iteration_number+1],1,0) = 0 )
(
string
[ for i from 1 to sign(iteration_number) increment 1,
r.group_key[iteration_number]
] = string[cv(),cv()] || ',' || r.val[iteration_number]
)
order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e, Group 6 (null) Group 7 , Group 8 a,a,b,b,, (null) (null)
Notes:
- 65 = v * ng + (ng-1) where
- v = maximum length of VAL values, i.e. 10
- ng = maximum number of VALs with the same GROUP_KEY, i.e. 6
- nt = total number of rows in T
- use any number >= nt + 1
To control the order of the delimited values, change the ORDER BY argument in the ROW_NUMBER() function.
select group_key, substr( string, 2 ) as string
from t
where 1=2
model
reference r
on
(
select
row_number() over (ORDER BY VAL DESC) as val_number ,
group_key ,
val
from t
)
dimension by( val_number )
measures ( group_key, val )
main t_main
dimension by( cast(null as integer) as i, group_key )
measures ( cast(null as varchar2(65)) as string ) -- note 1
rules
upsert
iterate( 1000000 ) -- note 2
until ( presentv(r.val[iteration_number+1],1,0) = 0 )
(
string
[ for i from 1 to sign(iteration_number) increment 1,
r.group_key[iteration_number]
] = string[cv(),cv()] || ',' || r.val[iteration_number]
)
order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 b,a Group 3 c,b,a Group 4 b,b,a,a Group 5 ,e,d,b,a Group 6 (null) Group 7 , Group 8 ,,b,b,a,a (null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8, from the end result replace the base table with an inline view in the reference model query.
select group_key, substr( string, 2 ) as string
from t
where 1=2
model
reference r
on
(
select
row_number() over (order by val) as val_number ,
group_key ,
val
from (SELECT DISTINCT GROUP_KEY, VAL FROM T)
)
dimension by( val_number )
measures ( group_key, val )
main t_main
dimension by( cast(null as integer) as i, group_key )
measures ( cast(null as varchar2(65)) as string ) -- note 1
rules
upsert
iterate( 1000000 ) -- note 2
until ( presentv(r.val[iteration_number+1],1,0) = 0 )
(
string
[ for i from 1 to sign(iteration_number) increment 1,
r.group_key[iteration_number]
] = string[cv(),cv()] || ',' || r.val[iteration_number]
)
order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,b Group 5 a,b,d,e, Group 6 (null) Group 7 (null) Group 8 a,b, (null) (null)
To filter null VAL values, like the ones in Groups 5 and higher, from the end result add a WHERE clause to the reference model query.
select group_key, substr( string, 2 ) as string
from t
where 1=2
model
reference r
on
(
select
row_number() over (order by val) as val_number ,
group_key ,
val
from t
WHERE VAL IS NOT NULL
)
dimension by( val_number )
measures ( group_key, val )
main t_main
dimension by( cast(null as integer) as i, group_key )
measures ( cast(null as varchar2(65)) as string ) -- note 1
rules
upsert
iterate( 1000000 ) -- note 2
until ( presentv(r.val[iteration_number+1],1,0) = 0 )
(
string
[ for i from 1 to sign(iteration_number) increment 1,
r.group_key[iteration_number]
] = string[cv(),cv()] || ',' || r.val[iteration_number]
)
order by 1 ;
GROUP_KEY STRING ---------- --------------- Group 1 a Group 2 a,b Group 3 a,b,c Group 4 a,a,b,b Group 5 a,b,d,e Group 8 a,a,b,b
In the next tutorial, MODEL Method 3 - Explained, we look at how this technique works.
