Rows to String: MODEL Method 2

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL
---------- ----------
Group 3    a
Group 3    b
Group 3    c

and yields a string like this.

GROUP_KEY  STRING
---------- ---------------
Group 3    a,b,c

The solution to follow, which we will refer to as MODEL Method 2, is a good choice when the GROUP_KEY values we wish to report on come from one table and the string values come from another. Note that it performs and scales better than the solution in MODEL Method 1, though the results differ slightly with respect to null values.

Test Data

desc t
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 GROUP_KEY                              VARCHAR2(10)
 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL
---------- ----------
Group 1    a

Group 2    a
Group 2    b

Group 3    a
Group 3    b
Group 3    c

Group 4    a
Group 4    a
Group 4    b
Group 4    b

Group 5    a
Group 5    b
Group 5    d
Group 5    e
Group 5    (null)

Group 6    (null)

Group 7    (null)
Group 7    (null)

Group 8    a
Group 8    a
Group 8    b
Group 8    b
Group 8    (null)
Group 8    (null)

(null)     (null)

clear breaks

desc t_group_keys
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GROUP_KEY                                          VARCHAR2(10)


select * from t_group_keys order by 1 ;

GROUP_KEY
----------
Group 1
Group 2
Group 3
Group 4
Group 5
Group 6
Group 7
Group 8
(null)

Solution

This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with the MODEL clause you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.

select group_key, string
from   t_group_keys
model
  reference r on
    ( select
        group_key ,
        row_number() over ( partition by group_key order by val ) - 1
          as position ,
        val
      from t
    )
    dimension by ( group_key, position )
    measures     ( val )
  main m
    dimension by ( group_key )
    measures     ( cast( null as varchar2(65) ) as string ) -- note 1
    rules
      update
      iterate( 6 ) -- note 2
      (
        string[ any ] =
          string[cv()] ||
          case presentv( r.val[cv(group_key),iteration_number],1,0)
          when 0 then null
          else
            case iteration_number when 0 then null else ',' end ||
            r.val[cv(group_key),iteration_number]
          end
      )
order by group_key ;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    a,b
Group 3    a,b,c
Group 4    a,a,b,b
Group 5    a,b,d,e,
Group 6    (null)
Group 7    ,
Group 8    a,a,b,b,,
(null)     (null)

Notes:

65 = v * n + (n-1)
6 = n

v = maximum length of VAL values = 10
n = maximum number of rows with the same GROUP_KEY = 6

To control the sort order of the delimited values within the STRING we can change the ORDER BY clause in the expression for the POSITION column.

select group_key, string
from   t_group_keys
model
  reference r on
    ( select
        group_key ,
        row_number() over ( partition by group_key ORDER BY VAL DESC ) - 1
          as position ,
        val
      from t
    )
    dimension by ( group_key, position )
    measures     ( val )
  main m
    dimension by ( group_key )
    measures     ( cast( null as varchar2(65) ) as string ) -- note 1
    rules
      update
      iterate( 6 ) -- note 2
      (
        string[ any ] =
          string[cv()] ||
          case presentv( r.val[cv(group_key),iteration_number],1,0)
          when 0 then null
          else
            case iteration_number when 0 then null else ',' end ||
            r.val[cv(group_key),iteration_number]
          end
      )
order by group_key ;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    b,a
Group 3    c,b,a
Group 4    b,b,a,a
Group 5    ,e,d,b,a
Group 6    (null)
Group 7    ,
Group 8    ,,b,b,a,a
(null)     (null)

To filter duplicate VAL values, like those in Groups 4, 7, and 8, from the end result we change the definition of the LAST_POSITION column and add an inline view to the reference query.

select group_key, string
from   t_group_keys
model
  reference r on
    ( select
        group_key ,
        row_number() over ( partition by group_key order by val ) - 1
          as position ,
        val
      from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) TA
    )
    dimension by ( group_key, position )
    measures     ( val )
  main m
    dimension by ( group_key )
    measures     ( cast( null as varchar2(65) ) as string ) -- note 1
    rules
      update
      iterate( 6 ) -- note 2
      (
        string[ any ] =
          string[cv()] ||
          case presentv( r.val[cv(group_key),iteration_number],1,0)
          when 0 then null
          else
            case iteration_number when 0 then null else ',' end ||
            r.val[cv(group_key),iteration_number]
          end
      )
order by group_key ;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    a,b
Group 3    a,b,c
Group 4    a,b
Group 5    a,b,d,e,
Group 6    (null)
Group 7    (null)
Group 8    a,b,
(null)     (null)

To filter null VAL values, like the ones in Groups 5 and 8 from the end result we add WHERE VAL IS NOT NULL to the two inline views.

select group_key, string
from   t_group_keys
model
  reference r on
    ( select
        group_key ,
        row_number() over ( partition by group_key order by val ) - 1
          as position ,
        val
      from t
      WHERE VAL IS NOT NULL
    )
    dimension by ( group_key, position )
    measures     ( val )
  main m
    dimension by ( group_key )
    measures     ( cast( null as varchar2(65) ) as string ) -- note 1
    rules
      update
      iterate( 6 ) -- note 2
      (
        string[ any ] =
          string[cv()] ||
          case presentv( r.val[cv(group_key),iteration_number],1,0)
          when 0 then null
          else
            case iteration_number when 0 then null else ',' end ||
            r.val[cv(group_key),iteration_number]
          end
      )
order by group_key ;

GROUP_KEY  STRING
---------- ---------------
Group 1    a
Group 2    a,b
Group 3    a,b,c
Group 4    a,a,b,b
Group 5    a,b,d,e
Group 6    (null)
Group 7    (null)
Group 8    a,a,b,b
(null)     (null)

Unlike the solution in MODEL Method 1, here we see that rows are returned for groups 6, 7, and "null". The solution's design effectively performs a left outer join between T_GROUP_KEYS and T.

In the next tutorial, MODEL Method 2 - Explained we look at how this technique works.

SQL Snippets ™: Tutorials for Oracle Developers

Rows to String

MODEL Method 2

Test Data

Solution

Linking to SQL Snippets ™

Revision Notes