This tutorial shows how to extract delimited values from a string like this
C ---------- v1,v2,v3
and return them one value per row, like this.
POSITION VAL ---------- ---------- 1 v1 2 v2 3 v3
The technique described here uses REGEXP_REPLACE for parsing and an integer table to generate integers. REGEXP_REPLACE is only available in Oracle 10g or above.
Before we see the solution though, let's have a look at the test data we will use.
select * from t order by key ;
KEY C NOTE --- ---------- ------------------------- r01 v1 One value r02 v1, Two values, one empty r03 v1,, Three values, two empty r04 ,v2, Three values, two empty r05 ,,v3 Three values, two empty r06 v1,v2 Two values r07 v1,v2, Three values, one empty r08 v1,v2,v3 Three values, none empty r09 (null) One empty value r10 , Two values, two empty r11 ,, Three values, all empty
Now let's examine the solution.
set null "(null)" set pagesize 45 break on key skip 1 duplicates variable d varchar2(1) execute :d := ',' select ta.key , i.integer_value as position , rtrim( regexp_substr( ta.c || :d, '.*?' || :d, 1, i.integer_value ), :d ) as val from ( select key , c , length( regexp_replace( t.c || :d, '[^'||:d||']', null ) ) as value_count from t ) ta inner join ( select integer_value from integers where integer_value <= 5 ) i on ( i.integer_value <= ta.value_count ) order by key, position ;
KEY POSITION VAL --- ---------- ---------- r01 1 v1 r02 1 v1 r02 2 (null) r03 1 v1 r03 2 (null) r03 3 (null) r04 1 (null) r04 2 v2 r04 3 (null) r05 1 (null) r05 2 (null) r05 3 v3 r06 1 v1 r06 2 v2 r07 1 v1 r07 2 v2 r07 3 (null) r08 1 v1 r08 2 v2 r08 3 v3 r09 1 (null) r10 1 (null) r10 2 (null) r11 1 (null) r11 2 (null) r11 3 (null)
How It Works
To understand how the query works consider its individual components.
Component 1
First we look at values derived from T.
clear breaks select key , c , LENGTH( REGEXP_REPLACE( T.C || :D, '[^'||:D||']', NULL ) ) AS VALUE_COUNT from t order by 1 ;
KEY C VALUE_COUNT --- ---------- ----------- r01 v1 1 r02 v1, 2 r03 v1,, 3 r04 ,v2, 3 r05 ,,v3 3 r06 v1,v2 2 r07 v1,v2, 3 r08 v1,v2,v3 3 r09 (null) 1 r10 , 2 r11 ,, 3
In this query VALUE_COUNT is simply the total number of delimited values in the string, determined by counting all the delimiter characters in the string plus one. If your delimiter is more than one character long you will need to adjust this logic to return a correct count.
Component 2
The next component is a simple integer series generator.
clear breaks select integer_value from integers where integer_value <= 5 ;
INTEGER_VALUE ------------- 1 2 3 4 5
It returns a number of rows equal to the maximum number we expect to find in VALUE_COUNT. In our example we expect no more than 5 delimited values per string.
This technique is described in more detail at Integer Series Generators: Integer Table Method.
Components 1 + 2
Next we combine components 1 and 2 to form a single query that returns one row for each delimited value in each STRING.
break on key skip 1 duplicates select ta.key, ta.c, ta.value_count, i.integer_value from ( select key , c , LENGTH( REGEXP_REPLACE( T.C || :D, '[^'||:D||']', NULL ) ) AS VALUE_COUNT from t ) ta inner join ( SELECT INTEGER_VALUE FROM INTEGERS WHERE INTEGER_VALUE <= 5 ) I ON ( I.INTEGER_VALUE <= TA.VALUE_COUNT ) order by 1, 2, 3, 4 ;
KEY C VALUE_COUNT INTEGER_VALUE --- ---------- ----------- ------------- r01 v1 1 1 r02 v1, 2 1 r02 v1, 2 2 r03 v1,, 3 1 r03 v1,, 3 2 r03 v1,, 3 3 r04 ,v2, 3 1 r04 ,v2, 3 2 r04 ,v2, 3 3 r05 ,,v3 3 1 r05 ,,v3 3 2 r05 ,,v3 3 3 r06 v1,v2 2 1 r06 v1,v2 2 2 r07 v1,v2, 3 1 r07 v1,v2, 3 2 r07 v1,v2, 3 3 r08 v1,v2,v3 3 1 r08 v1,v2,v3 3 2 r08 v1,v2,v3 3 3 r09 (null) 1 1 r10 , 2 1 r10 , 2 2 r11 ,, 3 1 r11 ,, 3 2 r11 ,, 3 3
Component 3
To extract individual values from each string we use the parsing logic described in the String to Columns: Separated Values - 10g+ tutorial. Here are some brief examples of how this logic extracts values from given positions in a delimited string.
column c2 format a11 column value_1 format a7 column value_2 format a7 select c || :d as c2, rtrim( regexp_substr( c || :d, '.*?' || :d, 1, 1 ), :d ) as value_1 from t where key = 'r08' ;
C2 VALUE_1 ----------- ------- v1,v2,v3, v1
select c || :d as c2, rtrim( regexp_substr( c || :d, '.*?' || :d, 1, 2 ), :d ) as value_2 from t where key = 'r08' ;
C2 VALUE_2 ----------- ------- v1,v2,v3, v2
Final Solution
Adding the parsing logic from Component 3 to the query from "Components 1 + 2" yields the final solution.
select ta.key , ta.c , i.integer_value as position , rtrim( regexp_substr( ta.c || :d, '.*?' || :d, 1, i.integer_value ), :d ) as val from ( select key , c , length( regexp_replace( t.c || :d, '[^'||:d||']', null ) ) as value_count from t ) ta inner join ( select integer_value from integers where integer_value <= 5 ) i on ( i.integer_value <= ta.value_count ) order by 1, 2, 3 ;
KEY C POSITION VAL --- ---------- ---------- ---------- r01 v1 1 v1 r02 v1, 1 v1 r02 v1, 2 (null) r03 v1,, 1 v1 r03 v1,, 2 (null) r03 v1,, 3 (null) r04 ,v2, 1 (null) r04 ,v2, 2 v2 r04 ,v2, 3 (null) r05 ,,v3 1 (null) r05 ,,v3 2 (null) r05 ,,v3 3 v3 r06 v1,v2 1 v1 r06 v1,v2 2 v2 r07 v1,v2, 1 v1 r07 v1,v2, 2 v2 r07 v1,v2, 3 (null) r08 v1,v2,v3 1 v1 r08 v1,v2,v3 2 v2 r08 v1,v2,v3 3 v3 r09 (null) 1 (null) r10 , 1 (null) r10 , 2 (null) r11 ,, 1 (null) r11 ,, 2 (null) r11 ,, 3 (null)