String to Columns

XML Values - 1

This tutorial shows how to extract values from strings of XML elements like this one

DATA_SET                  STRING
------------------------- ----------------------------------------
XML - 1                   <V>a</V> <V>b,b</V> <V>c c c</V>
 

and return the values in separate columns like this.

DATA_SET                  VAL_1  VAL_2  VAL_3
------------------------- ------ ------ ------
XML - 1                   a      b,b    c c c
 

Warning: The snippets in this tutorial do not work as expected when the source data contains NULL values (as tested in Oracle 10g XE, see Gotchas below). NULL values have therefore been excluded from the test data.

There are a number of XML related features we can use in a solution. However, before using these features we must first convert our test strings into XMLTYPE data types.

Converting VARCHAR2 to XMLTYPE

Here is an example of how to use the XMLPARSE function to convert a VARCHAR2 value containing an XML fragment into an XMLTYPE datatype.

set recsep each

select num, string, XMLPARSE( CONTENT string ) as xml_instance
from   t1_xml_1
order by num ;
 
 NUM STRING                                   XML_INSTANCE
---- ---------------------------------------- ----------------------------------------
   1 <V>a</V>                                 <V>a</V>

   2 <V>a</V> <V></V>                         <V>a</V> <V></V>

   3 <V>a</V> <V></V> <V></V>                 <V>a</V> <V></V> <V></V>

   4 <V></V> <V>b,b</V> <V></V>               <V></V> <V>b,b</V> <V></V>

   5 <V></V> <V></V> <V>c c c</V>             <V></V> <V></V> <V>c c c</V>

   6 <V>a</V> <V>b,b</V>                      <V>a</V> <V>b,b</V>

   7 <V>a</V> <V>b,b</V> <V></V>              <V>a</V> <V>b,b</V> <V></V>

   8 <V>a</V> <V>b,b</V> <V>c c c</V>         <V>a</V> <V>b,b</V> <V>c c c</V>

   9 <V></V>                                  <V></V>

  10 <V></V> <V></V>                          <V></V> <V></V>

  11 <V></V> <V></V> <V></V>                  <V></V> <V></V> <V></V>
 

If we can guarantee that our strings always resolve to well-formed XML documents then we can use the WELLFORMED keyword to spare the database from performing redundant validity checks.

select num, xmlparse( content string WELLFORMED ) as xml_instance
from   t1_xml_1
order by num ;
 
 NUM XML_INSTANCE
---- ----------------------------------------
   1 <V>a</V>

   2 <V>a</V> <V></V>

   3 <V>a</V> <V></V> <V></V>

   4 <V></V> <V>b,b</V> <V></V>

   5 <V></V> <V></V> <V>c c c</V>

   6 <V>a</V> <V>b,b</V>

   7 <V>a</V> <V>b,b</V> <V></V>

   8 <V>a</V> <V>b,b</V> <V>c c c</V>

   9 <V></V>

  10 <V></V> <V></V>

  11 <V></V> <V></V> <V></V>
 

In PL/SQL the conversion is slightly more complex since the XMLPARSE function is not available (as tested in Oracle 10g XE). Attempting to use XMLPARSE will generate a PLS-00103 error.

declare
  v varchar2(50) := '<V>a</V> <V>b,b</V>' ;
  x xmltype ;
begin
  x := xmlparse( content v wellformed ) ;
end;
/
 
ERROR at line 5:
ORA-06550: line 5, column 26:
PLS-00103: Encountered the symbol "V" when expecting one of the following: ...
 

One PL/SQL alternative is to use a SELECT statement to perform the conversion.

declare
  v varchar2(50) := '<V>a</V> <V>b</V>' ;
  x xmltype ;
begin
  select xmlparse( content v wellformed )
  into   x
  from   dual;
end;
/

PL/SQL procedure successfully completed.

 

Another is to convert the XML string fragment into an XML document by giving it a root node. You can then use the XMLTYPE constructor to do the VARCHAR2 to XMLTYPE data type conversion.

declare
  v varchar2(50) := '<V>a</V> <V>b,b</V>' ;
  x xmltype ;
begin
  x := XMLTYPE('<ROOT>' || v || '</ROOT>');
  -- note: we can use any valid name for the root node here,
  -- it does not have to be "ROOT"
end;
/

PL/SQL procedure successfully completed.

 

Extracting Values with the EXTRACT and GETSTRINGVAL Member Functions

Once we have an XMLTYPE instance we can extract values using its EXTRACT and GETSTRINGVAL Member Functions as follows.

select
  t1_xml_1a.num ,
  t1_xml_1a.xml_instance ,
  t1_xml_1a.xml_instance.EXTRACT('/V[1]/text()').GETSTRINGVAL() as val_1 ,
  t1_xml_1a.xml_instance.EXTRACT('/V[2]/text()').GETSTRINGVAL() as val_2 ,
  t1_xml_1a.xml_instance.EXTRACT('/V[3]/text()').GETSTRINGVAL() as val_3
from
  ( select num, xmlparse( content string wellformed ) as xml_instance
    from   t1_xml_1
  ) t1_xml_1a
order by num ;
 
 NUM XML_INSTANCE                             VAL_1  VAL_2  VAL_3
---- ---------------------------------------- ------ ------ ------
   1 <V>a</V>                                 a

   2 <V>a</V> <V></V>                         a

   3 <V>a</V> <V></V> <V></V>                 a

   4 <V></V> <V>b,b</V> <V></V>                      b,b

   5 <V></V> <V></V> <V>c c c</V>                           c c c

   6 <V>a</V> <V>b,b</V>                      a      b,b

   7 <V>a</V> <V>b,b</V> <V></V>              a      b,b

   8 <V>a</V> <V>b,b</V> <V>c c c</V>         a      b,b    c c c

   9 <V></V>

  10 <V></V> <V></V>

  11 <V></V> <V></V> <V></V>
 

If our data contains missing elements (like rows 1, 2, 6, 9, and 10) and we try to use the same approach in PL/SQL we will encounter an ORA-30625 error.

begin

  dbms_output.put_line( '
num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------
' );

  for t1_xml_1_row in
  ( select num, xmlparse( content t1_xml_1.string wellformed ) as xml_instance
    from   t1_xml_1
    order by num
  ) loop
  declare

    val_1 varchar2(10) :=
      t1_xml_1_row.xml_instance.EXTRACT('/V[1]/text()').GETSTRINGVAL() ;
    val_2 varchar2(10) :=
      t1_xml_1_row.xml_instance.EXTRACT('/V[2]/text()').GETSTRINGVAL() ;
    val_3 varchar2(10) :=
      t1_xml_1_row.xml_instance.EXTRACT('/V[3]/text()').GETSTRINGVAL() ;

  begin

    dbms_output.put_line
    ( rpad( t1_xml_1_row.num, 11 ) ||
        rpad( nvl( val_1, ' ' ), 11 ) ||
        rpad( nvl( val_2, ' ' ), 11 ) ||
        rpad( nvl( val_3, ' ' ), 11 )
    );

  end;
  end loop;

end;
/

num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------

begin
*
ERROR at line 1:
ORA-30625: method dispatch on NULL SELF argument is disallowed
ORA-06512: at line 17


 

To work around the error we perform the extraction in the SELECT statement instead of the DBMS_OUTPUT command.

begin

  dbms_output.put_line( '
num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------
' );

  for t1_xml_1_row in
  ( select
      t1_xml_1a.num ,
      t1_xml_1a.xml_instance ,
      t1_xml_1a.xml_instance.EXTRACT('/V[1]/text()').GETSTRINGVAL() as val_1 ,
      t1_xml_1a.xml_instance.EXTRACT('/V[2]/text()').GETSTRINGVAL() as val_2 ,
      t1_xml_1a.xml_instance.EXTRACT('/V[3]/text()').GETSTRINGVAL() as val_3
    from
      ( select t1_xml_1.num, xmlparse(content t1_xml_1.string wellformed) as xml_instance
        from   t1_xml_1
      ) t1_xml_1a
    order by num
  ) loop

    dbms_output.put_line
    ( rpad( t1_xml_1_row.num, 11 ) ||
        rpad( nvl( t1_xml_1_row.val_1, ' ' ), 11 ) ||
        rpad( nvl( t1_xml_1_row.val_2, ' ' ), 11 ) ||
        rpad( nvl( t1_xml_1_row.val_3, ' ' ), 11 )
    );

  end loop;

end;
/

num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------

1          a
2          a
3          a
4                     b,b
5                                c c c
6          a          b,b
7          a          b,b
8          a          b,b        c c c
9
10
11

 

We can also work around the error by using the CASE operator to avoid applying getStringVal() to nonexistent elements.

begin

  dbms_output.put_line( '
num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------
' );

  for t1_xml_1_row in
  ( select num, xmlparse( content string wellformed ) as xml_instance
    from   t1_xml_1
    order by num
  ) loop
  declare

    x_val_1 xmltype := t1_xml_1_row.xml_instance.extract('/V[1]/text()') ;
    x_val_2 xmltype := t1_xml_1_row.xml_instance.extract('/V[2]/text()') ;
    x_val_3 xmltype := t1_xml_1_row.xml_instance.extract('/V[3]/text()') ;

    val_1 varchar2(10) :=
      CASE when x_val_1 is null then null else x_val_1.getStringVal() end ;
    val_2 varchar2(10) :=
      CASE when x_val_2 is null then null else x_val_2.getStringVal() end ;
    val_3 varchar2(10) :=
      CASE when x_val_3 is null then null else x_val_3.getStringVal() end ;

  begin

    dbms_output.put_line
    ( rpad( t1_xml_1_row.num, 11 ) ||
        rpad( nvl( val_1, ' ' ), 11 ) ||
        rpad( nvl( val_2, ' ' ), 11 ) ||
        rpad( nvl( val_3, ' ' ), 11 )
    );

  end;
  end loop;

end;
/

num        VAL_1      VAL_2      VAL_3
---------- ---------- ---------- ----------

1          a
2          a
3          a
4                     b,b
5                                c c c
6          a          b,b
7          a          b,b
8          a          b,b        c c c
9
10
11
 

Extracting Numbers With the EXTRACT and GETNUMBERVAL Member Functions

If our data contained only numeric values we could use the "getNumberVal()" member function instead of "getStringVal()".

column n_val_1 format 999999
column n_val_2 format 999999
column n_val_3 format 999999

select
  t1_xml_3.num ,
  t1_xml_3.xml_instance ,
  t1_xml_3.xml_instance.extract('/V[1]/text()').GETNUMBERVAL() as N_VAL_1 ,
  t1_xml_3.xml_instance.extract('/V[2]/text()').GETNUMBERVAL() as N_VAL_2 ,
  t1_xml_3.xml_instance.extract('/V[3]/text()').GETNUMBERVAL() as N_VAL_3
from
  ( select
      1 as num,
      xmlparse(content '<V>5</V> <V></V> <V>46</V>' wellformed) as xml_instance
    from   dual
  ) t1_xml_3
;
 
 NUM XML_INSTANCE                             N_VAL_1 N_VAL_2 N_VAL_3
---- ---------------------------------------- ------- ------- -------
   1 <V>5</V> <V></V> <V>46</V>                     5              46
 

Extracting Values With the SQL EXTRACTVALUE Function

Instead of using the "extract()" and "getStringVal()" member functions together we can use the SQL convenience function EXTRACTVALUE to perform the same logic.

select
  t1_xml_1a.num ,
  t1_xml_1a.xml_instance ,
  EXTRACTVALUE( t1_xml_1a.xml_instance, '/V[1]/text()' ) as VAL_1 ,
  EXTRACTVALUE( t1_xml_1a.xml_instance, '/V[2]/text()' ) as VAL_2 ,
  EXTRACTVALUE( t1_xml_1a.xml_instance, '/V[3]/text()' ) as VAL_3
from
  ( select num, xmlparse( content string wellformed ) as xml_instance
    from   t1_xml_1
  ) t1_xml_1a
order by num ;
 
 NUM XML_INSTANCE                             VAL_1  VAL_2  VAL_3
---- ---------------------------------------- ------ ------ ------
   1 <V>a</V>                                 a

   2 <V>a</V> <V></V>                         a

   3 <V>a</V> <V></V> <V></V>                 a

   4 <V></V> <V>b,b</V> <V></V>                      b,b

   5 <V></V> <V></V> <V>c c c</V>                           c c c

   6 <V>a</V> <V>b,b</V>                      a      b,b

   7 <V>a</V> <V>b,b</V> <V></V>              a      b,b

   8 <V>a</V> <V>b,b</V> <V>c c c</V>         a      b,b    c c c

   9 <V></V>

  10 <V></V> <V></V>

  11 <V></V> <V></V> <V></V>
 

Note that EXTRACTVALUE does not appear to work in PL/SQL (as tested in Oracle 10g XE). When we attempt to use EXTRACTVALUE in PL/SQL we get a PLS-00201 error.

declare
  v     varchar2(50) := '<V>a</V> <V>b</V>' ;
  val_1 varchar2(10) ;
begin
  val_1 := extractvalue( v, '/V[1]/text()' ) ;
end;
/
  val_1 := extractvalue( v, '/V[1]/text()' ) ;
           *
ERROR at line 5:
ORA-06550: line 5, column 12:
PLS-00201: identifier 'EXTRACTVALUE' must be declared
ORA-06550: line 5, column 3:
PL/SQL: Statement ignored


 

Gotchas

Column Prefixes

In SQL commands we need to prefix references to some xml instances with the table alias to avoid an ORA-00904 error. For example, this will not work

select
  xml_instance ,
  xml_instance.extract( '/V[1]/text()' ) as xml_instance
from
  ( select num, xmlparse( content string wellformed ) as xml_instance
    from   t1_xml_1
  ) t1_xml_1a
where
  num = 'String 01'
order by
  num
;
  xml_instance.extract( '/V[1]/text()' ) as xml_instance
  *
ERROR at line 3:
ORA-00904: "XML_INSTANCE"."EXTRACT": invalid identifier


 

but this will.

select
  xml_instance ,                                            -- prefix not required
  T1_XML_1A.xml_instance.extract( '/V[1]/text()' ) as val_1 -- prefix required
from
  ( select num, xmlparse( content string wellformed ) as xml_instance
    from   t1_xml_1
  ) t1_xml_1a
where num = 1
;
 
XML_INSTANCE                             VAL_1
---------------------------------------- ------
<V>a</V>                                 a
 
Case Sensitivity

The XPath string argument in functions like EXTRACT and EXTRACTVALUE is case sensitive. For example, this snippet will return a null value for the XML_ELEMENT column because "v[1]" is not the same as "V[1]".

set null "(null)"

select xmlparse( content t1_xml_1.string ).extract( '/v[1]/text()' ) as xml_instance
from   t1_xml_1
where  num = 1 ;
 
XML_INSTANCE
----------------------------------------
(null)
 
set null ""
 

This snippet generates an error because "Text()" is not a valid XPath keyword ("TEXT()" would not be either).

select xmlparse( content t1_xml_1.string ).extract( '/V[1]/Text()' ) as xml_instance
from   t1_xml_1
where  num = 1 ;
ERROR:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00601: Invalid token in: '/V[1]/Text()'
ORA-06512: at "SYS.XMLTYPE", line 111


 

This version works.

select xmlparse( content t1_xml_1.string ).extract( '/V[1]/text()' ) as xml_instance
from   t1_xml_1
where  num = 1 ;
 
XML_INSTANCE
----------------------------------------
a
 
NULL Values

As explained in the warning message at the top of the page we have deliberately excluded test data containing NULL values because NULL values produce unexpected results due to what I believe is a bug. Here are some examples that demonstrate the unexpected results.

set null "(null)"

insert into t1_xml_1 values ( 'XML - 1', 12 , null, 'NULL string' );
commit;

select num, string, xmlparse( content string ) xml_instance
from   t1_xml_1
where  num in ( 1, 12 )
order by num ;
 
 NUM STRING                                   XML_INSTANCE
---- ---------------------------------------- ----------------------------------------
   1 <V>a</V>                                 <V>a</V>

  12 (null)                                   <V>a</V>
 
select num, string, xmlparse( content string ) xml_instance
from   t1_xml_1
where  num in ( 8, 12 )
order by num ;
 
 NUM STRING                                   XML_INSTANCE
---- ---------------------------------------- ----------------------------------------
   8 <V>a</V> <V>b,b</V> <V>c c c</V>         <V>a</V> <V>b,b</V> <V>c c c</V>

  12 (null)                                   <V>a</V> <V>b,b</V> <V>c c c</V>
 
select num, string, xmlparse( content string ) xml_instance
from   t1_xml_1
where  num in ( 11, 12 )
order by num ;
 
 NUM STRING                                   XML_INSTANCE
---- ---------------------------------------- ----------------------------------------
  11 <V></V> <V></V> <V></V>                  <V></V> <V></V> <V></V>

  12 (null)                                   <V></V> <V></V> <V></V>
 
delete from t1_xml_1 where num = 12 ;
commit;
 

The values under the XML_ELEMENT column for rows where NUM=12 should be null in each of the three examples above, but instead the result varies depending on which row appears immediately before row 12. My post about this on the OTN XML DB discussion forum, XMLPARSE and NULL values , has yielded no replies to date (2007-03-29). If anyone can shed some light on the subject please email me at electronic mail address.




Linking to SQL Snippets ™

To link to this page in Oracle Technology Network Forums or OraFAQ Forums cut and paste this code.

  • [url=http://www.sqlsnippets.com/en/topic-12000.html]SQL Snippets: String to Columns - XML Values - 1[/url]

To link to this page in HTML documents or Blogger comments cut and paste this code.

  • <a href="http://www.sqlsnippets.com/en/topic-12000.html">SQL Snippets: String to Columns - XML Values - 1</a>

To link to this page in other web sites use the following values.

  • Link Text : SQL Snippets: String to Columns - XML Values - 1
  • URL (href): http://www.sqlsnippets.com/en/topic-12000.html

Revision Notes

Date Category Note
2007-03-30 Revision This tutorial was created with material moved here from topic-11608.html.