Regular Expressions

REGEXP_COUNT

Introduced in Oracle 11g the REGEXP_COUNT function greatly simplifies counting the number of times a pattern appears inside a string. The following examples demonstrate how to use REGEXP_COUNT with some simple patterns. For versions prior to 11g alternative logic that provides similar functionality is also presented.

Before proceeding however, the following code will set up some test data for us (see Setup if you want to create the SET_TARGET and ADD_TARGET procedures on your own test system).

execute set_target( 'aaa'        )

execute add_target( 'a b c'      )
execute add_target( 'b  b  b  b' )
execute add_target( '  bbb xx  ' )
execute add_target( 'wdef def w' )
execute add_target( 'defxdefdef' )
execute add_target( null         )

set null "(null)"

column target        heading "TARGET|STRING"
column pattern_count heading "PATTERN|COUNT"
 

Count String Pattern Occurrences (Oracle 11g)

To count the occurrences of the letter "b" in the target strings we can use a query like this.

select
  target ,
  REGEXP_COUNT( TARGET, 'b' ) as pattern_count
from
  targets
;
 
TARGET        PATTERN
STRING          COUNT
---------- ----------
aaa                 0
a b c               1
b  b  b  b          4
  bbb xx            3
wdef def w          0
defxdefdef          0
(null)     (null)
 

This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.

select
  target ,
  REGEXP_COUNT( TARGET, '[acf]' ) as pattern_count
from
  targets
;
 
TARGET        PATTERN
STRING          COUNT
---------- ----------
aaa                 3
a b c               2
b  b  b  b          0
  bbb xx            0
wdef def w          2
defxdefdef          3
(null)     (null)
 

To calculate the number of times a string like "def" occurs we use a query like this.

select
  target ,
  REGEXP_COUNT( TARGET, 'def' ) as pattern_count
from
  targets
;
 
TARGET        PATTERN
STRING          COUNT
---------- ----------
aaa                 0
a b c               0
b  b  b  b          0
  bbb xx            0
wdef def w          2
defxdefdef          3
(null)     (null)
 

To find the number of words in a string we can use a query like this (for our purposes we define a word as any series of characters that are not a space).

select
  target ,
  REGEXP_COUNT( TARGET, '[^ ]+' ) as pattern_count
from
  targets
;
 
TARGET        PATTERN
STRING          COUNT
---------- ----------
aaa                 1
a b c               3
b  b  b  b          4
  bbb xx            2
wdef def w          3
defxdefdef          1
(null)     (null)
 

To ensure PATTERN_COUNT always returns a non-null value use NVL.

select
  target ,
  NVL( regexp_count( target, '[^ ]+' ), 0 ) as pattern_count
from
  targets
;
 
TARGET        PATTERN
STRING          COUNT
---------- ----------
aaa                 1
a b c               3
b  b  b  b          4
  bbb xx            2
wdef def w          3
defxdefdef          1
(null)              0
 

To learn more about regular expression patterns in general see Pattern Matching.

Count String Pattern Occurrences (Oracle 10g)

While Oracle 10g supports most of the regular expression features described in this section it unfortunately does not support REGEXP_COUNT. However, we can use the LENGTH and REGEXP_REPLACE functions to produce similar results.

To count the occurrences of the letter "b" in the target strings we can use a query like this (the column INTERMEDIATE_STRING is included to demonstrate the inner workings of each solution).

column intermediate_string format a12 heading "INTERMEDIATE|STRING"

select
  target ,
  regexp_replace( target, '[^b]', null )                     as intermediate_string ,
  NVL( LENGTH( REGEXP_REPLACE( TARGET, '[^b]', NULL ) ), 0 ) as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE    PATTERN
STRING     STRING            COUNT
---------- ------------ ----------
aaa        (null)                0
a b c      b                     1
b  b  b  b bbbb                  4
  bbb xx   bbb                   3
wdef def w (null)                0
defxdefdef (null)                0
(null)     (null)                0
 

This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.

select
  target ,
  regexp_replace( target, '[^acf]', null )                     as intermediate_string ,
  NVL( LENGTH( REGEXP_REPLACE( TARGET, '[^acf]', NULL ) ), 0 ) as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE    PATTERN
STRING     STRING            COUNT
---------- ------------ ----------
aaa        aaa                   3
a b c      ac                    2
b  b  b  b (null)                0
  bbb xx   (null)                0
wdef def w ff                    2
defxdefdef fff                   3
(null)     (null)                0
 

To calculate the number of times a string like "def" occurs we use a query like this.

select
  target ,
  regexp_replace( target, '(d)ef|.', '\1' )                     as intermediate_string ,
  NVL( LENGTH( REGEXP_REPLACE( TARGET, '(d)ef|.', '\1' ) ), 0 ) as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE    PATTERN
STRING     STRING            COUNT
---------- ------------ ----------
aaa        (null)                0
a b c      (null)                0
b  b  b  b (null)                0
  bbb xx   (null)                0
wdef def w dd                    2
defxdefdef ddd                   3
(null)     (null)                0
 

(I found this last technique in this OTN Discussion Forum thread by cd after reading this blog post by Volder.)

To find the number of words in a string we can use a query like this.

select
  target ,
  regexp_replace( target, ' *[^ ]+ *', 'w' )                     as intermediate_string ,
  NVL( LENGTH( REGEXP_REPLACE( TARGET, ' *[^ ]+ *', 'w' ) ), 0 ) as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE    PATTERN
STRING     STRING            COUNT
---------- ------------ ----------
aaa        w                     1
a b c      www                   3
b  b  b  b wwww                  4
  bbb xx   ww                    2
wdef def w www                   3
defxdefdef w                     1
(null)     (null)                0
 

Count String Pattern Occurrences (Any Oracle Version)

In any Oracle version we can use the LENGTH and REPLACE functions to count patterns, though the logic is slightly more complex than the techniques above.

To count the occurrences of the letter "b" in the target strings we can use a query like this.

column target_length       heading "TARGET|LENGTH"
column intermediate_length heading "INTERMEDIATE|LENGTH"

select
  target ,
  replace( target, 'b', null )                     as intermediate_string ,
  nvl( length( target ), 0 )                       as target_length ,
  nvl( length( replace( target, 'b', null ) ), 0 ) as intermediate_length ,
  NVL( LENGTH( TARGET ), 0 )
    - NVL( LENGTH( REPLACE( TARGET, 'b', NULL ) ), 0 )
    as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE     TARGET INTERMEDIATE    PATTERN
STRING     STRING           LENGTH       LENGTH      COUNT
---------- ------------ ---------- ------------ ----------
aaa        aaa                   3            3          0
a b c      a  c                  5            4          1
b  b  b  b                      10            6          4
  bbb xx      xx                10            7          3
wdef def w wdef def w           10           10          0
defxdefdef defxdefdef           10           10          0
(null)     (null)                0            0          0
 

(The INTERMEDIATE_STRING value in the third row is six spaces. In the fourth row it is three spaces followed by "xx" followed by two spaces.)

This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.

select
  target ,
  translate( target, 'xacf', 'x' )                     as intermediate_string ,
  nvl( length( target ), 0 )                           as target_length ,
  nvl( length( translate( target, 'xacf', 'x' ) ), 0 ) as intermediate_length ,
  NVL( LENGTH( TARGET ), 0 )
    - NVL( LENGTH( TRANSLATE( TARGET, 'xacf', 'x' ) ), 0 )
    as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE     TARGET INTERMEDIATE    PATTERN
STRING     STRING           LENGTH       LENGTH      COUNT
---------- ------------ ---------- ------------ ----------
aaa        (null)                3            0          3
a b c       b                    5            3          2
b  b  b  b b  b  b  b           10           10          0
  bbb xx     bbb xx             10           10          0
wdef def w wde de w             10            8          2
defxdefdef dexdede              10            7          3
(null)     (null)                0            0          0
 

To calculate the number of times a string like "def" occurs we use a query like this.

column intermediate_string_1 format a12 heading "INTERMEDIATE|STRING 1"
column intermediate_string_2 format a12 heading "INTERMEDIATE|STRING 2"

select
  target ,
  replace( target, 'def', 'd' )                      as intermediate_string_1 ,
  replace( target, 'def', null )                     as intermediate_string_2 ,
  nvl( length( replace( target, 'def', 'd'  ) ), 0 ) as length_1 ,
  nvl( length( replace( target, 'def', null ) ), 0 ) as length_2 ,
  NVL( LENGTH( REPLACE( TARGET, 'def', 'd' ) ), 0 )
    - NVL( LENGTH( REPLACE( TARGET, 'def', NULL ) ), 0 )
    as pattern_count
from
  targets
;
 
TARGET     INTERMEDIATE INTERMEDIATE                          PATTERN
STRING     STRING 1     STRING 2       LENGTH_1   LENGTH_2      COUNT
---------- ------------ ------------ ---------- ---------- ----------
aaa        aaa          aaa                   3          3          0
a b c      a b c        a b c                 5          5          0
b  b  b  b b  b  b  b   b  b  b  b           10         10          0
  bbb xx     bbb xx       bbb xx             10         10          0
wdef def w wd d w       w  w                  6          4          2
defxdefdef dxdd         x                     4          1          3
(null)     (null)       (null)                0          0          0
 

To find the number of words in a string we can use a query like this. Note the query will not work for target strings containing more than one space between words.

column trimmed_length_plus_1 heading "TRIMMED LENGTH|PLUS 1"

select
  target ,
  trimmed ,
  replace( trimmed, ' ', null )                     as intermediate_string ,
  nvl( length( trimmed ) + 1, 0 )                   as trimmed_length_plus_1 ,
  nvl( length( replace( trimmed, ' ', null ) ), 0 ) as intermediate_length ,
  NVL( LENGTH( TRIMMED ) + 1, 0 )
    - NVL( LENGTH( REPLACE( TRIMMED, ' ', NULL ) ), 0 )
    as pattern_count
from
  ( select target, ltrim( rtrim(target) ) as trimmed from targets )
where
  nvl( ltrim( rtrim( target ) ), 'x' ) not like '%  %'
;
 
TARGET                     INTERMEDIATE TRIMMED LENGTH INTERMEDIATE    PATTERN
STRING     TRIMMED         STRING               PLUS 1       LENGTH      COUNT
---------- --------------- ------------ -------------- ------------ ----------
aaa        aaa             aaa                       4            3          1
a b c      a b c           abc                       6            3          3
  bbb xx   bbb xx          bbbxx                     7            5          2
wdef def w wdef def w      wdefdefw                 11            8          3
defxdefdef defxdefdef      defxdefdef               11           10          1
(null)     (null)          (null)                    0            0          0
 

In Oracle 8i or later the LTRM and RTRIM functions can be replaced with a single TRIM call.

select
  target ,
  trimmed ,
  replace( trimmed, ' ', null )                     as intermediate_string ,
  nvl( length( trimmed ) + 1, 0 )                   as trimmed_length_plus_1 ,
  nvl( length( replace( trimmed, ' ', null ) ), 0 ) as intermediate_length ,
  nvl( length( trimmed ) + 1, 0 )
    - nvl( length( replace( trimmed, ' ', null ) ), 0 )
    as pattern_count
from
  ( select target, TRIM( target ) as trimmed from targets )
where
  nvl( TRIM( target ), 'x' ) not like '%  %'
;
 
TARGET                     INTERMEDIATE TRIMMED LENGTH INTERMEDIATE    PATTERN
STRING     TRIMMED         STRING               PLUS 1       LENGTH      COUNT
---------- --------------- ------------ -------------- ------------ ----------
aaa        aaa             aaa                       4            3          1
a b c      a b c           abc                       6            3          3
  bbb xx   bbb xx          bbbxx                     7            5          2
wdef def w wdef def w      wdefdefw                 11            8          3
defxdefdef defxdefdef      defxdefdef               11           10          1
(null)     (null)          (null)                    0            0          0
 



Linking to SQL Snippets ™

To link to this page in Oracle Technology Network Forums or OraFAQ Forums cut and paste this code.

  • [url=http://www.sqlsnippets.com/en/topic-12818.html]SQL Snippets: Regular Expressions - REGEXP_COUNT[/url]

To link to this page in HTML documents or Blogger comments cut and paste this code.

  • <a href="http://www.sqlsnippets.com/en/topic-12818.html">SQL Snippets: Regular Expressions - REGEXP_COUNT</a>

To link to this page in other web sites use the following values.

  • Link Text : SQL Snippets: Regular Expressions - REGEXP_COUNT
  • URL (href): http://www.sqlsnippets.com/en/topic-12818.html