Introduced in Oracle 11g the REGEXP_COUNT function greatly simplifies counting the number of times a pattern appears inside a string. The following examples demonstrate how to use REGEXP_COUNT with some simple patterns. For versions prior to 11g alternative logic that provides similar functionality is also presented.
Before proceeding however, the following code will set up some test data for us (see Setup if you want to create the SET_TARGET and ADD_TARGET procedures on your own test system).
execute set_target( 'aaa' ) execute add_target( 'a b c' ) execute add_target( 'b b b b' ) execute add_target( ' bbb xx ' ) execute add_target( 'wdef def w' ) execute add_target( 'defxdefdef' ) execute add_target( null ) set null "(null)" column target heading "TARGET|STRING" column pattern_count heading "PATTERN|COUNT"
Count String Pattern Occurrences (Oracle 11g)
To count the occurrences of the letter "b" in the target strings we can use a query like this.
select target , REGEXP_COUNT( TARGET, 'b' ) as pattern_count from targets ;
TARGET PATTERN STRING COUNT ---------- ---------- aaa 0 a b c 1 b b b b 4 bbb xx 3 wdef def w 0 defxdefdef 0 (null) (null)
This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.
select target , REGEXP_COUNT( TARGET, '[acf]' ) as pattern_count from targets ;
TARGET PATTERN STRING COUNT ---------- ---------- aaa 3 a b c 2 b b b b 0 bbb xx 0 wdef def w 2 defxdefdef 3 (null) (null)
To calculate the number of times a string like "def" occurs we use a query like this.
select target , REGEXP_COUNT( TARGET, 'def' ) as pattern_count from targets ;
TARGET PATTERN STRING COUNT ---------- ---------- aaa 0 a b c 0 b b b b 0 bbb xx 0 wdef def w 2 defxdefdef 3 (null) (null)
To find the number of words in a string we can use a query like this (for our purposes we define a word as any series of characters that are not a space).
select target , REGEXP_COUNT( TARGET, '[^ ]+' ) as pattern_count from targets ;
TARGET PATTERN STRING COUNT ---------- ---------- aaa 1 a b c 3 b b b b 4 bbb xx 2 wdef def w 3 defxdefdef 1 (null) (null)
To ensure PATTERN_COUNT always returns a non-null value use NVL.
select target , NVL( regexp_count( target, '[^ ]+' ), 0 ) as pattern_count from targets ;
TARGET PATTERN STRING COUNT ---------- ---------- aaa 1 a b c 3 b b b b 4 bbb xx 2 wdef def w 3 defxdefdef 1 (null) 0
To learn more about regular expression patterns in general see Pattern Matching.
Count String Pattern Occurrences (Oracle 10g)
While Oracle 10g supports most of the regular expression features described in this section it unfortunately does not support REGEXP_COUNT. However, we can use the LENGTH and REGEXP_REPLACE functions to produce similar results.
To count the occurrences of the letter "b" in the target strings we can use a query like this (the column INTERMEDIATE_STRING is included to demonstrate the inner workings of each solution).
column intermediate_string format a12 heading "INTERMEDIATE|STRING" select target , regexp_replace( target, '[^b]', null ) as intermediate_string , NVL( LENGTH( REGEXP_REPLACE( TARGET, '[^b]', NULL ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE PATTERN STRING STRING COUNT ---------- ------------ ---------- aaa (null) 0 a b c b 1 b b b b bbbb 4 bbb xx bbb 3 wdef def w (null) 0 defxdefdef (null) 0 (null) (null) 0
This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.
select target , regexp_replace( target, '[^acf]', null ) as intermediate_string , NVL( LENGTH( REGEXP_REPLACE( TARGET, '[^acf]', NULL ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE PATTERN STRING STRING COUNT ---------- ------------ ---------- aaa aaa 3 a b c ac 2 b b b b (null) 0 bbb xx (null) 0 wdef def w ff 2 defxdefdef fff 3 (null) (null) 0
To calculate the number of times a string like "def" occurs we use a query like this.
select target , regexp_replace( target, '(d)ef|.', '\1' ) as intermediate_string , NVL( LENGTH( REGEXP_REPLACE( TARGET, '(d)ef|.', '\1' ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE PATTERN STRING STRING COUNT ---------- ------------ ---------- aaa (null) 0 a b c (null) 0 b b b b (null) 0 bbb xx (null) 0 wdef def w dd 2 defxdefdef ddd 3 (null) (null) 0
(I found this last technique in this OTN Discussion Forum thread by cd after reading this blog post by Volder.)
To find the number of words in a string we can use a query like this.
select target , regexp_replace( target, ' *[^ ]+ *', 'w' ) as intermediate_string , NVL( LENGTH( REGEXP_REPLACE( TARGET, ' *[^ ]+ *', 'w' ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE PATTERN STRING STRING COUNT ---------- ------------ ---------- aaa w 1 a b c www 3 b b b b wwww 4 bbb xx ww 2 wdef def w www 3 defxdefdef w 1 (null) (null) 0
Count String Pattern Occurrences (Any Oracle Version)
In any Oracle version we can use the LENGTH and REPLACE functions to count patterns, though the logic is slightly more complex than the techniques above.
To count the occurrences of the letter "b" in the target strings we can use a query like this.
column target_length heading "TARGET|LENGTH" column intermediate_length heading "INTERMEDIATE|LENGTH" select target , replace( target, 'b', null ) as intermediate_string , nvl( length( target ), 0 ) as target_length , nvl( length( replace( target, 'b', null ) ), 0 ) as intermediate_length , NVL( LENGTH( TARGET ), 0 ) - NVL( LENGTH( REPLACE( TARGET, 'b', NULL ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE TARGET INTERMEDIATE PATTERN STRING STRING LENGTH LENGTH COUNT ---------- ------------ ---------- ------------ ---------- aaa aaa 3 3 0 a b c a c 5 4 1 b b b b 10 6 4 bbb xx xx 10 7 3 wdef def w wdef def w 10 10 0 defxdefdef defxdefdef 10 10 0 (null) (null) 0 0 0
(The INTERMEDIATE_STRING value in the third row is six spaces. In the fourth row it is three spaces followed by "xx" followed by two spaces.)
This query finds the total occurrences of the letters "a", "c", or "f" in the target strings.
select target , translate( target, 'xacf', 'x' ) as intermediate_string , nvl( length( target ), 0 ) as target_length , nvl( length( translate( target, 'xacf', 'x' ) ), 0 ) as intermediate_length , NVL( LENGTH( TARGET ), 0 ) - NVL( LENGTH( TRANSLATE( TARGET, 'xacf', 'x' ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE TARGET INTERMEDIATE PATTERN STRING STRING LENGTH LENGTH COUNT ---------- ------------ ---------- ------------ ---------- aaa (null) 3 0 3 a b c b 5 3 2 b b b b b b b b 10 10 0 bbb xx bbb xx 10 10 0 wdef def w wde de w 10 8 2 defxdefdef dexdede 10 7 3 (null) (null) 0 0 0
To calculate the number of times a string like "def" occurs we use a query like this.
column intermediate_string_1 format a12 heading "INTERMEDIATE|STRING 1" column intermediate_string_2 format a12 heading "INTERMEDIATE|STRING 2" select target , replace( target, 'def', 'd' ) as intermediate_string_1 , replace( target, 'def', null ) as intermediate_string_2 , nvl( length( replace( target, 'def', 'd' ) ), 0 ) as length_1 , nvl( length( replace( target, 'def', null ) ), 0 ) as length_2 , NVL( LENGTH( REPLACE( TARGET, 'def', 'd' ) ), 0 ) - NVL( LENGTH( REPLACE( TARGET, 'def', NULL ) ), 0 ) as pattern_count from targets ;
TARGET INTERMEDIATE INTERMEDIATE PATTERN STRING STRING 1 STRING 2 LENGTH_1 LENGTH_2 COUNT ---------- ------------ ------------ ---------- ---------- ---------- aaa aaa aaa 3 3 0 a b c a b c a b c 5 5 0 b b b b b b b b b b b b 10 10 0 bbb xx bbb xx bbb xx 10 10 0 wdef def w wd d w w w 6 4 2 defxdefdef dxdd x 4 1 3 (null) (null) (null) 0 0 0
To find the number of words in a string we can use a query like this. Note the query will not work for target strings containing more than one space between words.
column trimmed_length_plus_1 heading "TRIMMED LENGTH|PLUS 1" select target , trimmed , replace( trimmed, ' ', null ) as intermediate_string , nvl( length( trimmed ) + 1, 0 ) as trimmed_length_plus_1 , nvl( length( replace( trimmed, ' ', null ) ), 0 ) as intermediate_length , NVL( LENGTH( TRIMMED ) + 1, 0 ) - NVL( LENGTH( REPLACE( TRIMMED, ' ', NULL ) ), 0 ) as pattern_count from ( select target, ltrim( rtrim(target) ) as trimmed from targets ) where nvl( ltrim( rtrim( target ) ), 'x' ) not like '% %' ;
TARGET INTERMEDIATE TRIMMED LENGTH INTERMEDIATE PATTERN STRING TRIMMED STRING PLUS 1 LENGTH COUNT ---------- --------------- ------------ -------------- ------------ ---------- aaa aaa aaa 4 3 1 a b c a b c abc 6 3 3 bbb xx bbb xx bbbxx 7 5 2 wdef def w wdef def w wdefdefw 11 8 3 defxdefdef defxdefdef defxdefdef 11 10 1 (null) (null) (null) 0 0 0
In Oracle 8i or later the LTRM and RTRIM functions can be replaced with a single TRIM call.
select target , trimmed , replace( trimmed, ' ', null ) as intermediate_string , nvl( length( trimmed ) + 1, 0 ) as trimmed_length_plus_1 , nvl( length( replace( trimmed, ' ', null ) ), 0 ) as intermediate_length , nvl( length( trimmed ) + 1, 0 ) - nvl( length( replace( trimmed, ' ', null ) ), 0 ) as pattern_count from ( select target, TRIM( target ) as trimmed from targets ) where nvl( TRIM( target ), 'x' ) not like '% %' ;
TARGET INTERMEDIATE TRIMMED LENGTH INTERMEDIATE PATTERN STRING TRIMMED STRING PLUS 1 LENGTH COUNT ---------- --------------- ------------ -------------- ------------ ---------- aaa aaa aaa 4 3 1 a b c a b c abc 6 3 3 bbb xx bbb xx bbbxx 7 5 2 wdef def w wdef def w wdefdefw 11 8 3 defxdefdef defxdefdef defxdefdef 11 10 1 (null) (null) (null) 0 0 0