Regular Expressions

Specifying WHAT to Match - Part 3

In this tutorial we continue examining ways to specify WHAT text to match. Patterns for strings of two or more characters is explored.

Matching Strings with Two or More Characters

To matching a string with two or more characters simply use the string itself as the regular expression.

execute set_pattern( 'bc' );

execute set_target( 'aaa'  );
execute add_target( 'ab'   );
execute add_target( 'abc'  );
execute add_target( 'ac'   );
execute add_target( 'bcd'  );
execute add_target( 'cba'  );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
bc              aaa        N     (null)                 0
                ab         N     (null)                 0
                abc        Y     bc                     2
                ac         N     (null)                 0
                bcd        Y     bc                     1
                cba        N     (null)                 0
 

OR Conditions and Grouping

Use the "|" character to specify an OR condition between strings. For example, the pattern "ab|ba" matches either an "ab" OR a "ba".

execute set_pattern( 'ab|ba' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
ab|ba           aaa        N     (null)                 0
                ab         Y     ab                     1
                abc        Y     ab                     1
                ac         N     (null)                 0
                bcd        N     (null)                 0
                cba        Y     ba                     2
 

To avoid ambiguity, it helps to use "(" and ")" to group the terms of OR expressions. For example, does the pattern "a|bc" mean

("a" OR "b") followed by "c"

or does it mean

"a" OR ("bc")?

Using parentheses removes any ambiguity, as we see in the next three examples.

execute set_pattern( 'a|bc' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
a|bc            aaa        Y     a                      1
                ab         Y     a                      1
                abc        Y     a                      1
                ac         Y     a                      1
                bcd        Y     bc                     1
                cba        Y     a                      3
 
execute set_pattern( 'a|(bc)' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
a|(bc)          aaa        Y     a                      1
                ab         Y     a                      1
                abc        Y     a                      1
                ac         Y     a                      1
                bcd        Y     bc                     1
                cba        Y     a                      3
 
execute set_pattern( '(a|b)c' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
(a|b)c          aaa        N     (null)                 0
                ab         N     (null)                 0
                abc        Y     bc                     2
                ac         Y     ac                     1
                bcd        Y     bc                     1
                cba        N     (null)                 0
 

Once you have grouped a subexpression you can reference the string it matched by using a backreference in other sections of the pattern. The backreference "\1" refers to the result of the first bracketed subexpression in the pattern, "\2" refers to the second, and so on up to "\9".

In this example, the backreference "\1" matches the string "abc".

execute set_pattern( '(abc),\1' );

execute set_target( 'abc,'    );
execute add_target( 'abc,abc' );
execute add_target( 'abc,def' );
execute add_target( 'def,def' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
(abc),\1        abc,       N     (null)                 0
                abc,abc    Y     abc,abc                1
                abc,def    N     (null)                 0
                def,def    N     (null)                 0
 

Here, the backreference "\1" matches the string "abc" in one case and "def" in another.

execute set_pattern( '(...),\1' );

select * from test_results ;
 
PATTERN         TARGET     MATCH MATCHED_VALUE   POSITION
--------------- ---------- ----- --------------- --------
(...),\1        abc,       N     (null)                 0
                abc,abc    Y     abc,abc                1
                abc,def    N     (null)                 0
                def,def    Y     def,def                1
 

Ignoring White Space

To ignore white space in a regular expression, use the pattern matching modifier "x" in REGEXP conditions and functions.

execute set_target( 'abc' )

select
  'abc' as pattern ,
  null as modifier ,
  target ,
  regexp_substr( target, 'abc' ) as matched_value
from
  targets
union
select
  'abc' as pattern ,
  'x' as modifier ,
  target ,
  regexp_substr( target, 'abc', 1, 1, 'x' ) as matched_value
from
  targets
union
select
  'a b c' as pattern,
  null as modifier ,
  target ,
  regexp_substr( target, 'a b c' ) as matched_value
from
  targets
union
select
  'a b c' as pattern,
  'x' as modifier ,
  target ,
  regexp_substr( target, 'a b c', 1, 1, 'x' ) as matched_value
from
  targets
order by
  1 desc ,
  2 desc ,
  3
;
 
PATTERN         MODIFIER TARGET     MATCHED_VALUE
--------------- -------- ---------- ---------------
abc             (null)   abc        abc
                x        abc        abc

a b c           (null)   abc        (null)
                x        abc        abc
 

Note that the "x" modifier ignores white space only in the pattern, *not* the search target.

execute set_target( 'a b c' )

select
  'abc' as pattern ,
  null as modifier ,
  target ,
  regexp_substr( target, 'abc' ) as matched_value
from
  targets
union
select
  'abc' as pattern ,
  'x' as modifier ,
  target ,
  regexp_substr( target, 'abc', 1, 1, 'x' ) as matched_value
from
  targets
union
select
  'a b c' as pattern,
  null as modifier ,
  target ,
  regexp_substr( target, 'a b c' ) as matched_value
from
  targets
union
select
  'a b c' as pattern,
  'x' as modifier ,
  target ,
  regexp_substr( target, 'a b c', 1, 1, 'x' ) as matched_value
from
  targets
order by
  1 desc ,
  2 desc ,
  3
;
 
PATTERN         MODIFIER TARGET     MATCHED_VALUE
--------------- -------- ---------- ---------------
abc             (null)   a b c      (null)
                x        a b c      (null)

a b c           (null)   a b c      a b c
                x        a b c      (null)
 

Excluding Strings

Techniques for excluding a string from a pattern match vary depending on the requirements.

To find all targets that do not contain a string like "ab", use the "not" condition.

execute set_target( 'abb'  );
execute add_target( 'abc'  );
execute add_target( 'acc'  );

select target
from   targets
where  not regexp_like( target, 'ab' );
 
TARGET
----------
acc
 

To match a pattern like 'a..' where the second and third characters after an "a" are anything other than "bc" use two WHERE conditions.

select target
from   targets
where  regexp_like( target, 'a..' )
and    not regexp_like( target, 'abc' );
 
TARGET
----------
abb
acc
 



Linking to SQL Snippets ™

To link to this page in Oracle Technology Network Forums or OraFAQ Forums cut and paste this code.

  • [url=http://www.sqlsnippets.com/en/topic-10209.html]SQL Snippets: Regular Expressions - Specifying WHAT to Match - Part 3[/url]

To link to this page in HTML documents or Blogger comments cut and paste this code.

  • <a href="http://www.sqlsnippets.com/en/topic-10209.html">SQL Snippets: Regular Expressions - Specifying WHAT to Match - Part 3</a>

To link to this page in other web sites use the following values.

  • Link Text : SQL Snippets: Regular Expressions - Specifying WHAT to Match - Part 3
  • URL (href): http://www.sqlsnippets.com/en/topic-10209.html