In this tutorial we begin examining ways to specify WHAT text to match by exploring patterns for individual characters.
Matching Individual Characters
To match a single character string, e.g. the letter 'b', simply use it as the regular expression pattern.
execute set_pattern( 'b' ) execute set_target( 'aaa' ) execute add_target( 'abc' ) execute add_target( 'bbb' ) execute add_target( 'bcd' ) execute add_target( 'cde' ) execute add_target( 'def' ) execute add_target( 'efg' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- b aaa N (null) 0 abc Y b 2 bbb Y b 1 bcd Y b 1 cde N (null) 0 def N (null) 0 efg N (null) 0
To match any one of a set of individual characters use character list syntax '[ ]'. The following example matches either an 'a', 'c', or 'g'.
execute set_pattern( '[acg]' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- [acg] aaa Y a 1 abc Y a 1 bbb N (null) 0 bcd Y c 2 cde Y c 1 def N (null) 0 efg Y g 3
To exclude characters place '^' after the '['.
execute set_pattern( '[^acg]' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- [^acg] aaa N (null) 0 abc Y b 2 bbb Y b 1 bcd Y b 1 cde Y d 2 def Y d 1 efg Y e 1
Matching any one of a range of characters, e.g. 'a' through 'c', can be done with the range operator '-'.
execute set_pattern( '[a-c]' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- [a-c] aaa Y a 1 abc Y a 1 bbb Y b 1 bcd Y b 1 cde Y c 1 def N (null) 0 efg N (null) 0
Be careful when working with alphabets other than English since a range like '[a-z]' may not include all characters in your alphabet.
execute set_pattern( '[a-z]' ) execute set_target( 'aeiou' ) execute add_target( 'àéîõü' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- [a-z] aeiou Y a 1 àéîõü N (null) 0
POSIX character classes and equivalence classes (discussed in WHAT to Match - 2) offer a solution when working with alphabets other than English.
Case Insensitive Matching
By default, pattern matching with regular expressions is case sensitive. To perform case insensitive matches use the pattern matching modifier 'i' in REGEXP conditions and functions.
First, consider a case sensitive match.
execute set_target( 'a' ) execute add_target( 'b' ) execute add_target( 'c' ) execute add_target( 'A' ) execute add_target( 'B' ) execute add_target( 'C' ) select '[b]' as pattern, target , regexp_substr( target, '[b]' ) as matched_value from targets ;
PATTERN TARGET MATCHED_VALUE --------------- ---------- --------------- [b] a (null) b b c (null) A (null) B (null) C (null)
Now consider the same pattern in a case insensitive match. The 'i' modifier is specified in the fifth parameter of the REGEXP_SUBSTR function.
select '[b]' as pattern, target , regexp_substr( target, '[b]', 1, 1, 'i' ) as matched_value from targets ;
PATTERN TARGET MATCHED_VALUE --------------- ---------- --------------- [b] a (null) b b c (null) A (null) B B C (null)
Escaping Special Characters
Attempting to match special characters, like '[', may cause problems if you fail to escape them. The pattern 'a[b', for example, will not work.
execute set_pattern( 'a[b' ) execute set_target( 'abc' ) execute add_target( 'a[b[c' ) execute add_target( 'a\b\c' ) select * from test_results ; select * from test_results * ERROR at line 1: ORA-12726: unmatched bracket in regular expression
To escape special characters, prefix them with the '\' character.
execute set_pattern( 'a\[b' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- a\[b a[b[c Y a[b 1 a\b\c N (null) 0 abc N (null) 0
execute set_pattern( 'a\\b' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- a\\b a[b[c N (null) 0 a\b\c Y a\b 1 abc N (null) 0
Wildcard Character
To match any character at all, except null and the newline character, use the '.' character.
execute set_pattern( '.' ); execute set_target( 'abc' ) execute add_target( '#24' ) execute add_target( '123' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- . #24 Y # 1 123 Y 1 1 abc Y a 1
Be careful with nulls, nuls, line feeds, and carriage returns. '.' does not match them all.
execute set_pattern( '.' ); begin set_target( chr(0) ); end; -- NUL character, do not confuse with NULL / begin add_target( chr(10) ); end; -- a Linefeed / begin add_target( chr(13) ); end; -- a Carriage Return / begin add_target( 'a' ); end; / begin add_target( null ); end; / begin add_target( '' ); end; -- an empty string is equivalent to NULL / select pattern , ascii( target ) as target_ascii_code , match , ascii( matched_value ) as matched_value_ascii_code , position from test_results ;
PATTERN TARGET_ASCII_CODE MATCH MATCHED_VALUE_ASCII_CODE POSITION --------------- ----------------- ----- ------------------------ -------- . 0 Y 0 1 10 N (null) 0 13 Y 13 1 97 Y 97 1 (null) N (null) (null) (null) N (null) (null)
If you need '.' to match the newline character you can use the 'n' modifier in REGEXP conditions and functions.
select '.' as pattern , ascii( target ) as target_ascii_code , case when regexp_like( target, '.', 'n' ) then 'Y' else 'N' end as match , regexp_instr( target, '.', 1, 1, 0, 'n' ) as position from targets ;
PATTERN TARGET_ASCII_CODE MATCH POSITION --------------- ----------------- ----- -------- . 0 Y 1 10 Y 1 13 Y 1 97 Y 1 (null) N (null) (null) N (null)
Nulls
Using a null for either a regular expression pattern or a search target always yields an unknown result.
execute set_pattern( null ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- (null) N (null) (null) N (null) (null) N (null) (null) a N (null) (null) (null) N (null) (null) (null) N (null) (null)
execute set_pattern( '.' ) execute set_target( 'abc' ) execute add_target( null ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- . abc Y a 1 (null) N (null) (null)
Gotchas
You cannot specify multi-character strings in character lists. A pattern like 'a[(bc)]' does NOT mean 'a' followed by the string 'bc'. It has the same meaning as 'a[bc]'.
execute set_pattern( 'a[(bc)]' ) execute set_target( 'abc' ) execute add_target( 'ab' ) execute add_target( 'ac' ) execute add_target( 'bc' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- a[(bc)] ab Y ab 1 abc Y ab 1 ac Y ac 1 bc N (null) 0
execute set_pattern( 'a[bc]' ) select * from test_results ;
PATTERN TARGET MATCH MATCHED_VALUE POSITION --------------- ---------- ----- --------------- -------- a[bc] ab Y ab 1 abc Y ab 1 ac Y ac 1 bc N (null) 0
You cannot use 'not' to turn an expression with an unknown result into an expression with a true result. For example, neither of the the following two queries return any rows, though some might expect the second one would.
set feedback on select * from dual where regexp_like( null, 'abc' ) ; no rows selected select * from dual where not regexp_like( null, 'abc' ) ; no rows selected