Hi Experts,
After going through some documentation on regular expressions in Oracle I have tried to draw some conclusions about the same. As I wasn’t much confident on how the patterns are built, I have tried to interpret them by looking at the output. It’s basically a reverse engineering I have tried to do.
Please let me know if my interpretations are correct. Any additions /suggestions/corrections are most welcome.
Some of the examples may lack conclusions, please ignore those.
***********************************************************************
select regexp_substr('1PSN/231_3253/ABc','^([[:alnum:]]*)') from dual;
Output: 1PSN
Interpreted as:
^ From the start of the source string
([[:alnum:]]*) zero or more occurrences of alphanumeric characters
************************************************************************
select regexp_substr('@@/231_3253/ABc','@*([[:alnum:]]+)') from dual;
Output: 231
Interpreted as:
@* Search for zero or more occurrences of @
([[:alnum:]]+) followed by one or more occurrences of alphanumeric characters
Note: In the above example oracle looks for @(zero times or more) immediately followed by alphanumeric characters.
Since a '/' comes between @ and 231 the o/p is 0 occurences of @ + one or more occurrences of alphanumerics.
************************************************************************
select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]*)') from dual;
Output: @
Interpreted as:
@+ one or more ocurrences of @
([[:alnum:]]*) followed by 0 or more occurrences of alphanumerics
************************************************************************
select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]+)') from dual;
Output: Null
Interpreted as:
@+ one or more occurences of @
([[:alnum:]]+) followed by one or more occurences of aplhanumerics
************************************************************************
select regexp_substr('@1PSN/231_3253/ABc125','([[:digit:]]+)$') from dual;
Output: 125
Interpreted as:
([[:digit:]]+) one or more occurences of digits only
$ at the end of the string
************************************************************************
select regexp_substr('@1PSN/231_3253/ABc','([^[:digit:]]+)$') from dual;
output: /ABc
Interpreted as:
([^[:digit:]]+)$ one or more occurrences of non-digit literals at the end of the string
'^' inside square brackets marks the negation of the class
************************************************************************
Look for http:// followed by a substring of one or more alphanumeric characters and optionally, a period (.)
SELECT REGEXP_SUBSTR('Go to http://www.oracle.com/products and click on database','http://([[:alnum:]]+\.?){3,4}/?') RESULT
FROM dual;
Output: http://www.oracle.com
Interpreted as:
[[:alnum:]]+ one or more occurences of alplanumeric characters
\.? dot optionally (backslash represents escape sequence,? represents optionally)
{3,4} 3 or 4 times
/? followed by forward slash optionally
If you have www.oracle.co.uk; {3,4} extracts it for you as well
************************************************************************
Validate email:
select case when
REGEXP_LIKE('abc_pqr@xyz.co.uk',
'^([[:alnum:]]+(\_?|\.))([[:alnum:]]*)@([[:alnum:]]+)(.([[:alnum:]]+)){1,2}$') then 'Match Found'
else 'No Match Found'
end
as output from dual;
Interpreted as:
([[:alnum:]]+(\_?|\.)) one or more occurrences of alpha numerics optionally followed by an underscore or dot
([[:alnum:]]*) followed by 0 or more occurrences of alplhanumerics
@ followed by @
([[:alnum:]]+) followed by one or more occurrences of alplhanumerics
(.([[:alnum:]]+)){1,2} followed by a dot followed by alphanumerics from once till max of twice (Ex- .com or .co.uk)
Output: Match Found
Input: abc1pqr@xyz.co.uk
Output: Match Found
Input: abc-pqr@xyz.co.uk
Output: No Match Found
************************************************************************
Truncate the part, ending with digits
select regexp_substr('Yahoo11245@US','^.*[[:digit:]]',1) from dual;
Output: Yahoo11245
select regexp_substr('*Yahoo*11245@US','^.*[[:digit:]]',1) from dual;
Output: *Yahoo*11245
Interpreted as:
.* zero or more occurrences of any characters (dot signifies any character)
************************************************************************
Replace 2 to 8 spaces with single space
select regexp_replace('Hello you OPs there','[[:space:]]{2,8}',' ')
from dual;
************************************************************************
Search for control characters
select case when
regexp_like('Super' || chr(13) || 'Star' ,'[[:cntrl:] then 'Match Found'
else 'No Match Found'
end
as output from dual;
Output: Match Found
************************************************************************
Search for lower case letters only with a string length varying from a min of 3 to max of 12
select case when
regexp_like('terminator' ,'^[[:lower:]]{3,12}$')
then 'Match Found'
else 'No Match Found'
end
as output from dual;
************************************************************************
4th character must be a special character
select case when
regexp_like('ter*minator' ,'^...[^[:alnum:] then 'Match Found'
else 'No Match Found'
end
as output from dual;
Ouput: Match Found
************************************************************************
Case Sensitive Search
select case when
regexp_like('Republic Of Africa' ,'of','c')
then 'Match Found'
else 'No Match Found'
end
as output from dual;
Output: No match found
c stands for case sensitive
select case when
regexp_like('Republic Of africa' ,'of','i')
then 'Match Found'
else 'No Match Found'
end
as output from dual;
Output: Match Found
i stands for case insensitive
************************************************************************
Two consecutive occurences of characters from a to z
select regexp_substr('Republicc Of Africaa' ,'([a-z])\1', 1,1,'i') from dual;
Output: cc
Interpreted as:
([a-z]) character set a-z
\1 consecutive occurence of any character
1 starting from 1st character in the string
1 First occurence
i case insensitive
Three consecutive occurences of characters from 6 to 9
select case when
regexp_like('Patch 10888 applied' ,'([7-9])\1\1')
then 'Match Found'
else 'No Match Found'
end
as output from dual;
Output: Match Found
************************************************************************
Phone validator:
select case when
regexp_like('123-44-5555' ,'^[0-9]{3}-[0-9]{2}-[0-9]{4}$')
then 'Match Found'
else 'No Match Found'
end
as output from dual;
Output: Match Found
Input: 111-222-3333
Output: No match found
Interpreted as:
^ start of the string
[0-9]{3} three ocurrences of digits from 0-9
- followed by hyphen
[0-9]{2} two ocurrences of digits from 0-9
- followed by hyphen
[0-9]{4} four ocurrences of digits from 0-9
$ end of the string
************************************************************************
Source Links:
http://www.psoug.org/reference/regexp.html
http://www.oracle.com/technology/obe/obe10gdb/develop/regexp/regexp.htm
Edited by: Preta on Feb 25, 2010 4:38 PM
Corrected the example for www.oracle.com
Edited by: Preta Incorported Logan's comments