SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Using regular expressions

PretaFeb 25 2010 — edited May 13 2010

Hi Experts,

After going through some documentation on regular expressions in Oracle I have tried to draw some conclusions about the same. As I wasn’t much confident on how the patterns are built, I have tried to interpret them by looking at the output. It’s basically a reverse engineering I have tried to do.

Please let me know if my interpretations are correct. Any additions /suggestions/corrections are most welcome.
Some of the examples may lack conclusions, please ignore those.

***********************************************************************
select regexp_substr('1PSN/231_3253/ABc','^([[:alnum:]]*)') from dual;

Output: 1PSN

Interpreted as:

^ From the start of the source string
([[:alnum:]]*) zero or more occurrences of alphanumeric characters

************************************************************************

select regexp_substr('@@/231_3253/ABc','@*([[:alnum:]]+)') from dual;

Output: 231

Interpreted as:

@* Search for zero or more occurrences of @
([[:alnum:]]+) followed by one or more occurrences of alphanumeric characters

Note: In the above example oracle looks for @(zero times or more) immediately followed by alphanumeric characters.
Since a '/' comes between @ and 231 the o/p is 0 occurences of @ + one or more occurrences of alphanumerics.

************************************************************************

select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]*)') from dual;

Output: @

Interpreted as:

@+ one or more ocurrences of @
([[:alnum:]]*) followed by 0 or more occurrences of alphanumerics

************************************************************************

select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]+)') from dual;

Output: Null

Interpreted as:

@+ one or more occurences of @
([[:alnum:]]+) followed by one or more occurences of aplhanumerics

************************************************************************

select regexp_substr('@1PSN/231_3253/ABc125','([[:digit:]]+)$') from dual;

Output: 125

Interpreted as:

([[:digit:]]+) one or more occurences of digits only
$ at the end of the string

************************************************************************

select regexp_substr('@1PSN/231_3253/ABc','([^[:digit:]]+)$') from dual;

output: /ABc

Interpreted as:

([^[:digit:]]+)$ one or more occurrences of non-digit literals at the end of the string
'^' inside square brackets marks the negation of the class
************************************************************************
Look for http:// followed by a substring of one or more alphanumeric characters and optionally, a period (.)

SELECT REGEXP_SUBSTR('Go to http://www.oracle.com/products and click on database','http://([[:alnum:]]+\.?){3,4}/?') RESULT
FROM dual;

Output: http://www.oracle.com

Interpreted as:

[[:alnum:]]+ one or more occurences of alplanumeric characters
\.? dot optionally (backslash represents escape sequence,? represents optionally)
{3,4} 3 or 4 times
/? followed by forward slash optionally
 
If you have www.oracle.co.uk; {3,4} extracts it for you as well

************************************************************************

Validate email:

select case  when
       REGEXP_LIKE('abc_pqr@xyz.co.uk',
                   '^([[:alnum:]]+(\_?|\.))([[:alnum:]]*)@([[:alnum:]]+)(.([[:alnum:]]+)){1,2}$') then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Interpreted as:

([[:alnum:]]+(\_?|\.)) one or more occurrences of alpha numerics optionally followed by an underscore or dot
([[:alnum:]]*) followed by 0 or more occurrences of alplhanumerics
@ followed by @
([[:alnum:]]+) followed by one or more occurrences of alplhanumerics
(.([[:alnum:]]+)){1,2} followed by a dot followed by alphanumerics from once till max of twice (Ex- .com or .co.uk)

Output: Match Found

Input: abc1pqr@xyz.co.uk
Output: Match Found

Input: abc-pqr@xyz.co.uk
Output: No Match Found

************************************************************************
Truncate the part, ending with digits

select regexp_substr('Yahoo11245@US','^.*[[:digit:]]',1) from dual;
Output: Yahoo11245

select regexp_substr('*Yahoo*11245@US','^.*[[:digit:]]',1) from dual;
Output: *Yahoo*11245

Interpreted as:

.* zero or more occurrences of any characters (dot signifies any character)

************************************************************************
Replace 2 to 8 spaces with single space

select regexp_replace('Hello   you      OPs       there','[[:space:]]{2,8}',' ')
from dual;

************************************************************************
Search for control characters

select case  when
       regexp_like('Super' || chr(13) || 'Star' ,'[[:cntrl:]              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Output: Match Found

************************************************************************
Search for lower case letters only with a string length varying from a min of 3 to max of 12

select case  when
       regexp_like('terminator' ,'^[[:lower:]]{3,12}$')
              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;
************************************************************************
4th character must be a special character

select case  when
       regexp_like('ter*minator' ,'^...[^[:alnum:]              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Ouput: Match Found
************************************************************************
Case Sensitive Search

select case  when
       regexp_like('Republic Of  Africa' ,'of','c')
              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Output: No match found

c stands for case sensitive

select case  when
       regexp_like('Republic Of  africa' ,'of','i')
              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Output: Match Found

i stands for case insensitive
************************************************************************
Two consecutive occurences of characters from a to z

select regexp_substr('Republicc Of Africaa' ,'([a-z])\1', 1,1,'i') from dual;

Output: cc

Interpreted as:

([a-z]) character set a-z
\1 consecutive occurence of any character
1 starting from 1st character in the string
1 First occurence
i case insensitive

Three consecutive occurences of characters from 6 to 9

select case  when
       regexp_like('Patch 10888 applied' ,'([7-9])\1\1')
              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Output: Match Found
************************************************************************

Phone validator:

select case  when
       regexp_like('123-44-5555' ,'^[0-9]{3}-[0-9]{2}-[0-9]{4}$')
              then 'Match Found' 
       else 'No Match Found'
       end
as output from dual;

Output: Match Found

Input: 111-222-3333
Output: No match found

Interpreted as:

^ start of the string
[0-9]{3} three ocurrences of digits from 0-9
- followed by hyphen
[0-9]{2} two ocurrences of digits from 0-9
- followed by hyphen
[0-9]{4} four ocurrences of digits from 0-9
$ end of the string
************************************************************************

Source Links:
http://www.psoug.org/reference/regexp.html

http://www.oracle.com/technology/obe/obe10gdb/develop/regexp/regexp.htm

Edited by: Preta on Feb 25, 2010 4:38 PM
Corrected the example for www.oracle.com

Edited by: Preta Incorported Logan's comments

Locked Post

New comments cannot be posted to this locked post.

Locked on Jun 10 2010

Added on Feb 25 2010

4 comments

1,292 views