Thread: Use REGEXP_REPLACE as advanced initcap function?


Permlink Replies: 35 - Pages: 3 [ 1 2 3 | Next ] - Last Post: Jul 21, 2008 2:30 AM Last Post By: cd_2
haavard

Posts: 26
Registered: 02/13/07
Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 27, 2008 5:38 AM
Click to report abuse...   Click to reply to this thread Reply
Hi,

I wan't to convert a string "bygger'n" to "Bygger'n", is it possible to do something like this with regexp_replace?

"bygger'n" is just meant as an example, I want to replace all matching patterns with lowercase letters.

Oracle does not seem to consider lower/upper-functions for the replace of patterns:
SQL> select regexp_replace(initcap('bygger''n'), '(''[[:alnum:]])', lower('\1')) try_lower
2 , regexp_replace(initcap('bygger''n'), '(''[[:alnum:]])', ' \1 ') spaces
3 from dual
4 /

TRY_LOWE SPACES

----------
Bygger'N Bygger 'N

Regards
Haavard
damorgan

Posts: 11,377
Registered: 10/20/03
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 27, 2008 9:56 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
Use the case sensitive match options.

http://www.psoug.org/reference/regexp.html
haavard

Posts: 26
Registered: 02/13/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 27, 2008 11:23 PM   in response to: damorgan in response to: damorgan
Click to report abuse...   Click to reply to this thread Reply
@damorgan:
I don't think case sensitivity is going to help me, perhaps explaining the problem properly will :-)

The problem: Formatting names from uppercase:
Source	       Formatted


O'MALLEY O'Malley
TEH'TE Teh'te
BYGGER'N Bygger'n
A single quote placed 2nd in the name should be followed by an upprecase character, placed anywhere else it should be followed by a lowercase character.

So; can this be solved by regexp_replace?

I realise now that in my original question the lower('\1') will only perform lower-function to '\1', not the actual result of the back reference.

Can the text in a back reference in the replace_string of a regexp_replace-function be manipulated in any way?

Regards,
Haavard
cd_2

Posts: 5,004
Registered: 09/08/98
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 28, 2008 1:44 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
Can the text in a back reference in the
replace_string of a regexp_replace-function be
manipulated in any way?

In the current versions: No.

Think of the REGEXP_REPLACE function as an advanced REPLACE function. For your purpose you'd need something like a REGEXP_INITCAP function or a callback function inside the REPEXP_REPLACE result string, which both are not available.

C.
haavard

Posts: 26
Registered: 02/13/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 28, 2008 4:16 AM   in response to: cd_2 in response to: cd_2
Click to report abuse...   Click to reply to this thread Reply
@cd
Thank you for your clarification. REGEXP_INITCAP would help in this particular case, REGEXP_REPLACE with all the normal Oracle string manipulating functions available for manipulating the back reference would be even better :-)

In the mean while:
SQL> select original
2 , regexp_replace(
3 initcap(
4 regexp_replace( original
5 , '([[:alnum:]]{2,})('')([[:alnum:]*])'
6 , '\199dummystring99\3'))
7 , '99dummystring99'
8 , '''') formatted
9 from (select 'O''MALLEY' original from dual
10 union
11 select 'TEH''TE' original from dual
12 union
13 select 'BYGGER''N' original from dual);

ORIGINAL FORMATTED


--------------------
BYGGER'N Bygger'n
O'MALLEY O'Malley
TEH'TE Teh'te


It's not pretty, but works for my purposes. Well, at least until there comes a person called "99dummystring99". Just a question of time :-)

Regards,
Haavard
cd_2

Posts: 5,004
Registered: 09/08/98
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 28, 2008 5:07 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
Another solution, without 99dummystring99:

WITH t AS (SELECT 'O''MALLEY' col1
             FROM dual
            UNION
           SELECT 'TEH''TE'
             FROM dual
            UNION 
           SELECT 'BYGGER''N'
             FROM dual
            UNION 
           SELECT 'MCDONALD'
             FROM dual
           )
SELECT t.col1
     , INITCAP(REGEXP_SUBSTR(t.col1, '^([^'']''|mc|.)', 1, 1, 'i'))
    || INITCAP(REGEXP_REPLACE(t.col1, '(^([^'']''|mc)(.*)$)|^.*$', '\3', 1, 1, 'i'))
    || LOWER(REGEXP_REPLACE(t.col1, '^(([^'']''|mc).*$|.)', '', 1, 1, 'i')) new_col1
  FROM t
; 
 
 
COL1            NEW_COL1
--------------- ---------------
BYGGER'N        Bygger'n
MCDONALD        McDonald
O'MALLEY        O'Malley
TEH'TE          Teh'te


C.
haavard

Posts: 26
Registered: 02/13/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Feb 28, 2008 5:46 AM   in response to: cd_2 in response to: cd_2
Click to report abuse...   Click to reply to this thread Reply
@cd
Thank you! I'm only coming to grasp the regexp syntax, and I never would have thought to dissect the string like that.

Very educational, and Mr. 99dummystring99 will also receive a letter with nicely formatted name :-)
Volder

Posts: 986
Registered: 04/14/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 1, 2008 2:13 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
if your values comprise only single names - no need for regexp at all:

SQL> WITH t AS (SELECT 'O''MALLEY' col1
  2               FROM dual
  3              UNION
  4             SELECT 'TEH''TE'
  5               FROM dual
  6              UNION
  7             SELECT 'BYGGER''N'
  8               FROM dual
  9              UNION
 10             SELECT 'MCDONALD'
 11               FROM dual
 12             )
 13             --
 14             select col1, substr(initcap(col1),1,3)||lower(substr(col1,4)) from t
 15  / 
 
COL1     SUBSTR(INITCAP(COL1),1,3)||LOW
-------- ------------------------------
BYGGER'N Bygger'n
MCDONALD Mcdonald
O'MALLEY O'Malley
TEH'TE   Teh'te
 
SQL> 
cd_2

Posts: 5,004
Registered: 09/08/98
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 2, 2008 2:49 AM   in response to: Volder in response to: Volder
Click to report abuse...   Click to reply to this thread Reply
Shouldn't it be McDonald?

C.
Volder

Posts: 986
Registered: 04/14/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 2, 2008 3:30 AM   in response to: cd_2 in response to: cd_2
Click to report abuse...   Click to reply to this thread Reply
Shouldn't it be McDonald?

I don't know, there was nothing about it from the OP's side ;)

He only asked for:
"A single quote placed 2nd in the name should be followed by an upprecase character,
placed anywhere else it should be followed by a lowercase character."
haavard

Posts: 26
Registered: 02/13/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 2, 2008 11:01 PM   in response to: Volder in response to: Volder
Click to report abuse...   Click to reply to this thread Reply
@Volder
Yeah, that was what I intended... ;)

Ok, let's up the ante; replace the <magic regexp thingy> to achieve the following in the formatted column based on the source column:
with t as (select 'FIRSTNAME O''MALLEY' col1
from dual
union
select 'FIRST''NAME TEH''TE'
from dual
union
select 'FORMAT ME BYGGER''N'
from dual
union
select 'OLD MCDONALD'
from dual
union
select 'EVEN OL''DER MACDONALD'
from dual
)
select t.col1 source
, <magic regexp thingy> formatted
from t
;

SOURCE FORMATTED


---------------------
EVEN OL'DER MACDONALD Even Ol'der MacDonald
FIRST'NAME TEH'TE First'name Teh'te
FIRSTNAME O'MALLEY Firstname O'Malley
FORMAT ME BYGGER'N Format Me Bygger'n
OLD MCDONALD Old McDonald

Bearing in mind that:
- String SOURCE contains 1..x names separated by space
- Formatting rules must be applied to all names in the string

I can be as precise as you want in specifying the rules for the formatted column, but in the spirit of the scrum age that is upon us, I'll settle for: Please, don't hesitate to ask if anything is unclear :D

Regards,
Havard

NB: I see that this post could result in "Do you want US to do all YOUR work?", but I'll take my chances:) Anyway, I think the result might be of an interest to more than me.

cd_2

Posts: 5,004
Registered: 09/08/98
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 3, 2008 2:29 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
Here's a quick hack with the MODEL clause:

WITH t AS (SELECT 'FIRSTNAME O''MALLEY' col1
             FROM DUAL
            UNION
           SELECT 'FIRST''NAME  TEH''TE'
             FROM DUAL
            UNION
           SELECT 'FORMAT ME BYGGER''N'
             FROM DUAL
            UNION
           SELECT 'OLD MCDONALD'
             FROM DUAL
            UNION
           SELECT 'EVEN OL''DER MACDONALD'
             FROM DUAL)
SELECT col1
     , new_col1
  FROM t
 MODEL
 PARTITION BY (ROWNUM rn)
   DIMENSION BY (0 dim)
   MEASURES(col1, CAST('' AS VARCHAR2(255)) word, CAST('' AS VARCHAR(255)) new_col1)
   RULES ITERATE(99) UNTIL (word[0] IS NULL)
                (word[0] = REGEXP_SUBSTR(col1[0], '[^ ]+( *|$)', 1, ITERATION_NUMBER + 1)
               , new_col1[0] = new_col1[0]
              || INITCAP(REGEXP_SUBSTR(word[0], '^([^'']''|ma?c|.)', 1, 1, 'i'))
              || INITCAP(REGEXP_REPLACE(word[0], '(^([^'']''|ma?c)(.*)$)|^.*$', '\3', 1, 1, 'i'))
              || LOWER(REGEXP_REPLACE(word[0], '^(([^'']''|ma?c).*$|.)', '', 1, 1, 'i'))
                )
  ;
 
COL1                      NEW_COL1
------------------------- -------------------------
EVEN OL'DER MACDONALD     Even Ol'der MacDonald
FIRSTNAME O'MALLEY        Firstname O'Malley
FORMAT ME BYGGER'N        Format Me Bygger'n
FIRST'NAME  TEH'TE        First'name  Teh'te
OLD MCDONALD              Old McDonald


C.
Alessandro Rossi

Posts: 1,034
Registered: 08/09/06
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 3, 2008 3:42 AM   in response to: haavard in response to: haavard
Click to report abuse...   Click to reply to this thread Reply
If you are on 10g you can do it with a Java Stored Procedure that handles regular expressions with a powerful API. Take a look at the Matcher class.

http://www.regular-expressions.info/java.html
http://www.javaworld.com/javaworld/jw-02-2003/jw-0207-java101.html

create or replace java source named MyInitcap as
import java.util.regex.*;
 
class MyInitcap {
	public static String firstUpper(String s) {
		return Character.toUpperCase(s.charAt(0))+s.substring(1).toLowerCase();
	}
	public static String initcap(String input) {
		
		Pattern p = Pattern.compile ("((\\w|')+)");
		Matcher m = p.matcher (input);
		StringBuffer sb = new StringBuffer ();
		while (m.find ())
			m.appendReplacement (sb, firstUpper(m.group()));
		m.appendTail (sb);
		return sb.toString();
	}
}
/ 
 
create or replace function my_initcap (
		string in varchar2
	)return varchar2 
as language java name 'MyInitcap.initcap(java.lang.String) return java.lang.String';
/ 


And then

Processing ...
with t as (select 'FIRSTNAME O''MALLEY' col1
from dual
union
select 'FIRST''NAME TEH''TE'
from dual
union
select 'FORMAT ME BYGGER''N'
from dual
union
select 'OLD MCDONALD'
from dual
union
select 'EVEN OL''DER MACDONALD'
from dual
)
select t.col1 source
, my_initcap(col1) formatted
from t
Query finished, retrieving results...
SOURCE FORMATTED
------------------------
EVEN OL'DER MACDONALD Even Ol'der Macdonald
FIRST'NAME TEH'TE First'name Teh'te
FIRSTNAME O'MALLEY Firstname O'malley
FORMAT ME BYGGER'N Format Me Bygger'n
OLD MCDONALD Old Mcdonald

5 row(s) retrieved


Bye Alessandro

cd_2

Posts: 5,004
Registered: 09/08/98
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 3, 2008 3:45 AM   in response to: Alessandro Rossi in response to: Alessandro Rossi
Click to report abuse...   Click to reply to this thread Reply
Well if you're on 10g, you could use an equivalent PL/SQL function (which would also work under XE). If you look at your result set, you'll see that
M(a)cDonald and O'Malley is not working in your solution.

C.

Message was edited by:
cd
haavard

Posts: 26
Registered: 02/13/07
Re: Use REGEXP_REPLACE as advanced initcap function?
Posted: Mar 3, 2008 4:24 AM   in response to: cd_2 in response to: cd_2
Click to report abuse...   Click to reply to this thread Reply
@cd
"Quick hack" being the understatement of the century ;) But it works beautifully!

@Rossi
For me this is pl/sql food; I am familiar with the language (more than the coffee), and most logic is already in pl/sql. I can't see what hiding the logic behind a function call in one or the other differs, other than in the actual implementation.

But I agree with the strategy; hide the logic behind a function, and I am sure so does cd. This is not the kind of sql you want to maintain in every query... Although, in a thread like this, the sql is very simple to communicate and reproduce/enhance.
Legend
Guru Guru : 2500 - 1000000 pts
Expert Expert : 1000 - 2499 pts
Pro Pro : 500 - 999 pts
Journeyman Journeyman : 200 - 499 pts
Newbie Newbie : 0 - 199 pts
Oracle ACE Director
Oracle ACE Member
Oracle Employee ACE
Helpful Answer (5 pts)
Correct Answer (10 pts)

Point your RSS reader here for a feed of the latest messages in all forums