Skip to Main Content

Search a clob for a pattern

PugzlySep 11 2021

I am trying to create a function that searches a CLOB for a specific term. In this case 'hello world' ONLY!!!

Note some of the text within the CLOB can contain LF and some may not. I provided some sample data off both.

In this particular example I'm looking to search through the CLOB to find the starting and ending positions of the phrase 'hello world'  ONLY.

I know to find the starting position I need to use the INSTR command and the ending position should be INSTR+LENGTH of the phrase I'm searching for.

I haven't added that code yet as I am unable to get the parsing working correctly yet.

If you run the sample data I provided you can I'm getting back more than the phrase 'hello world'

Once the parse works, I'm looking to generate the following output (see below).

id  start_pos  end_pos  search_term

Note in my function the number 4000 is hard coded as I recall that being the max limit of a VARCHAR2. I think it's been extended to 32767 in 12c and up but that should be a non issue here as I only have a small amount of data.

Any help would be greatly appreciated. Below is my  test CASE and im testing on livesql so my environment can be replicated if needed.

create table t ( x int primary key, y clob );

insert into t values ( 1, 'line 1 line 2 line 3 line 4 line 5 hello world' );

insert into t values ( 2, 'line 1 hello world line 2 line 3 line 4 line 5 hello world' );

insert into t values ( 3, 'line 1 hello world line 2 line 3 line 4 line 5

hello world' );

declare
 l_text long;
 begin
 for i in 1 .. 3000
 loop
 l_text := l_text || 'line ' || i || chr(10);
 end loop;
 l_text := l_text || 'hello world 3001';
 insert into t values ( 4, l_text );
 l_text := l_text || chr(10) || 'line 3002';
 insert into t values ( 5, l_text );
 l_text := 'hello world 0' || chr(10) || l_text;
 insert into t values ( 6, l_text );
 end;

create or replace function search_clob( p_lob in clob, p_what in varchar2 ) return varchar2
 as
 l_text long;
 l_instr number;
 begin
 l_instr := dbms_lob.instr( p_lob, p_what );
 l_instr := dbms_lob.instr( p_lob, chr(10), l_instr );
 if ( l_instr = 0 ) then l_instr := dbms_lob.getlength(p_lob); end if;
 if ( l_instr < 4000 )
 then
 l_text := rtrim(chr(10) || dbms_lob.substr( p_lob, l_instr, 1 ), chr(10));
 else
 l_text := rtrim(chr(10) || dbms_lob.substr( p_lob, 4000, l_instr-3999 ), chr(10));
 end if;
 return substr( l_text, instr(l_text,chr(10), -1, 1 )+1 );
 end;
 

select x,search_clob( y, 'hello world' ) y, dbms_lob.getlength(y) len
 from t
 where dbms_lob.instr( y, 'hello world' ) > 0;
Comments
Post Details
Added on Sep 11 2021
19 comments
1,127 views