Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Table Join Cartesian and String Match

reddy482Apr 27 2017 — edited May 2 2017

Hello

I working on the doing an application to customer match , where an application can have multiple customers  (corporate)  and customer can have multiple applications.

so i am doing a cartesian join and using UTL_MATCH .JARO_WINKLER_SMILIARTY for string matching.

Table A (applications) Approximatly 20,000 per day

Table B (CusomterPfile) approximatly 400,000 per month.

I have to match apps for month, so intermeidate volume is 20,000*400,000 = 8,000,000,000 and final data set with JW_SCORE>970

Its time consuming to create the table of 8billion and fillter for200K Approx having  jw_score>970.

And also to filter in sub querry. please suggest a better approach  .

CREATE TABLE tablea

(

  APP_ID           NUMBER(14),

  APP_LOAD_DATE  DATE,

  SSN            VARCHAR2(20 BYTE),

  DOB            DATE,

  FIRST_NAME     VARCHAR2(50 BYTE),

  LAST_NAME      VARCHAR2(50 BYTE),

  ADDR           VARCHAR2(92 BYTE),

  CITY           VARCHAR2(50 BYTE),

  STATE          VARCHAR2(20 BYTE),

  ZIP            VARCHAR2(20 BYTE)

);

CREATE TABLE tableb

(

  ACCOUNT_NUMBER                 NUMBER(9)                 NOT NULL,

  DOB                 DATE,

  SSN                 NUMBER(9),

  FIRST_NAME          VARCHAR2(32 BYTE),

  LAST_NAME           VARCHAR2(60 BYTE),

  ADDR                VARCHAR2(100 BYTE),

  CITY                VARCHAR2(26 BYTE),

  STATE               CHAR(2 BYTE),

  ZIP                 CHAR(5 BYTE),

  ZIP_4               CHAR(4 BYTE)

);

SELECT A.* , JW_SSN+JW_DOB+JW_FNAME+JW_LNAME+JW_ADDR AS JW_SCORE   FROM  (

SELECT /*+ PARALLEL(16) */

      a.appid

     , a.ssn a_ssn

     , b.ssn b_ssn

     , a.dob AS a_dob

     , b.dob AS b_dob

     , a.first_name AS a_first_name

     , b.first_name AS b_first_name

     , a.last_name AS a_last_name

     , b.last_name AS b_last_name

     , A.FIRST_NAME ||' '|| A.LAST_NAME AS A_FULLNAME

     , B.FIRST_NAME ||' ' ||B.LAST_NAME  AS B_FULLNAME 

     , a.addr AS a_addr

     , b.addr b_addr

     , a.city AS a_city

     , b.city b_city

     , a.state AS a_state

     , b.state b_state

     , a.zip AS a_zip

     , b.zip b_zip

     , ( CASE WHEN ( A.ssn IS NULL or B.SSN IS NULL ) THEN 100

            WHEN ( ( A.SSN = 111111111 OR B.SSN = 111111111)) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity (  LPAD( A.ssn,9,0) , LPAD( b.ssn ,9,0))

      END)* 3 as JW_SSN

       ,( CASE WHEN ( A.dob IS NULL OR B.dob IS NULL ) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity ( A.dob ,B.dob)

      END )*2 as JW_DOB

      ,( CASE WHEN ( A.first_name IS NULL OR B.first_name IS NULL ) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity ( A.first_name ,B.first_name)

      END ) *2as JW_FNAME

      , (CASE WHEN ( A.last_name IS NULL OR B.last_name IS NULL ) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity ( A.last_name ,B.last_name)

      END) *2 as JW_LNAME

      , CASE WHEN (  A.FIRST_NAME || A.LAST_NAME IS OR AND B.FIRST_NAME||B.LAST_NAME IS NULL ) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity (  A.FIRST_NAME ||' '|| A.LAST_NAME , B.FIRST_NAME ||' ' ||B.LAST_NAME)

      END as JW_FULLNAME

     , ( CASE WHEN ( A.addr IS NULL AND B.addr IS NULL ) THEN 100

            ELSE UTL_MATCH.jaro_winkler_similarity ( A.addr ,B.addr)

      END) * 1 as JW_ADDR

  FROM tablea a, tableb b

WHERE     1 = 1

AND b.start_service_date >= a.app_load_date

  ) A

WHERE JW_SSN+JW_DOB+JW_FNAME+JW_LNAME+JW_ADDR > 970;

Please suggest.

Thanks in Advance.

This post has been answered by Mustafa KALAYCI on May 1 2017
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on May 30 2017
Added on Apr 27 2017
19 comments
759 views