Parsing Street Addresses with Oracle Text.

mcarter-Oracle, Apr 25 2009 (edited Apr 27 2009)
I've been playing around with garage sale ads, trying to build a system that can read an RSS feed and parse the address and date out of each ad. My current method is to scan the ads for known patterns and then geocode those patterns with Google to get a match. However, this falls apart as soon as I reach the daily geocoding limit.

So my next idea was to grab the US TIGER and Canadian NRN files and build my own geocoder. But now that I have the names and locations of every road in North America, I thought I could turn this around: search the ads for actual street names, based on proximity or RSS coverage.

I thought I'd see if Oracle could handle it. I've tried a few approaches, including data mining. Right now I'm playing with 'NEAR(("'||x.street_name||'", $"'||x.street_type||'"), 2, TRUE)', which seems to work rather well.

The questions:
1. OK, now I've found a matching street for the ad. How do I tell where in the text the pattern is? INSTR doesn't really work when the match is on Main N ST instead of Main ST.
2. Usually there is a number in front of the street_name. If I know the start position of the pattern, I can scan for 144, or 144 & 143, or 144 and 143, and so on; you get the picture. Is there a way to have NEAR search for a number near the pattern?
3. How do I do a NEAR with the street direction? $"N" is pretty useless.
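
To make the scan in question 2 concrete, here is a minimal sketch of it in plain Python, outside Oracle. It assumes the start offset of the matched street is already known, which is exactly what question 1 leaves open. The function name, the sample ad text, and the set of accepted separators (&, a comma, or "and") are illustrative assumptions, not anything Oracle Text provides.

```python
import re
from typing import List

# One or more house numbers ending right before the street name,
# e.g. "144", "144 & 143", "144 and 143", "144, 143".
# The separator set is an assumption made for this sketch.
_NUMBERS_BEFORE_STREET = re.compile(
    r"(\d+(?:\s*(?:&|,|and)\s*\d+)*)\s*$",
    re.IGNORECASE,
)


def house_numbers_before(ad_text: str, street_start: int) -> List[str]:
    """Return the house numbers that end just before street_start.

    street_start is the character offset where the matched street
    name begins in ad_text. An empty list means no number was found
    directly in front of the street.
    """
    prefix = ad_text[:street_start]
    match = _NUMBERS_BEFORE_STREET.search(prefix)
    if match is None:
        return []
    # Split the matched run ("144 & 143") into its individual numbers.
    return re.findall(r"\d+", match.group(1))


if __name__ == "__main__":
    ad = "Garage sale Saturday at 144 & 143 Main N ST, starts 9am"
    start = ad.index("Main")  # stand-in for the offset from question 1
    print(house_numbers_before(ad, start))
```

Anchoring the pattern at the end of the prefix keeps unrelated numbers earlier in the ad (times, dates, prices) from being picked up as house numbers.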

Any ideas anyone?
Post Details
Locked on May 25 2009
Added on Apr 25 2009
2 comments
2,008 views