BookmarkSubscribeRSS Feed
Rohit_1990
Calcite | Level 5

Hi All.

 

I have a master table with a column string as shown below :

STRING
49 Laurel Court Mount Vernon NY 10550
9925 W. Greystone Lane Hamdn, CT 06514,HMn
83 South Aldeood Street Middletown, CT 06457
518 Goldfield Drive Bakersfield, CA 93306
8991 Fairground Lane Nutley, NJ 07110
63 Bay Court Elkridge, MD 21075
83 South Alderwood Street Midletown, CT 06457 Bakersfield
8991 Fairground Lane Nutley, NJ 07110 W. Graystoe Lane

 

 

Now There is another table which contain lookup values .

I need to use the lookup values to find the probable matches in the string column of master table.

 

LOOKUP VALUES
Hamden
Middletown
Nutley
W. Graystone Lane
South Alderwood Street

 

So , I need my final table as below

STRINGProbable1Probable2
49 Laurel Court Mount Vernon NY 10550  
9925 W. Greystone Lane Hamdn, CT 06514,HMnHAMDEN 
83 South Aldeood Street Middletown, CT 06457Middletown 
518 Goldfield Drive Bakersfield, CA 93306  
8991 Fairground Lane Nutley, NJ 07110Nutley 
63 Bay Court Elkridge, MD 21075  
83 South Alderwood Street Midletown, CT 06457 BakersfieldMiddletownSouth Alderwood Street
8991 Fairground Lane Nutley, NJ 07110 W. Graystoe LaneNutleyW. Graystone Lane

So we have hamden in lookup table available and the most near match can be hamdn in 2nd record so I poulate 

Hamden in Probable1 column for the record.

 

Like wise in the last record I can see Nutley as well as W. Graystone Lane having aprobable match in it

so I need to populate Nutley in probable1 and W. Graystone Lane(not exact match) in probable2 column.

 

I can do if there is a direct match in the string column but in case of variations things do not workout as expected.

So basically I also need to pull the lookup value if a variation of that is available in string column .

 

Any help on above will be highly appreciated.

 

Regards,

 

 

 

 

 

 

 

 

 

 

 

 

 

2 REPLIES 2
smantha
Lapis Lazuli | Level 10

There are two aspects to this 

1. how do you find similarity of two strings (spedis, soundex functions etc. are some viable options ). Prx (perl regular expressions ) but they do not do a fuzzy match and very memory intensive.

2. Get to the final layout. 

          a. use a hash table and iterate over all the elements you want to find.

          b. create a cartesian join and then do a transpose

Rohit_1990
Calcite | Level 5
Hi Samantha,

Thanks for your reply but the suggested approach does work for direct match (using index function)
but fails to capture variations since variation is within a large string , so can't do a direct spedis or soundex activity for it.
Nonetheless thanks again and any further advise is highy valued.

Regards

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 550 views
  • 0 likes
  • 2 in conversation