Hello,
I want to find exact matches in two sets of book titles without regard to case-sensitive and abbreviations. Also I am new to SAS.
Title 1:
AM J CLIN ONCOL CANC
ADV DRUG DELIVER REV
ADV ECOL RES
ABDOM RADIOL
ADV ELECTRON MATER
ENDOCR RELAT CANCER
MAYO CLIN PROC
MBIO
MCN AM J MATERN CHIL
Title 2:
Abdom Radiol (NY)
ACS Appl Mater Interfaces
Adv Drug Deliv Rev
AJR Am J Roentgenol
Am J Clin Oncol
Endocr Relat Cancer
***Expected Output something like this that shows exact title matches disregarding case-sensitve or abbreviations
Exact matches
ADV DRUG DELIVER REV : Adv Drug Deliv Rev
ABDOM RADIOL : Abdom Radiol (NY)
ENDOCR RELAT CANCER : Endocr Relat Cancer
AM J CLIN ONCOL CANC: Am J Clin Oncol
How to write a code that does this?
Thanks!
Its the same code as above, if you want only exact match variable , you can change sql query select statement and keep only Exact_Match variable. And am not sure why your getting only title, any error or warning in logs?
PROC SQL;
Create table dummy as
select catx(' ',title1,' : ',title2) as Exact_match
from (
select a.*,
substr(compress(lowcase(title1),' '),1,min_len) as temp_title1,
substr(compress(lowcase(title2),' '),1,min_len) as temp_title2,
spedis(calculated temp_title1, calculated temp_title2) as dis
from (
Select *,
length(compress(title1,' ')) as len1,
length(compress(title2,' ')) as len2,
case when calculated len1 < calculated len2 then calculated len1
when calculated len2 < calculated len1 then calculated len2
else calculated len1 end as min_len
from A as x cross join B as y
) a
where calculated temp_title1 = calculated temp_title2 or calculated dis<=15
)
;
Quit;
I tried running the code and got this error:
Though the error isn't there, but I'm not getting results. Can you share what your result look like?
Thanks
Snap shot of output.
Can you post your code?
I'm not getting the same output as yours when I ran it. It only shows the title 1 data.
Here's the code I used:
data a;
input title1 & $100.;
datalines;
AM J CLIN ONCOL CANC
ADV DRUG DELIVER REV
ADV ECOL RES
ABDOM RADIOL
ADV ELECTRON MATER
ENDOCR RELAT CANCER
MAYO CLIN PROC
MBIO
MCN AM J MATERN CHIL
;
run;
data b;
input title2 & $100.;
datalines;
Abdom Radiol (NY)
ACS Appl Mater Interfaces
Adv Drug Deliv Rev
AJR Am J Roentgenol
Am J Clin Oncol
Endocr Relat Cancer
;
run;
PROC SQL;
Create table dummy as
select title1, title2, catx(' ',title1,' : ',title2) as Exact_match
from (
select a.*,
substr(compress(lowcase(title1),' '),1,min_len) as temp_title1,
substr(compress(lowcase(title2),' '),1,min_len) as temp_title2,
spedis(calculated temp_title1, calculated temp_title2) as dis
from (
Select *,
length(compress(title1,' ')) as len1,
length(compress(title2,' ')) as len2,
case when calculated len1 < calculated len2 then calculated len1
when calculated len2 < calculated len1 then calculated len2
else calculated len1 end as min_len
from A as x cross join B as y
) a
where calculated temp_title1 = calculated temp_title2 or calculated dis<=15
)
;
Quit;
Its the same code as above, if you want only exact match variable , you can change sql query select statement and keep only Exact_Match variable. And am not sure why your getting only title, any error or warning in logs?
PROC SQL;
Create table dummy as
select catx(' ',title1,' : ',title2) as Exact_match
from (
select a.*,
substr(compress(lowcase(title1),' '),1,min_len) as temp_title1,
substr(compress(lowcase(title2),' '),1,min_len) as temp_title2,
spedis(calculated temp_title1, calculated temp_title2) as dis
from (
Select *,
length(compress(title1,' ')) as len1,
length(compress(title2,' ')) as len2,
case when calculated len1 < calculated len2 then calculated len1
when calculated len2 < calculated len1 then calculated len2
else calculated len1 end as min_len
from A as x cross join B as y
) a
where calculated temp_title1 = calculated temp_title2 or calculated dis<=15
)
;
Quit;
I added this statement and it worked!
proc print noobs;run;
Not sure why, but I'd be curious to know why.
Thank you again for your help!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.