Help using Base SAS procedures

Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Reply
Frequent Contributor
Posts: 125

Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Actual question:

How to use proc sql's select case/when statements with macro by modifying the code recipe in the reference by Glenn Wright? The code that I am trying to modify here will be found on the page #10 in the reference. I also enclosed my modified code that doesn't produce data "similarity_score".

What am I doing wrong here?

 

Reference: Glenn Wright, Probabilistic Record Linkage, see page #10.

http://www.wuss.org/proceedings11/Papers_Wright_G_76128.pdf

 

Background:

My objective is to conduct proabilistic record linkage to reduce the duplicated entries of patients in hospital discharge data (NY). In the absence of personal variables such as name, address, unique idnetity code, I use gender, date of birth, patient facility identifier, zip and race. The bottom line is that any rows in the data that match to each other in combination of these five variables are the most likely to represent the same person rather different. Vice versa. SAS version: 9.4 (TS1M3).

 

"Sample" data:

data sample;
   input SEX $ DOB PFI ZIP RACE;
datalines;
F	189804	125	10111	1
M	189802	126	12365	2
F	189801	129	12369	1
M	190111	256	14236	2
F	190302	101	12144	1
M	190307	115	12203	2
M	190307	115	12203	2
M	189806	356	18963	1
F	190308	258	14789	2
F	190302	101	12144	1
;
run;

Code modified from Glenn Wright: (please see reference attached)

 

data mydata_clean (index = (obs_num sex dob pfi zip racecat)); set sample;
obs_num = _n_;
run;

%macro blocking_var(myvar);
create table similarity_score(compress=yes) as
select 
 a.obs_num    as obs_num_1,  b.obs_num as obs_num_2, 
 a.sex        as sex_1,      b.sex as sex_2, 
 a.dob        as dob_1,      b.dob as dob_2,
 a.pfi        as pfi_1,      b.pfi as pfi_2,
 a.zip        as zip_1,      b.zip as zip_2,
 a.racecat    as racecat_1,  b.racecat as racecat_2,

select 
case 
when a.sex = b.sex then 1
else -5.64 end as sex_score,

case 
when a.pfi = b.pfi then 7.6
else -4.3 end as pfi_score,

case 
when a.dob = b.dob then 3.5
else -4.2 end as dob_score,

case 
when a.racecat = b.racecat then 1
else -3.3 end as racecat_score

case 
when a.zip = b.zip then 10.6
else -4.3 end as zip_score

calculated sex_score + calculated pfi_score +
calculated dob_score + calculated racecat_score + calculated zip_score as score

from mydata_clean as a INNER JOIN mydata_clean as b
on a.obs_num > b.obs_num
and a.&myvar. = b.&myvar.
where a.&myvar. is not missing
and calculated score=<-10
%mend;

data threshold;
set similarity_scores;
where score =<-12;
run;

Reference:

Glenn Wright's Probabilistic Record Linkage in SAS®, offers excellent solution to  address the problem almost identical to mine. Any hints highly appreciated. 

 

 

Super User
Posts: 17,813

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

I might be missing something, but you defined the macro but never actually ran it. 

 

There should be a statement somewhere that looks like:

 

Proc sql;
%block_var(BLOCKVAR);
quit;

 

Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Thanks Reeza,

Using software is not permitted in this project unfortunately. I tried taking in and out the proc sql and quit with many different variations, but still doesn't work. Wish I had sql experience but not. Please see edited code. What do you think?

 

proc sql;
%macro blocking_var(myvar);
create table similarity_score(compress=yes) as
select
 a.obs_num    as obs_num_1,  b.obs_num as obs_num_2,
 a.sex        as sex_1,      b.sex as sex_2,
 a.dob        as dob_1,      b.dob as dob_2,
 a.pfi        as pfi_1,      b.pfi as pfi_2,
 a.zip        as zip_1,      b.zip as zip_2,
 a.racecat    as racecat_1,  b.racecat as racecat_2,

select
case
when a.sex = b.sex then 1
else -5.64 end as sex_score,

case
when a.pfi = b.pfi then 7.6
else -4.3 end as pfi_score,

case
when a.dob = b.dob then 3.5
else -4.2 end as dob_score,

case
when a.racecat = b.racecat then 1
else -3.3 end as racecat_score

case
when a.zip = b.zip then 10.6
else -4.3 end as zip_score

calculated sex_score + calculated pfi_score +
calculated dob_score + calculated racecat_score + calculated zip_score as score

from mydata_clean as a INNER JOIN mydata_clean as b
on a.obs_num > b.obs_num
and a.&myvar. = b.&myvar.
where a.&myvar. is not missing
and calculated score=<-10
;
quit;
%mend; 

 

Super User
Posts: 17,813

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Remove the macro components, get it working and then turn it into a macro. As is you're trying to debug too many things. My code was an example of how to use the macro. 

 

If you're unfamiliar with the structure of the code the paper the paper did have full examples. Make sure it's the same format. 

Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Thanks. Working as suggested.
Respected Advisor
Posts: 4,644

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Your code has (at least) 2 syntax errors which might go unnoticed when run inside a macro:

 

1) the second select keyword shouldn't be there

2) there should be commas after as zip_score and as racecat_score

 

As @Reeza suggested, get the code to work outside the macro before you run it as a macro. It is much easier to debug that way.

PG
Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Thanks. Good catches indeed.
Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Hi PG,

I had proc sql worked outside the macro. But it loses its function when i throw it into macro. I run it but nothing happens. No data set created. What am I doing wrong here? I appreciate your time.

 

%macro blocking_var(myvar);
proc sql; 
 create table similarity_scores (compress=yes) as 
 select
 a.obs_num    as obs_num_1,  b.obs_num as obs_num_2, 
 a.sex        as sex_1,      b.sex as sex_2, 
 a.dob        as dob_1,      b.dob as dob_2,
 a.pfi        as pfi_1,      b.pfi as pfi_2,
 a.zip        as zip_1,      b.zip as zip_2,
case when a.sex = b.sex then 2
     else -5 end as sex_score,
case when a.dob = b.dob then 12
     else -6 end as dob_score,
case when a.pfi = b.pfi then 10
     else -5 end as pfi_score,
case when a.zip = b.zip then 10
     else -5 end as zip_score,

calculated sex_score + calculated dob_score  + calculated pfi_score +calculated zip_score 
as total_score 
from
mydata_clean as a INNER JOIN mydata_clean as b  
 on a.obs_num > b.obs_num
 and a.&myvar.=b.&myvar.
 where a.&myvar. is not missing
 and calculated score <=-10  
;
quit;
%mend;
Super User
Posts: 17,813

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Back to my first comment - where do you call/execute the macro?

Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

%macro blocking_var(myvar);
proc sql; 
 "union"
quit;
%mend;

Hi Reeza,

 

My last code posted included above macro elements, isn't it?

Super User
Posts: 17,813

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

No. That's a macro definition, you still need to execute the macro.

Macro definition

%macro print_demo(dataset_name);

proc print data=&dataset_name (obs=10);
run;

%mend;

Macro execution:

 

%print_demo(sashelp.cars);
Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Like so?

%blocking_var(sex,dob,pfi,zip);
Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

[ Edited ]

I get:

 

ERROR: More positional parameters found than defined.

 

when I add:

%blocking_var(sex,dob,pfi,zip); 

I also tried: 

%blocking_var("sex","dob","pfi","zip");

still get the same error as above

Frequent Contributor
Posts: 125

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

Hi ya'll,

 

Still working on it. I really have to get it working.

Now I have macro definition and executions parts and the error I get is:

 

error.png

 

%macro blocking_var(myvar);
proc sql; 
 create table similarity_scores (compress=yes) as 
 select
a.obs_num    as obs_num_1,  b.obs_num as obs_num_2, 
 a.sex        as sex_1,      b.sex as sex_2, 
 a.dob        as dob_1,      b.dob as dob_2,
a.pfi        as pfi_1,      b.pfi as pfi_2,
a.zip        as zip_1,      b.zip as zip_2,
case when a.sex = b.sex then 2
     else -5 end as sex_score,
case when a.dob = b.dob then 12
     else -6 end as dob_score,
case when a.pfi = b.pfi then 10
     else -5 end as pfi_score,
case when a.zip = b.zip then 10
     else -5 end as zip_score,

calculated sex_score + calculated dob_score  + calculated pfi_score +calculated zip_score 
as total_score 
from
mydata_clean as a INNER JOIN mydata_clean as b  
 on a.obs_num > b.obs_num
and a.&myvar.=b.&myvar.
where a.&myvar. is not missing
and calculated score <=-10  
;
quit;
%mend blocking_var;
%blocking_var(sex); 
%blocking_var(dob);
%blocking_var(pfi);
%blocking_var(zip);

Any hints?

Thank you!!!

 

 

Super User
Posts: 17,813

Re: Union join Proc Sql in Macro with select case when statements for Probabilistic Record Linkage

PS. Your first step should be getting the code to work exactly as defined before making changes to it. 

Also, you can take a look at the The Link King which offers a tool to create linkages based on SAS code as well. 

 

 

Ask a Question
Discussion stats
  • 19 replies
  • 344 views
  • 5 likes
  • 4 in conversation