rkrocks09
Fluorite | Level 6

Hi All,

 

I really hope someone can help me out; I've reached the edge of my SAS programming knowledge. I'm new to using hash tables, and honestly I'm not sure if what I want to do is possible.

 

I have SEER-Medicare claims data (a very large dataset). Right now it's split into two datasets: those with cancer (cases) and those without cancer (controls). For the cases, we are requiring at least one year of pre-enrollment in Medicare Parts A, B, and D before their cancer diagnosis date. The controls don't have a cancer diagnosis date (obviously), but we want to assign each one a 'proxy diagnosis date' that is the same as the diagnosis date of his or her matched cancer case. We're also matching on birth year +/- 2 yrs and gender. The issue is the one-year pre-enrollment criterion in the controls. When this isn't accounted for during matching, I end up dropping almost half of my matches. (Once the cases and controls are matched, I've been applying the pre-enrollment criterion to the controls using their proxy cancer diagnosis date.)

 

Currently, I have a hash table set up based on this:

 https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Edmonton-User-Group/GeorgeZhu...

 

with a link to a check eligibility step that is its own macro (based loosely on this: 

http://www.sascommunity.org/wiki/Implementing_Control_Selection_Using_Hash_Tables:_A_Case_Study).

 

Everything works great except for the pre-enrollment issue. The enrollment data is set up as a series of arrays that count each year/month in which an individual was enrolled. I'm not sure whether there's something wrong with using arrays inside the macro, or whether I should be referencing my hash table instead. Help?

 

Here's my code:

 

* the main data step;
%let ratio=1;

data _null_;
if _n_=1 then do;
set cases_100(obs=1); *make the variables in Cases data set available in the PDV;

 

*put the cases in a hash table;
declare hash cases(dataset:"cases_100",hashexp:8,ordered:"y");
cases.definekey("case_rand","case_id");
cases.definedata("case_rand","case_id","count1","case_age","case_gender","cancer_dxdt");
cases.definedone();

 

declare hiter hi_cases('cases'); * declare a hash table iterator object;
declare hash matches(); *declare a hash table for matched cases and controls;
matches.definekey("case_id","control_id");
matches.definedata("case_id","control_id","case_age","control_age","cancer_dxdt","control_gender","case_gender",
"a200701","a200702","a200703","a200704","a200705","a200706","a200707","a200708","a200709","a200710","a200711","a200712",
"a200801","a200802","a200803","a200804","a200805","a200806","a200807","a200808","a200809","a200810","a200811","a200812",
"a200901","a200902","a200903","a200904","a200905","a200906","a200907","a200908","a200909","a200910","a200911","a200912",
"a201001","a201002","a201003","a201004","a201005","a201006","a201007","a201008","a201009","a201010","a201011","a201012",
"a201101","a201102","a201103","a201104","a201105","a201106","a201107","a201108","a201109","a201110","a201111","a201112",
"a201201","a201202","a201203","a201204","a201205","a201206","a201207","a201208","a201209","a201210","a201211","a201212",
"a201301","a201302","a201303","a201304","a201305","a201306","a201307","a201308","a201309","a201310","a201311","a201312",
"a201401","a201402","a201403","a201404","a201405","a201406","a201407","a201408","a201409","a201410","a201411","a201412",

"b200701","b200702","b200703","b200704","b200705","b200706","b200707","b200708","b200709","b200710","b200711","b200712",
"b200801","b200802","b200803","b200804","b200805","b200806","b200807","b200808","b200809","b200810","b200811","b200812",
"b200901","b200902","b200903","b200904","b200905","b200906","b200907","b200908","b200909","b200910","b200911","b200912",
"b201001","b201002","b201003","b201004","b201005","b201006","b201007","b201008","b201009","b201010","b201011","b201012",
"b201101","b201102","b201103","b201104","b201105","b201106","b201107","b201108","b201109","b201110","b201111","b201112",
"b201201","b201202","b201203","b201204","b201205","b201206","b201207","b201208","b201209","b201210","b201211","b201212",
"b201301","b201302","b201303","b201304","b201305","b201306","b201307","b201308","b201309","b201310","b201311","b201312",
"b201401","b201402","b201403","b201404","b201405","b201406","b201407","b201408","b201409","b201410","b201411","b201412",

"d200701","d200702","d200703","d200704","d200705","d200706","d200707","d200708","d200709","d200710","d200711","d200712",
"d200801","d200802","d200803","d200804","d200805","d200806","d200807","d200808","d200809","d200810","d200811","d200812",
"d200901","d200902","d200903","d200904","d200905","d200906","d200907","d200908","d200909","d200910","d200911","d200912",
"d201001","d201002","d201003","d201004","d201005","d201006","d201007","d201008","d201009","d201010","d201011","d201012",
"d201101","d201102","d201103","d201104","d201105","d201106","d201107","d201108","d201109","d201110","d201111","d201112",
"d201201","d201202","d201203","d201204","d201205","d201206","d201207","d201208","d201209","d201210","d201211","d201212",
"d201301","d201302","d201303","d201304","d201305","d201306","d201307","d201308","d201309","d201310","d201311","d201312",
"d201401","d201402","d201403","d201404","d201405","d201406","d201407","d201408","d201409","d201410","d201411","d201412");
matches.definedone();

 

*declare a hash table for recording matched controls;
control_id_hash=case_id;
declare hash m_control();
m_control.definekey("control_id_hash");
m_control.definedone();
m_control.clear();
end;

 

set controls_300 end=eof;
control_id_hash=control_id; *get current control_id for searching;
if (m_control.find() ne 0) then do; *not matched to a case yet;
rc=hi_cases.first(); *search cases table using hash iterator object;
do while(rc=0);

 

eligibility=0;
link eligibilityCheckStep;

 

if (count1<&ratio. and eligibility=1) then do;
count1+1;
cases.replace();
matches.add();
m_control.add();
leave;
end;
rc=hi_cases.next();
end;
end;

 

*check if all the cases have matches (ie, count=&ratio.);
done=1;
rc=hi_cases.first();
do while(rc=0);
if count1<&ratio. then do;
done=0;
leave;
end;

rc=hi_cases.next();
end;

 

*if all the cases are matched or run out of controls, output the resulting data sets;
if (done or eof) then do;
matches.output(dataset:"matches");
cases.output(dataset:"matched_cases");
m_control.output(dataset:"matched_controls");
stop;
end;

 

eligibilityCheckStep:
%CheckElig;


run;

 

 

*confirmed this works in a datastep outside of macro and works without the array enrollments*;


%macro CheckElig();
/* match criteria: born within 2 years of each other, same gender, and enrollment info will work*/

yrb4_yr = year(cancer_dxdt-365);
yrb4_mo = month(cancer_dxdt-365);

if yrb4_yr<2007 then return;

array _a(2007:2014, 12) a200701--a201412;
array _b(2007:2014, 12) b200701--b201412;
array _d(2007:2014, 12) d200701--d201412;

*selects enrolled A&B&D 12 months prior to case cancer_dxdt;
if (_a(yrb4_yr, yrb4_mo)>=12) & (_b(yrb4_yr, yrb4_mo)>=12) & (_d(yrb4_yr, yrb4_mo)>=12) then _enroll_pre=1;

 

if abs(case_age-control_age)<= 2 and case_gender=control_gender and _enroll_pre=1 then eligibility=1;

 

%mend;

Patrick
Opal | Level 21

@rkrocks09 

You're posting a lot of information and code here, and I feel it could take me quite a while just to get my head around what it's doing before I even start trying to figure out why it's not doing what you want.

I believe what would really help: post a data step that creates sample data that works with your code. Then show us, based on the sample data, what the desired result should look like (as compared to what we get when just running the code with your sample data). 

rkrocks09
Fluorite | Level 6

Sorry it took me a while to come up with a simulated dataset... I apologize for the wall of text...

 

%let nCase=10000;
%let nControl=100000;
%let ratio=2;
* Generate the Cases data set;
data cases(drop=i);
retain id age gender;
length gender $1;
do i=1 to &nCase.;
id=i;
age=rand('integer', 65, 90);
gender=ifc(ranuni(0)>0.5,"F","M");
output;
end;
run;

data random_date;
mindate='01jan2007'd;
maxdate='31dec2007'd;
range = maxdate-mindate+1;
format mindate maxdate cancer_dxdt date9.;
do i = 1 to 1000;
cancer_dxdt = mindate + int(ranuni(12345)*range);
output;
end;
run;

data cases_1;
merge cases (keep=id age gender)
random_date (keep=cancer_dxdt); run;

* Generate the Controls data set;
data controls(drop=i);
retain id age gender;
length gender $1;
do i=1 to &nControl.;
id=i;
age=rand('integer', 65, 90);
gender=ifc(ranuni(0)>0.5,"F","M");
output;
end;
run;

data controls_1;
set controls;

do J=1 to 12;
RETAIN I(0);
I = I + 1;
ARRAY A7(12) a200701 - a200712;
array B7(12) b200701 - b200712;
array D7(12) d200701 - d200712;

IF I = J THEN A7(J) = .;

IF I = J THEN b7(J) = .;

IF I = J THEN d7(J) = .;
END;
ARRAY NUMS _NUMERIC_;
DO OVER NUMS; END; drop J i;

a200701=rand('integer', 1, 48); *lower bound first, then upper bound;
b200701=rand('integer', 1, 48);
d200701=rand('integer', 1, 48);
RUN;

data controls_2;
set controls_1;

a200702=a200701-1;
a200703=a200702-1;
a200704=a200703-1;
a200705=a200704-1;
a200706=a200705-1;
a200707=a200706-1;
a200708=a200707-1;
a200709=a200708-1;
a200710=a200709-1;
a200711=a200710-1;
a200712=a200711-1;

b200702=b200701-1;
b200703=b200702-1;
b200704=b200703-1;
b200705=b200704-1;
b200706=b200705-1;
b200707=b200706-1;
b200708=b200707-1;
b200709=b200708-1;
b200710=b200709-1;
b200711=b200710-1;
b200712=b200711-1;

d200702=d200701-1;
d200703=d200702-1;
d200704=d200703-1;
d200705=d200704-1;
d200706=d200705-1;
d200707=d200706-1;
d200708=d200707-1;
d200709=d200708-1;
d200710=d200709-1;
d200711=d200710-1;
d200712=d200711-1; run;

data controls_3;
set controls_2;

array _a a200701--a200712;
array _b b200701--b200712;
array _d d200701--d200712;

do i=1 to dim(_a);
if _a(i)<=0 then _a(i)=.; end;

do i=1 to dim(_b);
if _b(i)<=0 then _b(i)=.; end;

do i=1 to dim(_d);
if _d(i)<=0 then _d(i)=.; end; drop i; run;

 

 

 

So, with these datasets, I want to match Cases_1 to Controls_3 by age and gender, accounting for cancer_dxdt. The variables a200701-a200712, b200701-b200712, and d200701-d200712 represent months of insurance enrollment. Each one counts down until the person is no longer enrolled. I need the controls to have at least 2 months of enrollment in a, b, and d, relative to the case's cancer_dxdt, to be matched to that case (so a, b, and d >=2). For example, if a case has a cancer_dxdt of 13May2007, a control would need a200705>=2, b200705>=2, and d200705>=2 to match to that case.

 

Here's my hash table code:

 

*prepare controls*;

data Controls_H;
set Controls_3;
control_rand=ranuni(0);
rename age=control_age gender=control_gender id=control_id;
run;
proc sort data=controls_H; *scramble the control list for randomness;
by control_rand;
run;

 

*prepare cases*;
data Cases_H;
set Cases_1;
case_rand=ranuni(0); *for scramble the order of cases;
rename age=case_age gender=case_gender id=case_id;
count=0; *for recording number of controls matched;
run;

 

 

 

*main data step*;
%let ratio=2;

data _null_;
if _n_=1 then do;
set cases_h(obs=1); *make the variables in Cases data set available in the PDV;

*put the cases in a hash table;
declare hash cases(dataset:"cases_h",hashexp:8,ordered:"y");
cases.definekey("case_rand","case_id");
cases.definedata("case_rand","case_id","count","case_age","case_gender","cancer_dxdt");
cases.definedone();

declare hiter hi_cases('cases'); * declare a hash table iterator object;
declare hash matches(); *declare a hash table for matched cases and controls;
matches.definekey("case_id","control_id");
matches.definedata("case_id","control_id","case_age","control_age","cancer_dxdt","control_gender","case_gender","a200701","a200702","a200703","a200704","a200705","a200706","a200707","a200708","a200709","a200710","a200711","a200712",
"b200701","b200702","b200703","b200704","b200705","b200706","b200707","b200708","b200709","b200710","b200711","b200712",
"d200701","d200702","d200703","d200704","d200705","d200706","d200707","d200708","d200709","d200710","d200711","d200712");
matches.definedone();

*declare a hash table for recording matched controls;
control_id_hash=case_id;
declare hash m_control();
m_control.definekey("control_id_hash");
m_control.definedone();
m_control.clear();
end;

set controls_h end=eof;
control_id_hash=control_id; *get current control_id for searching;
if (m_control.find() ne 0) then do; *not matched to a case yet;
rc=hi_cases.first(); *search cases table using hash iterator object;
do while(rc=0);

eligibility=0;
link eligibilityCheckStep;

if (count<&ratio. and eligibility=1) then do;
count+1;
cases.replace();
matches.add();
m_control.add();
leave;
end;
rc=hi_cases.next();
end;
end;

*check if all the cases have matches (ie, count=&ratio.);
done=1;
rc=hi_cases.first();
do while(rc=0);
if count<&ratio. then do;
done=0;
leave;
end;

rc=hi_cases.next();
end;

*if all the cases are matched or run out of controls, output the resulting data sets;
if (done or eof) then do;
matches.output(dataset:"matches");
cases.output(dataset:"matched_cases");
m_control.output(dataset:"matched_controls");
stop;
end;

eligibilityCheckStep:
%CheckElig;
run;

*SAM confirmed this works in a datastep and works without the array enrollments*;
%macro CheckElig();

mo_b4 = month(cancer_dxdt-60);

array _a a200701--a200712;
array _b b200701--b200712;
array _d d200701--d200712;

*selects enrolled A&B&D 12 months prior to case cancer_dxdt;
if (_a(mo_b4)>=2) & (_b(mo_b4)>=2) & (_d(mo_b4)>=2) then _enroll_pre=1;

if abs(case_age-control_age)<= 2 and case_gender=control_gender and _enroll_pre=1 then eligibility=1;

%mend;

 

 

The problem with the code is in the macro... I need to fix the array elements so that they will work in the macro without a dataset. I think it's doable, I just haven't been able to figure it out.

Patrick
Opal | Level 21

@rkrocks09 

Thanks for posting sample data as fully working code.

 

Just a few things I've seen.

 

1. Use a single dash and not a double dash for variable lists. Only use a double dash if you know exactly what you're doing and why.

From:
array _a a200701 -- a200712;

To:
array _a a200701 - a200712;
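
To make the difference concrete, here is a minimal sketch using a made-up dataset (the dataset demo and the variable other are hypothetical, not from the code above). A single dash builds the list from the numeric suffixes, while a double dash takes every variable positioned between the two names in the dataset, whatever it is called:

```sas
/* Hypothetical demo: "other" sits between the monthly variables */
data demo;
  a200701 = 1;
  other   = 99;
  a200702 = 2;
  a200703 = 3;
run;

data _null_;
  set demo;
  array by_number a200701 - a200703;   /* numbered range: suffixes 200701 to 200703 */
  array by_order  a200701 -- a200703;  /* name range: all variables between the two */
  n_by_number = dim(by_number);
  n_by_order  = dim(by_order);
  put n_by_number= n_by_order=;        /* n_by_number=3 n_by_order=4 */
run;
```

One caveat, as far as I can tell: a numbered range enumerates every integer suffix, so it only suits a run within one year such as a200701 - a200712. A list spanning years (for example a200701 through a201412) would also ask for names like a200713, so there a name range or explicit per-year lists is probably what you want, provided the variable order in the dataset is known.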

 

2. You're calling the macro only once. Get rid of the macro and the link/return logic. Actually, there's a bigger issue: because there is no RETURN before the linked code, the SAS code that the macro generates after the label eligibilityCheckStep: gets executed at least once for every single iteration of the data step.

Proposed change to code:

1. Move the code from the macro to here:
/*        link eligibilityCheckStep;*/
        mo_b4 = month(cancer_dxdt-60);
        array _a a200701-a200712;
        array _b b200701-b200712;
        array _d d200701-d200712;

        *selects enrolled A&B&D 12 months prior to case cancer_dxdt;
        if (_a(mo_b4)>=2) & (_b(mo_b4)>=2) & (_d(mo_b4)>=2) then
          _enroll_pre=1;

        if abs(case_age-control_age)<= 2 and case_gender=control_gender and _enroll_pre=1 then
          eligibility=1;

2. Comment out or remove the two lines below

/*eligibilityCheckStep:*/
/*  %CheckElig;*/
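
For anyone who would rather keep a LINK routine, the fall-through described in point 2 can be avoided with a RETURN ahead of the label. This is only a small stand-alone sketch (the label check and the loop are invented for illustration), not a rewrite of the matching step:

```sas
data _null_;
  do i = 1 to 3;
    link check;          /* jump to the routine, then come back here */
  end;
  put 'main body finished';
  return;                /* stops execution from falling into CHECK */

check:
  put 'in check: ' i=;
  return;                /* returns to the statement after LINK */
run;
```

Without the first RETURN, the statements under the label would run one extra time at the end of every iteration of the data step.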

Now, after these changes, you'll still get an array subscript out of range error. This happens because you're using the variable mo_b4 as the index into your arrays in code like _a(mo_b4).

You populate this variable with the following formula: mo_b4 = month(cancer_dxdt-60); 

The problem is that cancer_dxdt is sometimes missing, so mo_b4 is missing too, and you end up using a missing value as an array index. That's why you currently get this error.

You need to either fix the logic for populating cancer_dxdt, or add a check for a missing mo_b4 and only reference array elements if the value is not missing.
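
A check along those lines could look like the sketch below. It assumes a small made-up input (the dataset demo and its values are hypothetical) with one valid and one missing cancer_dxdt, and it also resets _enroll_pre on every pass so that a value from an earlier comparison cannot carry over:

```sas
/* Hypothetical input: one valid diagnosis date, one missing */
data demo;
  array _a a200701-a200712;
  do k = 1 to 12;
    _a(k) = 13 - k;                   /* months of enrollment remaining */
  end;
  format cancer_dxdt date9.;
  cancer_dxdt = '13may2007'd; output;
  cancer_dxdt = .;            output;
  drop k;
run;

data checked;
  set demo;
  array _a a200701-a200712;
  mo_b4 = month(cancer_dxdt - 60);    /* missing when cancer_dxdt is missing */
  _enroll_pre = 0;                    /* reset before every check */
  /* only index the array when mo_b4 is a valid subscript */
  if 1 <= mo_b4 <= dim(_a) then
    _enroll_pre = (_a(mo_b4) >= 2);
  put cancer_dxdt= mo_b4= _enroll_pre=;
run;
```

The range test is false for a missing mo_b4, so the array is never touched in that case and the row simply ends up with _enroll_pre=0.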

 

And last but not least: is this one-off code, or something you plan to run regularly, possibly with changing variable names?

rkrocks09
Fluorite | Level 6

Thank you SO much for your help! Your suggestions worked! I converted missing values in my array variables to 0 (which I think was one of my biggest problems!), got rid of the macro and moved it into my main code, and I also realized I needed to add all variables I reference in the code into the 'definedata' step of my hash table. In case there are others struggling with a similar problem, here's my full code below:

 

*prepare datasets*;

data controls (keep=control_id control_age control_gender a200701-a201412 b200701-b201412 d200701-d201412 control_rand registry2007-registry2015);
set control_enroll_1;
control_age=year(birth_dt);
control_gender=input(m_sex, best12.); *convert to numeric*;
control_id=patient_id;
control_rand=ranuni(0); run;

data controls_1;
set controls;

*convert missing to 0*;
array _a a200701--a201412;
do i=1 to dim(_a);
if _a[i]=. then _a[i]=0; end;

array _b b200701--b201412;
do i=1 to dim(_b);
if _b[i]=. then _b[i]=0; end;

array _d d200701--d201412;
do i=1 to dim(_d);
if _d[i]=. then _d[i]=0; end;

run;

data cases (keep=case_id cancer_dxdt case_age case_gender case_rand count reg1);
set exposed;
case_age=year(birth_dt);
case_gender=input(M_sex, best12.); *convert to numeric*;
case_id=patient_id;
case_rand=ranuni(0); *for scramble the order of cases;
count=0; run; *for recording number of controls matched;

data cases_1;
set cases;
count1= input(count, best12.); *convert to numeric bc SAS is being DUMB*;
yr_b4=cancer_dxdt-365; format yr_b4 date7.; run;

data cases_2;
set cases_1;
yrb4_yr = year(yr_b4);
yrb4_mo = month(yr_b4); run;


*"case_id","control_id","case_age","control_age","cancer_dxdt","control_gender","case_gender"*;

* the main data step;
%let ratio=2;

data _null_;
if _n_=1 then do;
set cases_2(obs=1); *make the variables in Cases data set available in the PDV;

*put the cases in a hash table;
declare hash cases(dataset:"cases_2",hashexp:8,ordered:"y");
cases.definekey("case_rand","case_id");
cases.definedata("case_rand","case_id","count1","case_age","case_gender","cancer_dxdt","reg1","yrb4_yr","yrb4_mo");
cases.definedone();

declare hiter hi_cases('cases'); * declare a hash table iterator object;
declare hash matches(); *declare a hash table for matched cases and controls;
matches.definekey("case_id","control_id");
matches.definedata("case_id","control_id","case_age","control_age","cancer_dxdt","control_gender","case_gender","reg1","REGISTRY2007","REGISTRY2008","REGISTRY2009","REGISTRY2010",
"REGISTRY2011","REGISTRY2012","REGISTRY2013","REGISTRY2014","yrb4_yr","yrb4_mo",

"a200701","a200702","a200703","a200704","a200705","a200706","a200707","a200708","a200709","a200710","a200711","a200712",
"a200801","a200802","a200803","a200804","a200805","a200806","a200807","a200808","a200809","a200810","a200811","a200812",
"a200901","a200902","a200903","a200904","a200905","a200906","a200907","a200908","a200909","a200910","a200911","a200912",
"a201001","a201002","a201003","a201004","a201005","a201006","a201007","a201008","a201009","a201010","a201011","a201012",
"a201101","a201102","a201103","a201104","a201105","a201106","a201107","a201108","a201109","a201110","a201111","a201112",
"a201201","a201202","a201203","a201204","a201205","a201206","a201207","a201208","a201209","a201210","a201211","a201212",
"a201301","a201302","a201303","a201304","a201305","a201306","a201307","a201308","a201309","a201310","a201311","a201312",
"a201401","a201402","a201403","a201404","a201405","a201406","a201407","a201408","a201409","a201410","a201411","a201412",

"b200701","b200702","b200703","b200704","b200705","b200706","b200707","b200708","b200709","b200710","b200711","b200712",
"b200801","b200802","b200803","b200804","b200805","b200806","b200807","b200808","b200809","b200810","b200811","b200812",
"b200901","b200902","b200903","b200904","b200905","b200906","b200907","b200908","b200909","b200910","b200911","b200912",
"b201001","b201002","b201003","b201004","b201005","b201006","b201007","b201008","b201009","b201010","b201011","b201012",
"b201101","b201102","b201103","b201104","b201105","b201106","b201107","b201108","b201109","b201110","b201111","b201112",
"b201201","b201202","b201203","b201204","b201205","b201206","b201207","b201208","b201209","b201210","b201211","b201212",
"b201301","b201302","b201303","b201304","b201305","b201306","b201307","b201308","b201309","b201310","b201311","b201312",
"b201401","b201402","b201403","b201404","b201405","b201406","b201407","b201408","b201409","b201410","b201411","b201412",

"d200701","d200702","d200703","d200704","d200705","d200706","d200707","d200708","d200709","d200710","d200711","d200712",
"d200801","d200802","d200803","d200804","d200805","d200806","d200807","d200808","d200809","d200810","d200811","d200812",
"d200901","d200902","d200903","d200904","d200905","d200906","d200907","d200908","d200909","d200910","d200911","d200912",
"d201001","d201002","d201003","d201004","d201005","d201006","d201007","d201008","d201009","d201010","d201011","d201012",
"d201101","d201102","d201103","d201104","d201105","d201106","d201107","d201108","d201109","d201110","d201111","d201112",
"d201201","d201202","d201203","d201204","d201205","d201206","d201207","d201208","d201209","d201210","d201211","d201212",
"d201301","d201302","d201303","d201304","d201305","d201306","d201307","d201308","d201309","d201310","d201311","d201312",
"d201401","d201402","d201403","d201404","d201405","d201406","d201407","d201408","d201409","d201410","d201411","d201412");
matches.definedone();

*declare a hash table for recording matched controls;
control_id_hash=case_id;
declare hash m_control();
m_control.definekey("control_id_hash");
m_control.definedone();
m_control.clear();
end;

set controls_1 end=eof;
control_id_hash=control_id; *get current control_id for searching;
if (m_control.find() ne 0) then do; *not matched to a case yet;
rc=hi_cases.first(); *search cases table using hash iterator object;
do while(rc=0);

eligibility=0;

array _a(2007:2014, 12) a200701--a201412;
array _b(2007:2014, 12) b200701--b201412;
array _d(2007:2014, 12) d200701--d201412;

if (_a(yrb4_yr, yrb4_mo)>=12) and (_b(yrb4_yr, yrb4_mo)>=12) and (_d(yrb4_yr, yrb4_mo)>=12) then _enroll_pre=1; else _enroll_pre=0;

if abs(case_age-control_age)<= 2 and case_gender=control_gender and _enroll_pre=1 and (REG1=REGISTRY2007 or REG1=REGISTRY2008 or REG1=REGISTRY2009 or REG1=REGISTRY2010 or REG1=REGISTRY2011
or REG1=REGISTRY2012 or REG1=REGISTRY2013) then eligibility=1;

if (count1<&ratio. and eligibility=1) then do;
count1+1;
cases.replace();
matches.add();
m_control.add();
leave;
end;
rc=hi_cases.next();
end;
end;

*check if all the cases have matches (ie, count=&ratio.);
done=1;
rc=hi_cases.first();
do while(rc=0);
if count1<&ratio. then do;
done=0;
leave;
end;

rc=hi_cases.next();
end;

*if all the cases are matched or run out of controls, output the resulting data sets;
if (done or eof) then do;
matches.output(dataset:"matches_round1");
cases.output(dataset:"matched_cases_round1");
m_control.output(dataset:"matched_controls_round1");
stop;
end;

run;
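
On the point about adding every referenced variable to definedata: a hash iterator only copies the key and data variables of the current item into the PDV, so anything left out keeps whatever value it already had. The sketch below uses a made-up three-variable dataset (cases_demo is hypothetical) and deliberately omits yrb4_mo to show the effect:

```sas
/* Hypothetical demo data */
data cases_demo;
  input case_id case_age yrb4_mo;
  datalines;
1 70 3
2 80 9
;
run;

data _null_;
  if 0 then set cases_demo;              /* define host variables, read nothing */
  declare hash h(dataset:"cases_demo", ordered:"y");
  h.definekey("case_id");
  h.definedata("case_id", "case_age");   /* yrb4_mo deliberately left out */
  h.definedone();
  declare hiter hi("h");

  rc = hi.first();
  do while (rc = 0);
    put case_id= case_age= yrb4_mo=;     /* yrb4_mo prints as missing */
    rc = hi.next();
  end;
  stop;
run;
```

Adding "yrb4_mo" to the definedata call makes the value available on each pass of the iterator, which is presumably why the eligibility check only started working once yrb4_yr and yrb4_mo were listed there.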
