A HASH object is very effective for counting unique values (.NUM_ITEMS) when the variable is a key, and for entering into membership via .ADD()
A logic statement with a compiled IN operation is said to be one of the fastest methods for checking set membership.
Use IN to see if a doctors diagnosis is one that is of interest (for say a study)
Use HASH to track unique diagnosis in the study list
Use another HASH to track non-study diagnoses (for yucks)
Example:
Data for 1,000 patients having random number of visits and random number of random diagnoses.
data have;
call streaminit(123);
do patid = 1 to 100;
date = today() - rand('integer',250,1300);
top = rand('integer',5,1000);
do index = 1 by 1 while (date <= today() and index <= top);
array diag(24);
do dindex = 1 to rand('integer',1,rand('integer', dim(diag)));
* possible repeated diagnosis in row dont matter in this example;
* so dont try to prevent them;
diag(dindex) = rand('integer',1,1000);
end;
visitid + 1;
output;
date + rand('integer',0,3);
call missing (of diag(*));
end;
end;
keep patid date diag: visitid;
format date yymmdd10.;
run; * visit count distribution, just a look see; proc sql; create table freq1 as select visit_count, count(*) as freq from ( select patid, count(*) as visit_count from have group by patid ) group by visit_count ;
Per patient, count number of diagnoses matching the study list across all diagnoses of all visits.
%let study_diagnoses = 2,5,11,17,23,31,43,53,61,71,79,89;
data want;
if _n_ = 1 then do;
call missing(dx);
declare hash study_dx();
declare hash other_dx();
study_dx.defineKey('dx');
study_dx.defineDone();
other_dx.defineKey('dx');
other_dx.defineDone();
end;
do until (last.patid);
set have;
by patid;
array dxs diag:;
do index = 1 to dim(dxs) while (not missing(dxs(index)));
dx = dxs(index);
if dx in (&study_diagnoses)
then rc = study_dx.add();
else rc = other_dx.add();
end;
end;
dx_in_study_count = study_dx.num_items;
dx_not_in_study_count = other_dx.num_items;
study_dx.clear();
other_dx.clear();
keep patid dx_:;
run;
... View more