Hi all,
I have two datasets: one with individual IDs and measurement month, and the other with individuals' enrollment months. I wanted to count the number of enrollment months in the past 12 months per measurement month. Could anyone provide some efficient code to do that? Below are the data sets I have (have 1, have2) and the target data file (want). Thank you!
data have1;
infile datalines truncover dsd;
input Person_ID measure_mth;
datalines;
1,202201
1,202202
1,202203
1,202204
1,202205
1,202206
1,202207
1,202208
1,202209
1,202210
1,202211
1,202212
2,202201
2,202202
2,202203
2,202204
2,202205
2,202206
2,202207
2,202208
2,202209
2,202210
2,202211
2,202212
;
run;
data have2;
infile datalines truncover dsd;
input Person_ID enrolled_mth;
datalines;
1,202101
1,202102
1,202103
1,202104
1,202105
1,202106
1,202107
1,202201
1,202202
1,202203
1,202204
1,202205
1,202206
1,202207
1,202208
1,202209
1,202210
1,202211
1,202212
2,202105
2,202106
2,202107
2,202108
2,202109
2,202110
2,202111
2,202112
2,202201
2,202202
2,202203
2,202204
2,202205
2,202206
2,202207
2,202208
2,202209
2,202210
2,202211
2,202212
;
run;
data want;
infile datalines truncover dsd;
input Person_ID measure_mth N_enrolled_mths;
datalines;
1,202201,7
1,202202,7
1,202203,7
1,202204,7
1,202205,7
1,202206,7
1,202207,7
1,202208,8
1,202209,9
1,202210,10
1,202211,11
1,202212,12
2,202201,9
2,202202,10
2,202203,11
2,202204,12
2,202205,12
2,202206,12
2,202207,12
2,202208,12
2,202209,12
2,202210,12
2,202211,12
2,202212,12
;
run;
Assuming you've corrected have1 and have2 along the lines suggested by @Patrick
If have2 and have1 are already sorted by person_id, you can:
data want (drop=enrolled_mth);
set have2 (in=in2) have1 (in=in1);
by person_id;
array n_year_by_month{2021:2024,1:12} _temporary_;
if first.person_id then call missing(of n_year_by_month{*});
if in2 then do _n_=0 to 11;
n_year_by_month{year(enrolled_mth),month(enrolled_mth)}+1;
enrolled_mth=intnx('month',enrolled_mth,1);
end;
if in1;
n_enrolled_mths=n_year_by_month{year(measure_mth),month(measure_mth)};
run;
Two points:
Below solution requires your columns measure_mth and enrolled_mth to store a SAS Date value and not just a number.
I've changed your input statements to create such sample data. You would need to do something similar for your real data.
Once you've got SAS Date values you can use function intck() to join the data for your rolling counts.
proc sql;
/* create table want2 as*/
select
t1.person_id
,t1.measure_mth
,count(*) as n_enrolled_mths
from
have1 t1 left join have2 t2
on
t1.person_id=t2.person_id
and intck('month',t2.enrolled_mth,t1.measure_mth) between 0 and 11
group by t1.person_id, t1.measure_mth
;
quit;
Assuming you've corrected have1 and have2 along the lines suggested by @Patrick
If have2 and have1 are already sorted by person_id, you can:
data want (drop=enrolled_mth);
set have2 (in=in2) have1 (in=in1);
by person_id;
array n_year_by_month{2021:2024,1:12} _temporary_;
if first.person_id then call missing(of n_year_by_month{*});
if in2 then do _n_=0 to 11;
n_year_by_month{year(enrolled_mth),month(enrolled_mth)}+1;
enrolled_mth=intnx('month',enrolled_mth,1);
end;
if in1;
n_enrolled_mths=n_year_by_month{year(measure_mth),month(measure_mth)};
run;
Two points:
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.