BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lichee
Quartz | Level 8

Hi all,

 

I have two datasets: one with individual IDs and measurement month, and the other with individuals' enrollment months. I wanted to count the number of enrollment months in the past 12 months per measurement month. Could anyone provide some efficient code to do that? Below are the data sets I have (have 1, have2) and the target data file (want). Thank you!

 

data have1;
infile datalines truncover dsd;
input Person_ID measure_mth;
datalines;
1,202201
1,202202
1,202203
1,202204
1,202205
1,202206
1,202207
1,202208
1,202209
1,202210
1,202211
1,202212
2,202201
2,202202
2,202203
2,202204
2,202205
2,202206
2,202207
2,202208
2,202209
2,202210
2,202211
2,202212
;
run;

data have2;
infile datalines truncover dsd;
input Person_ID enrolled_mth;
datalines;
1,202101
1,202102
1,202103
1,202104
1,202105
1,202106
1,202107
1,202201
1,202202
1,202203
1,202204
1,202205
1,202206
1,202207
1,202208
1,202209
1,202210
1,202211
1,202212
2,202105
2,202106
2,202107
2,202108
2,202109
2,202110
2,202111
2,202112
2,202201
2,202202
2,202203
2,202204
2,202205
2,202206
2,202207
2,202208
2,202209
2,202210
2,202211
2,202212
;
run;

data want;
infile datalines truncover dsd;
input Person_ID measure_mth N_enrolled_mths;
datalines;
1,202201,7
1,202202,7
1,202203,7
1,202204,7
1,202205,7
1,202206,7
1,202207,7
1,202208,8
1,202209,9
1,202210,10
1,202211,11
1,202212,12
2,202201,9
2,202202,10
2,202203,11
2,202204,12
2,202205,12
2,202206,12
2,202207,12
2,202208,12
2,202209,12
2,202210,12
2,202211,12
2,202212,12
;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Assuming you've corrected have1 and have2 along the lines suggested by @Patrick 

 

If have2 and have1 are already sorted by person_id, you can:

  1. Read all have2 obs for a given id, and add to a two-way array (year by month) of counters from 0 to 11 months forward from enrolled_mth.
  2. Read all have1 obs for the same id and retrieve the value from the array of counters corresponding to measure_mth, and output.
data want (drop=enrolled_mth);
  set have2 (in=in2)   have1 (in=in1);
  by person_id;

  array n_year_by_month{2021:2024,1:12} _temporary_;

  if first.person_id then call missing(of n_year_by_month{*});

  if in2 then do _n_=0 to 11;
    n_year_by_month{year(enrolled_mth),month(enrolled_mth)}+1;
    enrolled_mth=intnx('month',enrolled_mth,1);
  end;

  if in1;
  n_enrolled_mths=n_year_by_month{year(measure_mth),month(measure_mth)};
run;

Two points:

  1. Make sure the array bounds cover from the earlier of the minimum enroll_mth or minimum measure_mth through the maximum of measure_mth or 11 months beyond the maximum enroll_mth.
  2. It doesn't matter whether have1 or have2 are chronologically sorted within person_id.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
Patrick
Opal | Level 21

Below solution requires your columns measure_mth and enrolled_mth to store a SAS Date value and not just a number. 

I've changed your input statements to create such sample data. You would need to do something similar for your real data.

Patrick_0-1690592831168.png  Patrick_1-1690592860426.png

 

Once you've got SAS Date values you can use function intck() to join the data for your rolling counts.

proc sql;
/*  create table want2 as*/
  select
    t1.person_id
    ,t1.measure_mth
    ,count(*) as n_enrolled_mths
  from 
    have1 t1 left join have2 t2
    on 
      t1.person_id=t2.person_id
      and intck('month',t2.enrolled_mth,t1.measure_mth) between 0 and 11
  group by t1.person_id, t1.measure_mth
  ;
quit;

 

mkeintz
PROC Star

Assuming you've corrected have1 and have2 along the lines suggested by @Patrick 

 

If have2 and have1 are already sorted by person_id, you can:

  1. Read all have2 obs for a given id, and add to a two-way array (year by month) of counters from 0 to 11 months forward from enrolled_mth.
  2. Read all have1 obs for the same id and retrieve the value from the array of counters corresponding to measure_mth, and output.
data want (drop=enrolled_mth);
  set have2 (in=in2)   have1 (in=in1);
  by person_id;

  array n_year_by_month{2021:2024,1:12} _temporary_;

  if first.person_id then call missing(of n_year_by_month{*});

  if in2 then do _n_=0 to 11;
    n_year_by_month{year(enrolled_mth),month(enrolled_mth)}+1;
    enrolled_mth=intnx('month',enrolled_mth,1);
  end;

  if in1;
  n_enrolled_mths=n_year_by_month{year(measure_mth),month(measure_mth)};
run;

Two points:

  1. Make sure the array bounds cover from the earlier of the minimum enroll_mth or minimum measure_mth through the maximum of measure_mth or 11 months beyond the maximum enroll_mth.
  2. It doesn't matter whether have1 or have2 are chronologically sorted within person_id.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 739 views
  • 1 like
  • 3 in conversation