Hello SAS community,
I have 13 months span data but I need to know if the person is pre-enrolled in first month before the start of 12months span.
For example: I have data from 201712 to 201812, 201712 is to flag if the person is enrolled before or the new enrollment.
I want to have new variables which are defined as:
Total_Elig_Enrolled_Months - Total no.of months person is enrolled over the 12 months span excluding the pre-enrolled month.
Total_missing_months - total no.of span months(default it is 12) subtracting with total_elig_Enrolled_Months
Enroll_Continous_months - total no.of months enrolled consecutively over 12 months span excluding the first month.
No.of_times_discontinued - data discontinued for the consecutive months, if person has enrolled from 201801 to 201810 and discontinued for a month and then enrolled back again from 201812 then the no of discontinued times is '1', where as if the person is not enrolled from 201801 but enrolled in 201805 to 201810 and then from 201812 then the no.of times discontinued should be '2'.
Enrolled_all_12_months - If the person is enrolled in all 12 months of span.
New_enrollment - If person is not enrolled in first month which is 201712 then the person is considered as new enrollment, if present in 201712 then person is not newly enrolled into the program.
I appreciate the help. I am new to SAS and in learning phase. Thanks again for the help.
Source data:
Person_id | region code | Enroll_month |
101 | 10001 | 201712 |
101 | 10001 | 201801 |
101 | 10001 | 201802 |
101 | 10001 | 201803 |
101 | 10001 | 201804 |
101 | 10001 | 201812 |
102 | 10003 | 201712 |
102 | 10003 | 201805 |
102 | 10003 | 201806 |
102 | 10003 | 201807 |
102 | 10003 | 201808 |
102 | 10003 | 201809 |
102 | 10003 | 201810 |
103 | 10009 | 201812 |
Expected Data:
Person_id | region_code | Total_Elig_enrolled_months | Total_Missing Months | Enroll_Continous_Months | No.of_times_discontinued | Enrolled_all_12_months | New_Enrollment |
101 | 10001 | 5 | 7 | 4 | 1 | No | No |
102 | 10003 | 5 | 7 | 5 | 2 | No | No |
103 | 10009 | 5 | 11 | 1 | 2 | No | Yes |
Hi @sri21592
Here is an attempt to compute the various flags.
According to you data, it is not logical to set the Enroll_Continous_Months to 1 and No.of_times_discontinued to 2 for patient 103.
In addition, for patient 102, Total_Elig_enrolled_months should be 6 as well as Total_missing_months.
data have;
infile datalines dlm="09"x;
input Person_id region_code Enroll_month YYMMn6.;
format Enroll_month YYMMn6.;
datalines;
101 10001 201712
101 10001 201801
101 10001 201802
101 10001 201803
101 10001 201804
101 10001 201812
102 10003 201712
102 10003 201805
102 10003 201806
102 10003 201807
102 10003 201808
102 10003 201809
102 10003 201810
103 10009 201812
;
run;
/* Preparation */
%let start = 201712;
/* List all participants: dataset "list" */
proc sql;
create table list as
select distinct Person_id, region_code
from have;
quit;
/* Retrieve all months and flag observations accordingly: dataset "have_all"*/
/* flag_Enroll_month = set the month of enrollment to 1 */
/* flag_Enroll = set months of participation to 1 */
data list_month;
set list;
format Enroll_month YYMMn6.;
do i=1 to 13;
Enroll_month = intnx('month',input("&start",YYMMn6.),i-1,"b");
output;
end;
drop i;
run;
data have_flag;
set have;
by Person_id region_code Enroll_month;
if first.region_code then flag_Enroll_month = 1;
else flag_Enroll_month = .;
run;
data enroll;
set have_flag;
where flag_Enroll_month=1;
drop flag_Enroll_month;
rename Enroll_month = initial_month;
run;
data have_pre_all;
merge list_month (in=x) have_flag (in=y);
by Person_id region_code Enroll_month;
if y=1 then flag_enroll = 1;
else flag_enroll = 0;
format Enroll_month YYMMn6.;
run;
data have_all;
merge have_pre_all enroll;
by Person_id region_code;
run;
/* Flag New_enrollment */
proc sql;
create table want1 as
select Person_id, region_code,
case when put(Enroll_month,YYMMn6.) = "&start" then 'No'
else 'Yes'
end as New_enrollment
from have_all
where flag_Enroll_month = 1;
run;
/* Flag Total_Elig_Enrolled_Months
Enrolled_all_12_months
Total_missing_months */
proc sql;
create table want2 as
select Person_id, region_code,
sum(flag_enroll) as Total_Elig_Enrolled_Months,
12 - sum(flag_enroll) as Total_missing_months,
case when sum(flag_enroll) < 12 then 'No'
when sum(flag_enroll) = 12 then 'Yes'
end as Enrolled_all_12_months
from have_all
where put(Enroll_month,YYMMn6.) ne "&start"
group by Person_id, region_code;
quit;
/* Flag Flag No.of_times_discontinued */
data have_disc;
set have_all;
by Person_id region_code Enroll_month;
_lag = lag(flag_enroll);
if flag_enroll=0 and _lag=1 then output;
run;
proc sql;
create table want3 as
select Person_id, region_code, count(*) as No_of_times_discontinued
from have_disc
group by Person_id, region_code;
quit;
/* Enroll_Continous_months */
data have_continous;
set have_all (where=(Enroll_month>initial_month));
by Person_id region_code flag_enroll notsorted;
if first.flag_enroll then count = 0;
count + 1;
run;
proc sql;
create table want4 as
select Person_id, region_code, max(count) as Enroll_Continous_months
from have_continous
where flag_enroll=1
group by Person_id, region_code;
quit;
/* Final dataset */
data want;
merge want1 want2 want3 want4;
run;
Since you are new, probably the most valuable thing you can do, both for learning and for solving this particular problem, is to create a SAS data set with the raw data - i.e. don't worry yet about generating the indicators. And that would give us material to better help you.
So take a look at https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... which shows you how to construct a SAS DATA step to include in your message. Then we can use that data step to give advice.
In particular consider how you want to code your month indicators. You might want to input them as sas DATE values rather than the 6-digit YYYYMM numbers that you have listed. That's because 201712 plus 1 is NOT 201801, but the sas date value for, say, '31DEC2017'd plus one month (via the INTNX command) can trivially generate '31JAN2018'd (or '01jan2018'd if preferred). Which in turn means you can more easily trace continuous monthly enrollment across year boundaries.
Hi @sri21592
Here is an attempt to compute the various flags.
According to you data, it is not logical to set the Enroll_Continous_Months to 1 and No.of_times_discontinued to 2 for patient 103.
In addition, for patient 102, Total_Elig_enrolled_months should be 6 as well as Total_missing_months.
data have;
infile datalines dlm="09"x;
input Person_id region_code Enroll_month YYMMn6.;
format Enroll_month YYMMn6.;
datalines;
101 10001 201712
101 10001 201801
101 10001 201802
101 10001 201803
101 10001 201804
101 10001 201812
102 10003 201712
102 10003 201805
102 10003 201806
102 10003 201807
102 10003 201808
102 10003 201809
102 10003 201810
103 10009 201812
;
run;
/* Preparation */
%let start = 201712;
/* List all participants: dataset "list" */
proc sql;
create table list as
select distinct Person_id, region_code
from have;
quit;
/* Retrieve all months and flag observations accordingly: dataset "have_all"*/
/* flag_Enroll_month = set the month of enrollment to 1 */
/* flag_Enroll = set months of participation to 1 */
data list_month;
set list;
format Enroll_month YYMMn6.;
do i=1 to 13;
Enroll_month = intnx('month',input("&start",YYMMn6.),i-1,"b");
output;
end;
drop i;
run;
data have_flag;
set have;
by Person_id region_code Enroll_month;
if first.region_code then flag_Enroll_month = 1;
else flag_Enroll_month = .;
run;
data enroll;
set have_flag;
where flag_Enroll_month=1;
drop flag_Enroll_month;
rename Enroll_month = initial_month;
run;
data have_pre_all;
merge list_month (in=x) have_flag (in=y);
by Person_id region_code Enroll_month;
if y=1 then flag_enroll = 1;
else flag_enroll = 0;
format Enroll_month YYMMn6.;
run;
data have_all;
merge have_pre_all enroll;
by Person_id region_code;
run;
/* Flag New_enrollment */
proc sql;
create table want1 as
select Person_id, region_code,
case when put(Enroll_month,YYMMn6.) = "&start" then 'No'
else 'Yes'
end as New_enrollment
from have_all
where flag_Enroll_month = 1;
run;
/* Flag Total_Elig_Enrolled_Months
Enrolled_all_12_months
Total_missing_months */
proc sql;
create table want2 as
select Person_id, region_code,
sum(flag_enroll) as Total_Elig_Enrolled_Months,
12 - sum(flag_enroll) as Total_missing_months,
case when sum(flag_enroll) < 12 then 'No'
when sum(flag_enroll) = 12 then 'Yes'
end as Enrolled_all_12_months
from have_all
where put(Enroll_month,YYMMn6.) ne "&start"
group by Person_id, region_code;
quit;
/* Flag Flag No.of_times_discontinued */
data have_disc;
set have_all;
by Person_id region_code Enroll_month;
_lag = lag(flag_enroll);
if flag_enroll=0 and _lag=1 then output;
run;
proc sql;
create table want3 as
select Person_id, region_code, count(*) as No_of_times_discontinued
from have_disc
group by Person_id, region_code;
quit;
/* Enroll_Continous_months */
data have_continous;
set have_all (where=(Enroll_month>initial_month));
by Person_id region_code flag_enroll notsorted;
if first.flag_enroll then count = 0;
count + 1;
run;
proc sql;
create table want4 as
select Person_id, region_code, max(count) as Enroll_Continous_months
from have_continous
where flag_enroll=1
group by Person_id, region_code;
quit;
/* Final dataset */
data want;
merge want1 want2 want3 want4;
run;
Hi @ed_sas_member,
Thanks for the quick response and it worked as magic. I just need to find the way to display the discontinued no.of times.
Thanks again!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.