I've identified birth events in the data, and now I need to see if the mom has continuous enrollment for insurance coverage for the subsequent 12 months (so for a birth in June 2017, she would have to have continuous enrollment until May 2018). The data looks like this:
data test;
input ID deliv_date YEAR_of_BIRTH YEAR_of_ENROLL ENRMON1 ENRMON2 ENRMON3 ENRMON4 ENRMON5 ENRMON6 ENRMON7 ENRMON8 ENRMON9 ENRMON10 ENRMON11 ENRMON12;
datalines;
1234 06/01/2017 2017 2017 1 1 1 1 1 1 1 1 1 1 1 1
1234 06/01/2017 2017 2018 1 1 1 1 1 1 1 1 1 1 1 1
1234 06/01/2017 2017 2019 1 1 1 1 1 1 1 1 1 1 1 1
2345 08/15/2018 2018 2017 1 1 1 1 1 1 1 1 1 1 1 1
2345 08/15/2018 2018 2018 1 1 1 1 1 1 1 1 1 1 1 1
2345 08/15/2018 2018 2019 0 1 1 1 1 1 1 1 1 1 1 1
;
In this example, ID 1234 has a baby on June 1 2017 and has continuous coverage for 12+ months. But ID 2345 does NOT have continuous coverage due to the gap in Jan 2019.
I would like to create a flag for each case: contin_enroll = 1 if the case meets the criteria, or 0 if the case does not. I have some ideas on how to do this using a bunch of IF/THEN statements, but it would be clunky and tedious to program. I'm hoping for guidance on a better solution.
One way that does what I understand:
data test; input ID deliv_date:mmddyy10. YEAR_of_BIRTH YEAR_of_ENROLL ENRMON1 ENRMON2 ENRMON3 ENRMON4 ENRMON5 ENRMON6 ENRMON7 ENRMON8 ENRMON9 ENRMON10 ENRMON11 ENRMON12; format deliv_date mmddyy10.; datalines; 1234 06/01/2017 2017 2017 1 1 1 1 1 1 1 1 1 1 1 1 1234 06/01/2017 2017 2018 1 1 1 1 1 1 1 1 1 1 1 1 1234 06/01/2017 2017 2019 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2017 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2018 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2019 0 1 1 1 1 1 1 1 1 1 1 1 ; /* get dates for the enrollments and restructure*/ data useful; set test; array m (*) enrmon: ; do i=1 to dim(m); testdate= mdy(i,1,year_of_enroll); Enrollcode = m[i]; output; end; format testdate mmddyy10.; keep id -- YEAR_of_ENROLL testdate enrollcode; run; data setflags; set useful; if (intck('month',deliv_date,testdate) le 12) then Cont_flag = enrollcode; run; proc summary data=setflags nway; class id deliv_date YEAR_of_BIRTH; var Cont_flag; output out=want (drop=_: ) min= ; run;
The Useful set could be used with different logic to look at other insured dates.
The Setflags could be done with one step but since the Useful set might be wanted as a start for something else I think it may be worth having.
Summary, with the Cont_flags conditionally set only for the period following birth, then lets you get the minimum value. If all enrolled the flag is 1.
You data step fails because you do not specify an Informat for the Deliv_date that will read the date value. It is not a simple number so it requires an informat. Is your Deliv_date an actual date value, numeric with a format like MMYYDD10. assigned or is it character.
Just what does "ENRMON1" mean? Does a value of 1 mean that the mom was enrolled for the month of January? Or for the first month after birth?
Where does the flag go? What, ideally, does the output data look like?
One might ask why you bother to have enrollment values for a year prior to birth to the child if the important part is "12 months after birth". No flag makes sense for any observation where the Year_of_Enroll is less than the Year_of_birth, or if the Year_of_enroll is 2 or more years after the Year_of_birth.
@ballardw wrote:
You data step fails because you do not specify an Informat for the Deliv_date that will read the date value. It is not a simple number so it requires an informat. Is your Deliv_date an actual date value, numeric with a format like MMYYDD10. assigned or is it character.
Just what does "ENRMON1" mean? Does a value of 1 mean that the mom was enrolled for the month of January? Or for the first month after birth?
Where does the flag go? What, ideally, does the output data look like?
One might ask why you bother to have enrollment values for a year prior to birth to the child if the important part is "12 months after birth". No flag makes sense for any observation where the Year_of_Enroll is less than the Year_of_birth, or if the Year_of_enroll is 2 or more years after the Year_of_birth.
deliv_date is an actual date value, with the format MMYYDD10. ENRMON1 indicates coverage for January in the year of enrollment. ENRMON2 is for February, ENRMON3 for March, etc. 1=enrollment for that month, 0=no enrollment.
The final output file could just be a single line for each ID, with a value of 0 or 1 for the contin_enroll flag.
The enrollment flags for years prior to birth or more than 12 months after birth are simply what's in the data, and I didn't see any reason to delete them. And it's possible, as this project develops, that we might want to see what kind of prenatal care the mother received. In that case, we would need to know if she was enrolled during pregnancy.
One way that does what I understand:
data test; input ID deliv_date:mmddyy10. YEAR_of_BIRTH YEAR_of_ENROLL ENRMON1 ENRMON2 ENRMON3 ENRMON4 ENRMON5 ENRMON6 ENRMON7 ENRMON8 ENRMON9 ENRMON10 ENRMON11 ENRMON12; format deliv_date mmddyy10.; datalines; 1234 06/01/2017 2017 2017 1 1 1 1 1 1 1 1 1 1 1 1 1234 06/01/2017 2017 2018 1 1 1 1 1 1 1 1 1 1 1 1 1234 06/01/2017 2017 2019 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2017 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2018 1 1 1 1 1 1 1 1 1 1 1 1 2345 08/15/2018 2018 2019 0 1 1 1 1 1 1 1 1 1 1 1 ; /* get dates for the enrollments and restructure*/ data useful; set test; array m (*) enrmon: ; do i=1 to dim(m); testdate= mdy(i,1,year_of_enroll); Enrollcode = m[i]; output; end; format testdate mmddyy10.; keep id -- YEAR_of_ENROLL testdate enrollcode; run; data setflags; set useful; if (intck('month',deliv_date,testdate) le 12) then Cont_flag = enrollcode; run; proc summary data=setflags nway; class id deliv_date YEAR_of_BIRTH; var Cont_flag; output out=want (drop=_: ) min= ; run;
The Useful set could be used with different logic to look at other insured dates.
The Setflags could be done with one step but since the Useful set might be wanted as a start for something else I think it may be worth having.
Summary, with the Cont_flags conditionally set only for the period following birth, then lets you get the minimum value. If all enrolled the flag is 1.
It works! And it's a much better solution than all the IF-THEN statements I was thinking about. Thank you! 😁
@Wolverine wrote:
It works! And it's a much better solution than all the IF-THEN statements I was thinking about. Thank you! 😁
You will often find that wide data is more difficult to work with in SAS than long.
When a question relates to date, time or datetime intervals or ranges of values one of the first things to look for is getting actual date, time or datetime values to work with rules involving the interval or duration calculations.
Another approach, that I am not going to code, is once you have the Useful data set is to use BY group processing based on the Id and deliv_date values, assume the flag variable is 1 until a 0 is encountered to set the flag as 0. Which would be one approach if you needed a "when was continuous broken" as well as if you needed "when enrollment restarted".
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.