I've come across this task using two dataset, I've simplified the issues in the two input tables:
What I want to figure out is:
a. # of pregnancy episodes of women whose pregnancy start and end date both in 2015 and who also has no more than 45 days continuous enrollment gap in 2015(one single gap > 45 days counts as a gap, not accumulated gap)
b. # of pregnancy episodes of women whose has insurance from the 30 days prior to their pregnancy end towards their pregnancy end, and whose pregnancy end date is in year 2015, same 45 day gap rule.
First table has pregnancy start and end date (pregnancy_ID = 2 means this is this women's 2nd pregnancy episode)
PersonID | Pregnancy_start | Pregnancy_end | Pregnancy_ID |
1 | 4/16/2014 | 1/31/2015 | 1 |
1 | 6/1/2015 | 12/21/2015 | 2 |
2 | 2/12/2015 | 11/31/2015 | 1 |
3 | 1/1/2015 | 8/16/2015 | 1 |
4 | 3/1/2015 | 11/23/2015 | 1 |
5 | 2/3/2015 | 6/7/2015 | 1 |
5 | 10/12/2015 | 7/31/2016 | 2 |
6 | 4/1/2015 | 12/19/2015 | 1 |
second table has insurance start and end date
PersonID | Insurance_start | Insurance_end |
1 | 1/1/2014 | 1/6/2014 |
1 | 9/6/2014 | 5/3/2015 |
1 | 5/3/2015 | 1/1/2017 |
2 | 1/1/2014 | 10/1/2015 |
3 | 1/1/2014 | 12/31/2016 |
4 | 5/1/2014 | 3/6/2015 |
4 | 5/1/2015 | 12/31/2016 |
5 | 1/1/2014 | 1/1/2017 |
6 | 5/1/2014 | 11/31/2015 |
For task a, I can easily identify the first part, whose pregnancy start and end both in 2015, but the second part gets tricky. As shown in the second table, a women may have multiple insurance enrollment entries (they enrolled and dropped and re-enrolled), and merging both tables create a multiple to multiple merge, kind complex the problem....
For task b, it's easy to identify if they have insurance at the exact day of the 30 days prior, but I can't find a way to ensure the entire period is continuous enrolled.....
Thank you!
Please provide the expected output (and double check it is correct).
thank you for asking @ChrisNZ, the desired output is just two data point
- # of women in task a.
- # of women in task b.
The reason why I didn't list a desired table is that I am not sure what is the best way to count it, to merge two datasets and then do it or satisfy the conditions separately in each dataset and then merge or never merge. The pregnancy dataset has more than 5000 observations and the enrollment dataset has something like 6000 (because one person can have multiple rows/multiple enrollment and drop enrollment), thus I feel a many to many merge will complex the problem.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.