- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have insurance claims data, and I've identified cases that have had a particular medical event. Now I'm trying to make sure that these cases were continuously enrolled 3 months prior to and 2 months after that event. In the following example data, let's say the index date for 1234 was Feb 15 2019. That case would qualify. But if the index date for 5678 was Jan 20 2022, that case would not qualify.
memberid monthid 1234 201811
1234 201812 1234 201901 1234 201902
1234 201903
1234 201904 5678 202110 5678 202111 5678 202201
5678 202202
5678 202203
I searched and found some potentially useful code, but this is set up to find cases that have not had a gap in the last 12 months. So how do I modify this to work with the index date and to search 3 months back and 2 months after?
data want;
set have;
by memberID monthID;
if first.memberID then counter=0;
if dif(monthID)>1 and mod(monthID,100) ne 1 then counter=0;
if mod(monthID,100) eq 1 and dif(monthID) ne 89 then counter=0;
counter+1;
if counter ge 12 then output;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There's no Index date in your data.
Can we assume that the monthid values are continuous and never skip a month? Can we assume that the monthid values are numeric? (Or are they character?)
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I should have clarified. Right now the index date is in another file, and I plan to link them by memberid. There's a character version called service_date in the format 20191015 (Oct 15 2019), as well as a numeric version called service_date_d that contains the number of days since Jan 1 1960 (22,568 represents Oct 15 2019).
The monthid is character. The months of coverage are independent of each other, so a case may be enrolled in 201910, skip 201911, and be re-enrolled in 201912.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I've updated the data file so that it look like this:
memberid monthid service_date service_date_d 1234 201811 20190215 21595 1234 201812 20190215 21595 1234 201901 20190215 21595 1234 201902 20190215 21595 1234 201903 20190215 21595 1234 201904 20190215 21595 5678 202110 20220122 22665 5678 202111 20220122 22665 5678 202201 20220122 22665 5678 202202 20220122 22665 5678 202203 20220122 22665
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And what do you want your output dataset to look like?
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mkeintz wrote:
And what do you want your output dataset to look like?
The final file should have a single record for each member_ID and a flag variable (1=yes, 0=no) for continuous coverage around the timeframe of the index event.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Borrowing an idea from one of claims vendors (LABRX??) why not convert your coverage information into a string where each character represents a month.
data have ;
input memberid $ monthid $;
cards;
1234 201811
1234 201812
1234 201901
1234 201902
1234 201903
1234 201904
5678 202110
5678 202111
5678 202201
5678 202202
5678 202203
;
%let basemonth='01JAN2018'd;
%let nmonths=%eval(5*12);
data coverage;
do until (last.memberid);
set have;
by memberid;
length coverage $&nmonths ;
if first.memberid then coverage=repeat('.',&nmonths-1);
offset=intck('month',&basemonth,servicedate);
substr(coverage,offset+1,1)='X';
end;
keep memberid coverage;
run;
Now the test for continuous coverage is just a string comparison operation. Use the INDEX date to calculate what part of teh string to look at.
data want;
merge coverage service ;
by memberid;
offset=intck('month',&basemonth,input(monthid,yymmn6.));
covered = ( 'XXXXX' = substrn(coverage,offset-2,5) );
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The other way is to convert your coverage into intervals. I am sure this has been asked before here how to collapse overlapping of contiguous intervals. Or you could just use the array approach implied by the string approach and then roll it out as a series of intervals.
Then you just need to join by the memberid and whether the index date falls into the coverage interval.
proc sql;
create table want as
select a.*
, b.start
, b.end
, intnx('month',b.start,a.index) as months_pre
, intnx('month',a.index,b.end) as months_post
from service a
left join coverge b
on a.memberid = b.memberid
and a.index between b.start and b.end
;
quit;