BookmarkSubscribeRSS Feed
Wolverine
Quartz | Level 8

I have insurance claims data, and I've identified cases that have had a particular medical event.  Now I'm trying to make sure that these cases were continuously enrolled 3 months prior to and 2 months after that event. In the following example data, let's say the index date for 1234 was Feb 15 2019.  That case would qualify.  But if the index date for 5678 was Jan 20 2022, that case would not qualify.

memberid  monthid
1234      201811
1234 201812 1234 201901 1234 201902
1234 201903
1234 201904 5678 202110 5678 202111 5678 202201
5678 202202
5678 202203

I searched and found some potentially useful code, but this is set up to find cases that have not had a gap in the last 12 months.  So how do I modify this to work with the index date and to search 3 months back and 2 months after?

 

data want;
 set have;
 by memberID monthID;
 if first.memberID then counter=0;
 if dif(monthID)>1 and mod(monthID,100) ne 1 then counter=0;
 if mod(monthID,100) eq 1 and dif(monthID) ne 89 then counter=0;
 counter+1;
 if counter ge 12 then output;
run;
8 REPLIES 8
PaigeMiller
Diamond | Level 26

There's no Index date in your data.

 

Can we assume that the monthid values are continuous and never skip a month? Can we assume that the monthid values are numeric? (Or are they character?) 

 

--
Paige Miller
Wolverine
Quartz | Level 8

Sorry, I should have clarified.  Right now the index date is in another file, and I plan to link them by memberid.  There's a character version called service_date in the format 20191015 (Oct 15 2019), as well as a numeric version called service_date_d that contains the number of days since Jan 1 1960 (22,568 represents Oct 15 2019).

 

The monthid is character.  The months of coverage are independent of each other, so a case may be enrolled in 201910, skip 201911, and be re-enrolled in 201912.

Wolverine
Quartz | Level 8

I've updated the data file so that it look like this:

 

 

memberid  monthid service_date   service_date_d
1234      201811   20190215      21595
1234      201812   20190215      21595
1234      201901   20190215      21595
1234      201902   20190215      21595
1234      201903   20190215      21595
1234      201904   20190215      21595
5678      202110   20220122      22665
5678      202111   20220122      22665
5678      202201   20220122      22665
5678      202202   20220122      22665
5678      202203   20220122      22665

 

 

mkeintz
PROC Star

And what do you want your output dataset to look like?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Wolverine
Quartz | Level 8

@mkeintz wrote:

And what do you want your output dataset to look like?


The final file should have a single record for each member_ID and a flag variable (1=yes, 0=no) for continuous coverage around the timeframe of the index event.

Tom
Super User Tom
Super User

Borrowing an idea from one of claims vendors (LABRX??) why not convert your coverage information into a string where each character represents a month.

data have ;
  input memberid $  monthid $;
cards;
1234      201811
1234      201812
1234      201901
1234      201902
1234      201903
1234      201904
5678      202110
5678      202111
5678      202201
5678      202202
5678      202203
;


%let basemonth='01JAN2018'd;
%let nmonths=%eval(5*12);

data coverage;
  do until (last.memberid);
    set have;
    by memberid;
    length coverage $&nmonths ;
    if first.memberid then coverage=repeat('.',&nmonths-1);
    offset=intck('month',&basemonth,servicedate);
    substr(coverage,offset+1,1)='X';
  end;
  keep memberid coverage;
run;

Tom_0-1670555519238.png

Now the test for continuous coverage is just a string comparison operation.  Use the INDEX date to calculate what part of teh string to look at.

data want;
   merge coverage service ;
   by memberid;
   offset=intck('month',&basemonth,input(monthid,yymmn6.));
   covered = ( 'XXXXX' = substrn(coverage,offset-2,5) );
run;
Wolverine
Quartz | Level 8
My biggest concern with this approach is that the coverage variable is not human-readable -- I can't easily tell which month/year is covered and which are not. If there was some issue where everything was shifted by a month, I probably wouldn't be able to see it.
Tom
Super User Tom
Super User

The other way is to convert your coverage into intervals.  I am sure this has been asked before here how to collapse overlapping of contiguous intervals.  Or you could just use the array approach implied by the string approach and then roll it out as a series of intervals.

 

Then you just need to join by the memberid and whether the index date falls into the coverage interval.

proc sql;
create table want as
select a.*
     , b.start
     , b.end
     , intnx('month',b.start,a.index) as months_pre
     , intnx('month',a.index,b.end) as months_post
from service a
left join coverge b
on a.memberid = b.memberid
and a.index between b.start and b.end 
;
quit;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 651 views
  • 0 likes
  • 4 in conversation