BookmarkSubscribeRSS Feed
ftahsin
Obsidian | Level 7

Hello SAS Community,

I am working on a Medicaid dataset that includes a treatment and a control group. To finalize who will be in my treatment group, I need to determine the patients who had continuous enrollment for at least 9 months into an intervention program. I already reconciled the dataset and removed all the duplicated months and years of their enrollment. My reconciling code worked fine, which was-

data Babies_Clean(drop=prior_: temp_: orig_:);
retain orsid begin end;
format Prior_end Orig_begin Orig_end
temp_beg temp_end begin end yymm10.;
set Mcaid_elig_babies(rename=(begin=Orig_begin
end =Orig_end ));
by orsid Orig_begin Orig_end;
if Orig_begin > Orig_end then delete;
Prior_end = lag(Orig_end);
if first.orsid
then do;
temp_beg = Orig_begin;
temp_end = Orig_end;
Prior_end = .;
end;
else if 0 <= (Orig_begin - Prior_end) <= 1
then do;
temp_end = Orig_end;
end;
else if (Orig_begin - temp_end) > 1
then do;
begin = temp_beg;
end = temp_end;
output;
temp_beg = Orig_begin;
temp_end = Orig_end;
end;
else if temp_beg <= Orig_begin <= temp_end
then do;
if Orig_end > temp_end
then temp_end = Orig_end;
end;
if last.orsid then do;
begin = temp_beg;
if Orig_end > temp_end
then end = Orig_end;
else end = temp_end;
output;
end;
run;

Now I want to write the codes that will indicate the patients who were enrolled for at least 9 months into that program separately from those who were enrolled for less than 9 months. I also want to create a new binary variable that will show this indication [i.e., 1 = eligible (had at least 9 months enrollment), 0 = not eligible).

Please help me with this coding.

Thanks a lot!
Farah

6 REPLIES 6
MCoopmans
SAS Employee

Hello Farah,

 

This is going to be difficult without a good idea of the structure of the dataset and the record layout. Could you share a couple of rows with some sample data that could help clarifying your question ?

 

Thanks,

Mathias.

ftahsin
Obsidian | Level 7

Hello Mathias,

 

Thanks for your reply. I have attached a screenshot of my main 3 raws here:

 

ftahsin_1-1596049509307.png

 

I apologize, I have reconciled the dates and deleted the duplicates (i.e., Mar 2009 - July 2017 for the 1st orsid 8760376, instead of keeping it in 2 separate sections). But, unfortunately, my file crashed and I am working on it again.

 

Let me know if this will help with the code for continuous enrollment that I want. For example, the 1st orsid was enrolled from March 2009 - July 2017 (more than 9 months), so he is eligible for my study. Orsid 8749769 was enrolled from November 2008 - May 2009 (less than 9 months, as we can get from the table), so he won't be eligible. There are thousands of ID's in my dataset, so I can't do them manually. I need a code for that and I also want to create a variable that will indicate 1 = eligible and 0 = not, according to equal or more than 9 months and less than that.

 

Thanks again!

Farah

MCoopmans
SAS Employee

Hi Farah,

 

Are you still looking for a solution ? I think this example here would be a good start:

 

data have ;
infile cards;
input orsid begin : date9. end : date9.;
format begin end monyy7.;
cards;
8760376 01MAR2009 01JAN2010
8760376 01JAN2010 01JUL2017
8749769 01NOV2008 01APR2009
8749769 01NOV2008 01MAY2009
8749769 01NOV2008 01MAY2009
;
run;

data want;
 set have ;
 retain prev_begin prev_end;
 by notsorted orsid;
  if not first.orsid then do;
   if begin le prev_end  then begin=prev_begin;
  end;
 length=INTCK('MONTH',begin,end);
 prev_begin=begin;
 prev_end=end;
  format prev_begin prev_end monyy7.;
run;

Here is the resulting dataset:

2020-08-17_220747.png

As you can see in the second row the "begin" date was copied from the "begin" date of the first row since the 2 periods are overlapping for the same orsid. The length variable holds the number of months that are between the begin and end variable. You could add some code to only retain the maximum value of length by orsid. 

 

Does this help ?

Mathias.

 

 

ftahsin
Obsidian | Level 7
Hello Mathias,

Thanks a lot for your help! Yes, that will be a good start. I am applying your code by modifying it a bit according to my whole dataset. I will let you know if that works and if I come up with any other concerns.

Thanks again!
Farah
mkeintz
PROC Star

What do you want the output to look like?

 

  1. One record per spell, with beginning and ending dates of each continuous spell (and maybe a new variable with length-of-spell)?  If a person has two spells, they could possibly have one of 9 months or more, and one of less-than 9 months.
  2. One record per id, where a person ends up in the desired file once if they have ANY spells of 9 months or more.  If so, what date information would you want, from the multiple candidate spells?
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ftahsin
Obsidian | Level 7
Hello mkeintz,

My expected output will be the 1st option that you mentioned. One record per spell, with beginning and ending dates of each continuous spell.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2842 views
  • 2 likes
  • 3 in conversation