BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasgorilla
Pyrite | Level 9

I'm having trouble figuring out how to create two new variables that depend on the most recent date. Let's say I have ID, dx (diagnosis), a prescribed med, and a start date. 

data have;
input id dx $ med $ startdate :date9.;
format startdate date9.;
datalines;
1 a a 01JAN2020 
1 a a 04APR2020
1 a a 21APR2020
1 a a 01JUL2020
2 a a 01FEB2020
2 a b 01JUL2020
2 a a 07JUL2020
3 b b 01JAN2020
3 b a 04APR2020
3 b a 01MAY2020
3 b a 15JUN2020
3 b a 01SEP2020
3 b b 02SEP2020
;
RUN;

For any id, when there is > 60 day gap from the preceding date, given the same dx and med, I want to call this is a "new" prescription. For the same dx and med <60 days from the most recent date I want to call it a "refill". 

data want;
input id dx $ med $ startdate :date9. refill $ new $;
format startdate date9.;
datalines;
1 a a 01JAN2020 0 0
1 a a 04APR2020 0 1
1 a a 21APR2020 1 0
1 a a 01JUL2020 0 1
2 a a 01FEB2020 0 0
2 a a 07JUL2020 0 1
2 a b 01JUL2020 0 0
3 b a 04APR2020 0 0
3 b a 01MAY2020 1 0
3 b a 15JUN2020 1 0
3 b a 01SEP2020 0 1
3 b b 01JAN2020 0 0
3 b b 02SEP2020 0 1
;
RUN;

The code I've been trying to use is below. 

proc sort data=have out=havesort;
by id dx med startdate;
run;
data want;
	set havesort;
	by id dx med;
	IF FIRST.med THEN DO;
	TEMP=startdate;
	refill=0;
	new=0;
	END;
	RETAIN TEMP refill new;
	IF startdate>TEMP + 60 THEN new+1; 
	IF TEMP<startdate<=TEMP + 60 then refill+1;
	TEMP=startdate;
	drop temp;
run;

However, as you can see, this doesn't reset the 60 day counter to the most recent startdate. How can I do that to get what I want? 

 

Let me know if that does not make sense. Thank you always for the help!

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input id dx $ med $ startdate :date9.;
format startdate date9.;
datalines;
1 a a 01JAN2020 
1 a a 04APR2020
1 a a 21APR2020
1 a a 01JUL2020
2 a a 01FEB2020
2 a b 01JUL2020
2 a a 07JUL2020
3 b b 01JAN2020
3 b a 04APR2020
3 b a 01MAY2020
3 b a 15JUN2020
3 b a 01SEP2020
3 b b 02SEP2020
;
RUN;
proc sort data=have;by id dx med startdate;run;
data want;
 set have;
 by id dx med;
 refill=0;new=0;
 if not first.med and .<dif(startdate)<60 then refill=1;
 if not first.med and   dif(startdate)>60 then new=1;
run;

View solution in original post

9 REPLIES 9
quickbluefish
Barite | Level 11
proc sort data=have; by id dx; run;

data want;
set have;
by id dx;
length lastdt 4 refill new 3;
format lastdt date9.;
retain lastdt;
if first.dx then lastdt=.;
if startdate-lastdt>60 then new=1;
else if lastdt>. then refill=1;
lastdt=startdate;
run;

proc print data=want; run;
sasgorilla
Pyrite | Level 9

Thanks @quickbluefish ! I hadn't even thought to make a separate date column. 

 

So, this worked with a subtle correction. Because the med type can vary and that is relevant to whether it was a new or refill for that particular med, I used the following to get what I needed: 

proc sort data=have out=havesort;
	by id dx med;
run;

data want;
set havesort;
by id dx med;
length lastdt 4 refill new 3;
format lastdt date9.;
retain lastdt;
if first.med then lastdt=.;
if startdate-lastdt>60 then new=1;
else if lastdt>. then refill=1;
lastdt=startdate;
run;

proc print data=want;
run;
quickbluefish
Barite | Level 11
Oh, missed that part! Glad you got it to work. Yeah, so many of these kinds of problems that seem tricky are suddenly very simple once the data are sorted appropriately.
Ksharp
Super User
data have;
input id dx $ med $ startdate :date9.;
format startdate date9.;
datalines;
1 a a 01JAN2020 
1 a a 04APR2020
1 a a 21APR2020
1 a a 01JUL2020
2 a a 01FEB2020
2 a b 01JUL2020
2 a a 07JUL2020
3 b b 01JAN2020
3 b a 04APR2020
3 b a 01MAY2020
3 b a 15JUN2020
3 b a 01SEP2020
3 b b 02SEP2020
;
RUN;
proc sort data=have;by id dx med startdate;run;
data want;
 set have;
 by id dx med;
 refill=0;new=0;
 if not first.med and .<dif(startdate)<60 then refill=1;
 if not first.med and   dif(startdate)>60 then new=1;
run;
sasgorilla
Pyrite | Level 9

Thank you @Ksharp . I think this is the cleanest solution! I added "<=60" for the refill =1 and it got me what I wanted. 

 

Question, about the dif function, as I'm not familiar with that... In your code I'm assuming the difference is being taken between the preceding startdate and the current startdate because of the "if not first.med" statement. Is that correct? 

if not first.med and .<dif(startdate)<60 then refill=1;

 

Ksharp
Super User
dif()
is the same as
startdate-lag(startdate)
For example:
date dif
01jan2020 .
04jan2020 3
06jan2020 2
05jan2020 -1

Here I used "if not first.med" to avoid to assign refill=1 and new=1 when it is the first obs of each group.
a.k.a
refill and new will always get 0 for the first obs of each group .

mkeintz
PROC Star
proc sort data=have out=need;
  by id dx med startdate;
run;

data want;
  set need;
  by id dx med;
  refill=(first.med=0 and lag(startdate)>=startdate-60);
  new=(first.med=0 and lag(startdate)<startdate-60);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
sasgorilla
Pyrite | Level 9

This is another good option, thank you! This one doesn't feel as intuitive to me due to the "lag" function. I'll have to get more comfortable with it. 

I did find the SAS documentation has a good example here

Astounding
PROC Star
Whichever function you choose (dif or lag), make sure it executes on every observation. If you skip observations where first.med is 0, the next observation will have an incorrect calculation.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2106 views
  • 5 likes
  • 5 in conversation