BookmarkSubscribeRSS Feed
casmcfarland
Calcite | Level 5

I am trying to write code that pulls in data related to a program where kids receive services. If the time between services is 30 days or less than it is a follow-up, but if it is over 30 days it is a re-referral. I am trying to keep only the first date of referral and re-referral dates and not any of the follow-up dates.

Here are my data:

 

data example;
input id date date9. ;
datalines;
JS6749 15mar2019
JS6749 15Jun2020
JS6749 01Jul2020
JS6749 15Jul2020
JS6749 01Aug2020
JS6749 30Jan2021
JS4524 15May2020
JS4524 30May2020
JS4524 01Jun2020
JS4524 10Jun2020
;
run;

 

/*code I have so far that does not work to keep the correct dates*/
data fixcode (keep=id date numdays newdate);
set example;
if id = lag(patient_id_number)then do
numdays=date-lag(date);
if numdays>30 then newdate=date; else newdate=lag(date);
end;
if id ne lag(id) then newdate=date;
format newdate mmddyy10.;
run;

 

/*What I want the date to look like when I am done*/

IDdatenumdaysnewdate
JS674915-Mar-2019 3/15/2019
JS674915-Jun-20204586/15/2020
JS67491-Jul-2020166/15/2020
JS674915-Jul-2020146/15/2020
JS67491-Aug-2020176/15/2020
JS674930-Jan-20211821/30/2021
JS452415-May-2020 5/15/2020
JS452430-May-2020155/15/2020
JS45241-Jun-202025/15/2020
JS452410-Jun-202095/15/2020
1 REPLY 1
ballardw
Super User

Two issues, first minor: your ID variable in the Example data needs to be read as character.

Second, and a bit more worrisome is your lag(patient_id_number) . Your example data does not have a "patient_id_number" variable so your condition is never going to be true. Also, LAG after an IF seldom works as expected because of the queue nature of the Lag and Dif functions.

 

This does what I think you are asking:

data example;
input id $ date date9. ;
format date date9.;
datalines;
JS6749 15mar2019
JS6749 15Jun2020
JS6749 01Jul2020
JS6749 15Jul2020
JS6749 01Aug2020
JS6749 30Jan2021
JS4524 15May2020
JS4524 30May2020
JS4524 01Jun2020
JS4524 10Jun2020
;
run;

data want;
   set example;
   by notsorted id;
   retain newdate;
   numdays= dif(date);
   if first.id then do;
      numdays=.;
      newdate= date;
   end;
   format newdate mmddyy10.;
   if numdays>30 then newdate=date;
run; 

When you use BY statement then SAS supplies automatic variables that indicate whether the current record is the first or last of the group. These are accessed using FIRST.variable and LAST.variable, they are numeric 1/0 which SAS uses for true/false so you can do things conditionally. The BY statement requires data to be sorted by default, your example data wasn't so the NOTSORTED option works with the example and expects data to be grouped by ID.

RETAIN will keep the values of a variable across the data step boundary. So you don't need all the If Lag values to reset.

 


@casmcfarland wrote:

I am trying to write code that pulls in data related to a program where kids receive services. If the time between services is 30 days or less than it is a follow-up, but if it is over 30 days it is a re-referral. I am trying to keep only the first date of referral and re-referral dates and not any of the follow-up dates.

Here are my data:

 

data example;
input id date date9. ;
datalines;
JS6749 15mar2019
JS6749 15Jun2020
JS6749 01Jul2020
JS6749 15Jul2020
JS6749 01Aug2020
JS6749 30Jan2021
JS4524 15May2020
JS4524 30May2020
JS4524 01Jun2020
JS4524 10Jun2020
;
run;

 

/*code I have so far that does not work to keep the correct dates*/
data fixcode (keep=id date numdays newdate);
set example;
if id = lag(patient_id_number)then do
numdays=date-lag(date);
if numdays>30 then newdate=date; else newdate=lag(date);
end;
if id ne lag(id) then newdate=date;
format newdate mmddyy10.;
run;

 

/*What I want the date to look like when I am done*/

ID date numdays newdate
JS6749 15-Mar-2019   3/15/2019
JS6749 15-Jun-2020 458 6/15/2020
JS6749 1-Jul-2020 16 6/15/2020
JS6749 15-Jul-2020 14 6/15/2020
JS6749 1-Aug-2020 17 6/15/2020
JS6749 30-Jan-2021 182 1/30/2021
JS4524 15-May-2020   5/15/2020
JS4524 30-May-2020 15 5/15/2020
JS4524 1-Jun-2020 2 5/15/2020
JS4524 10-Jun-2020 9 5/15/2020

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 380 views
  • 0 likes
  • 2 in conversation