BookmarkSubscribeRSS Feed
Jeff_DOC
Pyrite | Level 9

Good afternoon.

 

I need a data step to determine the number of days between succeeding rows based on a group (see the example). I've tried to use RETAIN to achieve this, but I haven't been able to make any progress at all.

 

Could someone point me in the correct direction to perform this? I don't have an example since I haven't gotten anything to work.

 

Thank you very much to anyone willing to help.

 

KEY REVIEW_DATE REVIEW_TYPE Days Comment
380 3/31/2023 INIT 0 All INIT with the same KEY begin with zero
380 9/6/2023 6MONTH 159 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380 3/6/2024 6MONTH 182 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380 9/18/2024 6MONTH 196 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380 4/4/2025 6MONTH 198 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
381 2/22/2024 INIT 0 All INIT with the same KEY begin with zero
381 6/30/2024 6MONTH 129 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
381 7/1/2025 6MONTH 366 Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
3 REPLIES 3
SASKiwi
PROC Star

Here's one way of doing this - using RETAIN requires use of a variable not being read in:

data Reviews;
  input @1 key $3. @6 Review_Date mmddyy10.;
  format Review_Date date9.;
  datalines;
380   3/31/2023
380	  9/6/2023
380	  3/6/2024
380	  9/18/2024
380	  4/4/2025
381	  2/22/2024
381	  6/30/2024
381	  7/1/2025
;
run;

data Reviews_Want;
  keep key Review_Date Days;
  set Reviews;
  by key;
  if first.key then Days = 0;
  else Days = Review_Date - Last_Review_Date;
  output;
  retain Last_Review_Date;
  Last_Review_Date = Review_Date;
run;

 

Tom
Super User Tom
Super User

First let's convert your listing into an actual SAS dataset so we have something to program with.

 

data have;
  infile cards dsd dlm='|' truncover ;
  input KEY REVIEW_DATE :mmddyy. REVIEW_TYPE $ Days Comment :$100. ;
  format review_date yymmdd10.;
cards;
380|3/31/2023|INIT|0|All INIT with the same KEY begin with zero
380|9/6/2023|6MONTH|159|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380|3/6/2024|6MONTH|182|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380|9/18/2024|6MONTH|196|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
380|4/4/2025|6MONTH|198|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
381|2/22/2024|INIT|0|All INIT with the same KEY begin with zero
381|6/30/2024|6MONTH|129|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
381|7/1/2025|6MONTH|366|Subsequent rows with the same key determine the number of days from the preceding row REVIEW_DATE.
;

So it looks like from they values you show of DAYS that you don't actually want to RETAIN anything.  Instead you just want to use the PREVIOUS value, which you can so with the LAG() function.  Make not to execute LAG() conditionally, it can only return values that you previously passed it.  Also make sure to remember to ignore/replace the values calculated for the FIRST observation in a group since the previous date value is from the previous group.

 

 

data want;
  set have;
  by key review_date;
  new_days = review_date - lag(review_date);
  if first.key then new_days=0;
run;

Results

 

Tom_0-1762823682247.png

You would want to use the RETAIN statement if your goal was instead to calculate the difference since the INIT record's date (usually called a BASELINE date).

data want ;
  set have;
  by key review_date ;
  retain baseline ;
  format baseline yymmdd10.;
  if first.key then call missing(baseline);
  if review_type='INIT' then baseline=review_date ;
  new_days = review_date - baseline;
run;

Tom_1-1762823804580.png

 

 

 

ballardw
Super User

Don't forget the DIF function:

 new_days = review_date - lag(review_date);

is the same as

 new_days = dif(review_date);

Same concern as LAG for conditional operation.

 

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 147 views
  • 0 likes
  • 4 in conversation