BookmarkSubscribeRSS Feed
mariko5797
Pyrite | Level 9

I want to take the first and last date of a subject that meets a specific criteria. So where VISIT contains "PHASE #1", I want to store the first start date and time as TRTSDT and TRTSTM, respectively and the last end date and time as TRTEDT and TRTETM, respectively. Then I want to repeat the process for the remaining phases. Is there a way to do this without having to subset?

 

I essentially want something along the lines of: 

data want;
 set have;
 by ID;
	if findw(VISIT, 'PHASE#1') > 0 then do; 
		if 	TRTSDT = datepart(first.STDTC); TRTSTM = timepart(first.STDTC); 
			TRTEDT = datepart(last.ENDTC); TRTETM = timepart(last.ENDTC);
		end;
run;

I have also tried using 'if first.ID then TRTSDT = datepart(first.STDTC)...' but it seemed to only look at first ID and last ID within the whole dataset (e.g. PHASE #1 DAY1 and PHASE#3 DAY2 for subject 1)

 

mariko5797_0-1633634631630.png

mariko5797_1-1633634652259.png

 

 

4 REPLIES 4
PaigeMiller
Diamond | Level 26

How about this:

 

 

if first.stdtc then do;
    TRTSDT = datepart(STDTC); 
    trtstm = timepart(STDTC); 
end;

with an optional

 

retain trtsdt trstm;

depending on what you are trying to do, which isn't 100% clear to me.

 

--
Paige Miller
snoopy369
Barite | Level 11

Remember that the data step is row-by-row always, it (normally) can't seek about the dataset randomly.  (There is a way to do that, but it's nearly always the wrong approach - it's very slow and loses a bunch of functionality.)

 

What `first.var` does is it tells you whether the row you're on is *different from the previous row*.  That's it!  And what `last.var` does is it tells you if the row you're on is *different from the next row*.  (Yes, SAS does sort of peek at the next row for this, it's the exception to the rule.)  Mentally, in english, it tells you if you're on the "first" row of a sequence of that ID.  It just returns 1 or 0 - true or false.

 

Unfortunately, it only works with the `by` variables - so we need to make your data work for that.  We do need to subset the data, but we don't need to do it in a separate step - we can do it on the fly.

 

1.  Let's subset your dataset while we do this, using where to only include the Phase1 folks.

  where findw(VISIT, 'PHASE#1') > 0;

2.  Let's retain the start variables, because we'll need them to hang around until the last row.

retain trtsdt trtstm;

3.  Let's save aside the trtsdt and trtstm when we are on a first.id row.

if first.id then do; trtsdt = datepart(stdtc); trtstm = timepart(stdtc); end;

4.  Let's then save the trtedt/trtetm when we're on a last.id row, and output that row.

if last.id then do; trtedt = datepart(stdtc); trtetm = timepart(stdtc); output; end;

 

All together we have:

 

data want;
  set have;
  where findw(VISIT, 'PHASE#1') > 0;
  by id;
  retain trtsdt trtstm;
  if first.id then do;
    trtsdt = datepart(stdtc);
    trtstm = timepart(stdtc);
  end;
  if last.id then do;
    trtedt = datepart(stdtc);
    trtetm = timepart(stdtc);
    output;
  end;
keep id trtsdt trtstm trtedt trtetm; run;

 

 

Tom
Super User Tom
Super User

Taking the DATEPART() or TIMEPART() of a binary variable is just silly.  The DATEPART is always going to by zero, which is '01JAN1960'd.  And the TIMEPART() is either going to be midnight or one second after midnight.

 

If you want to find the min/max datetime value per PHASE then you need to have a variable that indicates the PHASE.

data phase;
  set have;
  phase = scan(visit,2,' ');
run;

data want;
 set phase ;
 by ID phase;
 if first.phase then start_dt=stddtc;
 retain start_dt;
 if last.phase;
 end_dt=enddtc;
 keep id phase start_dt end_dt;
run;
PaigeMiller
Diamond | Level 26
proc summary data=have nway;
    class id;
    var stdtc endtc;
    output out=_minmax_(drop=_:) min(stdtc)=min_stdtc max(etdtc)=max_etdtc;
run;
data want;
    merge have _minmax_;
    by id;
    trtsdt=datepart(min_stdtc);
    trtstm=timepart(min_stdtc);
    trtedt=datepart(max_etdtc);
    trtetm=timepart(max_etdtc);
    drop min_stdtc max_etdtc;
run;
    
--
Paige Miller

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1023 views
  • 0 likes
  • 4 in conversation