BookmarkSubscribeRSS Feed
c8826024
Calcite | Level 5

I have a dataset like this:

idHospital_dtT1_dateT2_dateT3_dateT4_dateT5_date
13/4/20111/1/20113/2/30113/5/2011
21/1/20111/1/20112/1/2011
32/15/2011
410/1/20108/1/20109/1/2010

I want to create a new variable which will show the data of the first treatment AFTER hospital date. It will look like this:

idHospital_dtT1_dateT2_dateT3_dateT4_dateT5_datefirst_Treatment_dt
13/4/20111/1/20113/2/30113/5/20113/5/2011
21/1/20111/1/20112/1/20111/1/2011
32/15/2011
410/1/20108/1/20109/1/2010
5 REPLIES 5
Tom
Super User Tom
Super User

How about using an array?

options ls=80;

data have;

  infile cards dlm='|' dsd truncover;

  input id  (Hospital_dt T1_date T2_date T3_date T4_date T5_date) (:mmddyy10.);

  format _all_ yymmdd10. id ;

cards;

1|3/4/2011|1/1/2011|3/2/3011|3/5/2011||

2|1/1/2011|1/1/2011|2/1/2011|||

3|2/15/2011|||||

4|10/1/2010|8/1/2010|9/1/2010|

run;

data want;

  set have ;

  array tdate t1_date t2_date t3_date t4_date t5_date ;

  do over tdate ;

    if tdate >= hospital_dt then first_treatment_dt = min(first_treatment_dt,tdate);

  end;

  format first_treatment_dt yymmdd10.;

run;

                                                                               f

                                                                               i

                                                                               r

                                                                               s

                                                                               t

                                                                               _

                                                                               t

             H                                                                 r

             o                                                                 e

             s                                                                 a

             p                                                                 t

             i          T          T          T          T          T          m

             t          1          2          3          4          5          e

             a          _          _          _          _          _          n

             l          d          d          d          d          d          t

O            _          a          a          a          a          a          _

b i          d          t          t          t          t          t          d

s d          t          e          e          e          e          e          t

1 1 2011-03-04 2011-01-01 3011-03-02 2011-03-05          .          . 2011-03-05

2 2 2011-01-01 2011-01-01 2011-02-01          .          .          . 2011-01-01

3 3 2011-02-15          .          .          .          .          .          .

4 4 2010-10-01 2010-08-01 2010-09-01          .          .          .          .

c8826024
Calcite | Level 5

That's quick and neat.

One more question, if I know each location where the patience was treated. But I only need the first location AFTER hospital date:

idHospital_dtT1_dateT2_dateT3_dateT4_dateT5_dateLocation1Location2Location3first_Treatment_dtlocation
13/4/20111/1/20113/2/30113/5/2011abc3/5/2011c
21/1/20111/1/20112/1/2011bd1/1/2011b
32/15/2011
410/1/20108/1/20109/1/2010ab
Linlin
Lapis Lazuli | Level 10

you can add an array for locations to Art's code.

data want (drop=i);

  set have;

  array dates(*) T1_date--T5_date;

  array location(*) location1--location5;

  format first_Treatment_dt mmddyy10.;

  do i=1 to dim(dates);

    if dates(i) ge Hospital_dt then do;

      first_Treatment_dt=dates(i);

            first_location=location(i);

      leave;

    end;

  end;

run;

art297
Opal | Level 21

Linlin, Agreed!  I was about to propose the exact same code, but didn't, as the results didn't come out as expected.  Then I realized that it wasn't working as expected because the second date, in the example data, was 3/2/3011.  Doubt if I'll be alive in 3011, but I couldn't figure out why 3/2 was being selected.

art297
Opal | Level 21

Same suggestion as Tom's, but I like to leave loops once a condition is met.  Conversely, if your data isn't in date order, then I'd definitely go with Tom's suggested code:

data want (drop=i);

  set have;

  array dates(*) T1_date--T5_date;

  format first_Treatment_dt mmddyy10.;

  do i=1 to dim(dates);

    if dates(i) ge Hospital_dt then do;

      first_Treatment_dt=dates(i);

      leave;

    end;

  end;

run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1049 views
  • 0 likes
  • 4 in conversation