BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sotarkadin
Calcite | Level 5

I currently have a data set that has columns with different dates for when actions have occured.  I need to create a column with the stauts based on which column is the last to have a date.

 

Example Data:

 

OBS     Account #     open_dt       act1_dt       act2_dt      act3_dt       act4_dt    close_dt   

      1          123456           3/1/16

      2           23456           3/1/16        3/3/16

      3           34567           3/13/16     3/15/16      3/17/16       3/18/16                        3/20/16

 

I would like to create a column called status that sets a status based on which one is filled:

 

output would be all the data from above with one extra column for Status

 

OBS    ......    status

1                     open

2                      act1

3                      close

 

In explaning the logic it would go:

 

If open_dt not null then status = 'open'

if act1_dt not null then status = 'act1'

if act2_dt not null then status = 'act2'

if act3_dt not null then status = 'act3'

if act4_dt not null then status = 'act4'

if close_dt_dt not null then status = 'closed'

 

I feel like it would work like the above where it goes through them in order and just changes the status as it finds the next action.

 

I am just not sure how to actual right this in SAS

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

This works partially because the values you want are in the names of the variables and are separated by the _;

The main part is the array and looping through it backwards. Note the order of variables on the ARRAY statement is important.

The LEAVE statement says to stop looping when the first non-missing value is found. If there are no values for any of the _dt variables then status will be blank.

 

data have;
   informat account $6.;
   informat   open_dt act1_dt act2_dt act3_dt act4_dt close_dt mmddyy8.;
   format   open_dt act1_dt act2_dt act3_dt act4_dt close_dt mmddyy10.;
   input  Account open_dt act1_dt act2_dt act3_dt act4_dt close_dt;   
datalines;
123456  3/1/16    .        .          .         .     .
 23456  3/1/16    3/3/16   .          .         .     .
 34567  3/13/16   3/15/16   3/17/16    3/18/16  .     3/20/16
;
run;

data want;
   set have;
   length status $ 6;
   array dt open_dt act1_dt act2_dt act3_dt act4_dt close_dt;
   do i = dim(dt) to 1 by (-1);
      if not missing( dt[i] ) then do;
         status = scan(vname(dt[i]),1,'_');
         leave;
      end;
   end;
   drop i;
run;
 

View solution in original post

9 REPLIES 9
ballardw
Super User

This works partially because the values you want are in the names of the variables and are separated by the _;

The main part is the array and looping through it backwards. Note the order of variables on the ARRAY statement is important.

The LEAVE statement says to stop looping when the first non-missing value is found. If there are no values for any of the _dt variables then status will be blank.

 

data have;
   informat account $6.;
   informat   open_dt act1_dt act2_dt act3_dt act4_dt close_dt mmddyy8.;
   format   open_dt act1_dt act2_dt act3_dt act4_dt close_dt mmddyy10.;
   input  Account open_dt act1_dt act2_dt act3_dt act4_dt close_dt;   
datalines;
123456  3/1/16    .        .          .         .     .
 23456  3/1/16    3/3/16   .          .         .     .
 34567  3/13/16   3/15/16   3/17/16    3/18/16  .     3/20/16
;
run;

data want;
   set have;
   length status $ 6;
   array dt open_dt act1_dt act2_dt act3_dt act4_dt close_dt;
   do i = dim(dt) to 1 by (-1);
      if not missing( dt[i] ) then do;
         status = scan(vname(dt[i]),1,'_');
         leave;
      end;
   end;
   drop i;
run;
 
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Could I just add a small function I found recently that could help shrink the code somewhat.  I recently discovered that you can do a do loop over an array:

data want;
  length status $6;
  set have;
  array dt close_dt act4_dt act3dt act2_dt act1_dt open_dt;
  do over dt;
    if not missing(dt) and missing(status) then status=scan(vname(dt),1,"_");
  end;
run;

You will note that I have reversed the array setup to get the last in the list otherwise more or less the same, but without the incrementor.  Just wanted to point it out as the do over command is quite cool.

ballardw
Super User

@RW9 wrote:

Hi,

 

You will note that I have reversed the array setup to get the last in the list otherwise more or less the same, but without the incrementor.  Just wanted to point it out as the do over command is quite cool.


Quite cool, quite old, I learned it in SAS 5.1 🙂

 

And as any loop there are many ways. I picked one that directly (as close as I understand the problem) that would emulate the paper and pencil approach as simple to understand.

Sotarkadin
Calcite | Level 5

Thanks for the replys all.  I am working on understanding the array.  It worked the first run through, I have one question/alteration.  There can be instances where all the dates are empty bc no action has been taking yet.  When this happened it returned the status as open since thats the first column it reads.  Is there a way for it to leave status blank if none of them have any dates?

ballardw
Super User

@Sotarkadin wrote:

Thanks for the replys all.  I am working on understanding the array.  It worked the first run through, I have one question/alteration.  There can be instances where all the dates are empty bc no action has been taking yet.  When this happened it returned the status as open since thats the first column it reads.  Is there a way for it to leave status blank if none of them have any dates?


 

 

 

The solution I posted leaves status blank when all of Open through close are missing.

Tom
Super User Tom
Super User

If you want to build that type of conditional logic you need to include the ELSE statements.

IF A then action1; ELSE IF B then action2; ELSE IF C ....

Tom
Super User Tom
Super User

Is the status determined by which variable has the latest date?

If so then you can use functions like MAX() and WHICHN() to figure out which variable has the maximum value.  If you want to control which status wins when there is a tie for maximum date then set the order of the variables in the ARRAY statement.

data test;
  length account $6 status $6;
  informat open_dt act1_dt act2_dt act3_dt act4_dt close_dt mmddyy8.;
  format   open_dt act1_dt act2_dt act3_dt act4_dt close_dt yymmdd10.;
  input  Account open_dt act1_dt act2_dt act3_dt act4_dt close_dt;  
  array dt open_dt act1_dt act2_dt act3_dt act4_dt close_dt ;
  if n(of dt(*)) then status=upcase(scan(vname(dt(whichn(max(of dt(*)),of dt(*)))),1,'_'));
datalines;
Open  3/1/16    .        .          .         .     .
ACT1  3/1/16    3/3/16   .          .         .     .
CLOSE 3/13/16   3/15/16   3/17/16    3/18/16  .     3/20/16
TIE   3/13/16   3/15/16   3/16/16    3/16/16  .     . 
None . . . . . .
;

proc print; 
run;

Capture.PNG

Sotarkadin
Calcite | Level 5

The status is not controlled by the date but by which action has a date.  Each of those actions could happen on the same date but they do happen in order.  It is opened, then act1, act2, etc.. until it's closed.  The last action to occur tells me where it is at in the process.  The close_dt is the last one as not all of the actions have to occur before it's closed.  

Tom
Super User Tom
Super User

@Sotarkadin wrote:

The status is not controlled by the date but by which action has a date.  Each of those actions could happen on the same date but they do happen in order.  It is opened, then act1, act2, etc.. until it's closed.  The last action to occur tells me where it is at in the process.  The close_dt is the last one as not all of the actions have to occur before it's closed.  


In that case the order of the variables in the array matter. So place them in decreasing order of status. So ties will go to the highest status.  closed,...act2,act1,opened.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1200 views
  • 0 likes
  • 4 in conversation