BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
costasRO
Fluorite | Level 6

Hello SAS community,

I have a dataset that looks like the below, basically an id variable to flag different entities the then another variable that takes different values (can be one of four values: missing, Open,Char or Clos). For each id variable that data are sorted by time (i have a "proper" date variable, but here i include it as a number for simplicity). 


data have;
input id $ x $ time ;
cards;
1 Open 1
1 Open 2
1 Clos 3
1 Clos 4
1 Clos 5
1 Open 6
1 Open 7
2 Open 1
2 Clos 2
2 Char 3
2 Char 4
;;;;
run;

 

 

I would like to create another variable y that tells me (within each id) what is the next value of x when x is equal to "Clos", that is not "Clos", like below. So this variable takes a value when x is clos, telling you what is the next non-clos value within a given id.

 

Thank you all very much. 

 

 

data want;
input id $ x $ time y $ ;
cards;
1 Open 1 .
1 Open 2 .
1 Clos 3 open
1 Clos 4 open
1 Clos 5 open
1 Open 6 .
1 Open 7 .
2 Open 1 .
2 Clos 2 char
2 Char 3 .
2 Char 4 .
;;;;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Great idea, @Tom, and so much easier than other look-ahead techniques.

 

Small correction: I think call missing(last) in observations where first.id should be restricted to cases with x='Clos' to avoid losing a few values for variable want (example: the sample dataset HAVE with the last observation removed).

  if x ne 'Clos' then last=x;
  else do; 
    if first.id then call missing(last);
    want=last;
  end;

Or, alternatively, set last to missing in the last observation of each BY group:

  if x ne 'Clos' then last=x;
  else want=last;
  if last.id then call missing(last);

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

I cannot figure out what real world problem would want something like this.  Can you explain what you are trying to do?

 

In general it is much easier to REMEMBER something than it is to predict the FUTURE.

 

Sort the data in descending order and use a RETAINed variable to remember the latest non CLOS value.  You will want to a second non-retained variable to output.

data have;
  input id $ x $ time ;
  row+1;
cards4;
1 Open 1
1 Open 2
1 Clos 3
1 Clos 4
1 Clos 5
1 Open 6
1 Open 7
2 Open 1
2 Clos 2
2 Char 3
2 Char 4
;;;;

proc sort;
  by id descending row ;
run;

data want;
  set have;
  by id;
  if x ne 'Clos' then last=X;
  retain last;
  if first.id then call missing(last);
  if x = 'Clos' then want=last;
  drop last;
run;

proc sort;
  by id row;
run;

Result

Obs    id     x      time    row    want

  1    1     Open      1       1
  2    1     Open      2       2
  3    1     Clos      3       3    Open
  4    1     Clos      4       4    Open
  5    1     Clos      5       5    Open
  6    1     Open      6       6
  7    1     Open      7       7
  8    2     Open      1       8
  9    2     Clos      2       9    Char
 10    2     Char      3      10
 11    2     Char      4      11
FreelanceReinh
Jade | Level 19

Great idea, @Tom, and so much easier than other look-ahead techniques.

 

Small correction: I think call missing(last) in observations where first.id should be restricted to cases with x='Clos' to avoid losing a few values for variable want (example: the sample dataset HAVE with the last observation removed).

  if x ne 'Clos' then last=x;
  else do; 
    if first.id then call missing(last);
    want=last;
  end;

Or, alternatively, set last to missing in the last observation of each BY group:

  if x ne 'Clos' then last=x;
  else want=last;
  if last.id then call missing(last);
Tom
Super User Tom
Super User

Thanks. I was trying to avoid having to define the type and/or length of LAST by having the first place it appears be the assignment statement and missed that potential erasure.

 

costasRO
Fluorite | Level 6

Thanks everyone for the codes, they all work perfectly. Regards

Ksharp
Super User
/*
I think Tom's code is the best way to do.
But for having some fun.
*/

data have;
  input id $ x $ time ;
cards4;
1 Open 1
1 Open 2
1 Clos 3
1 Clos 4
1 Clos 5
1 Open 6
1 Open 7
2 Open 1
2 Clos 2
2 Char 3
2 Char 4
;;;;

data want ;
   if 0 then set have;
   if 0 then set have(rename=(x=y));
   if 0 then set have(rename=(x=want));
   declare hash h(ordered:'y');
   declare hiter hi('h');
   h.definekey('time');
   h.definedata('y');
   h.definedone();

do until(last.id);
 set have;
 by id;
 h.add(key:time,data:x);
end;
do until(last.id);
 set have;
 by id;
 call missing(want);
 if x='Clos' then do;
 rc=hi.setcur(key:time);
 do while(rc=0);
   if y ne 'Clos' then do;want=y;leave;end;
   rc=hi.next();
 end;
 end;
 output;
end;
h.delete();hi.delete();
drop y rc; run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1354 views
  • 9 likes
  • 4 in conversation