Hi everyone, I am trying to create a new variable based on another column. Here is an example of the HAVE data.
DATA have ;
input section $ id year action type $;
DATALINES;
first 8069 2002 0 .
first 8069 2003 0 .
first 8069 2004 0 ann1
first 8069 2005 1 .
first 8069 2006 1 .
first 8069 2007 1 .
first 8234 1988 0 .
first 8234 1989 0 .
first 8234 1990 0 .
first 8234 1991 1 ann1
first 8234 1992 0 .
first 8234 1993 0 .
first 8234 1994 0 ann2
first 8234 1995 1 .
first 8234 1996 1 .
first 8234 1997 1 .
first 8234 1998 0 .
first 8234 1999 0 .
second 1032 2011 1 ann1
second 1032 2012 0 .
second 1032 2013 0 .
second 1032 2014 0 .
second 8069 2005 0 .
second 8069 2006 0 ann1
second 8069 2007 0 .
second 8069 2008 0 ann2
second 8069 2009 1 .
second 8069 2010 1 .
second 8234 1999 0 .
second 8234 2000 0 .
second 8234 2001 0 ann1
;
RUN;
Here is an example of the WANT data:
DATA want ;
input section $ id year action type $ event;
DATALINES;
first 8069 2002 0 . .
first 8069 2003 0 . .
first 8069 2004 0 ann1 .
first 8069 2005 1 . 1
first 8069 2006 1 . .
first 8069 2007 1 . .
first 8234 1988 0 . .
first 8234 1989 0 . .
first 8234 1990 0 . .
first 8234 1991 1 ann1 1
first 8234 1992 0 . .
first 8234 1993 0 . .
first 8234 1994 0 ann2 .
first 8234 1995 1 . 1
first 8234 1996 1 . .
first 8234 1997 1 . .
first 8234 1998 0 . .
first 8234 1999 0 . .
second 1032 2011 1 ann1 1
second 1032 2012 0 . .
second 1032 2013 0 . .
second 1032 2014 0 . .
second 8069 2005 0 . .
second 8069 2006 0 ann1 .
second 8069 2007 0 . .
second 8069 2008 0 ann2 .
second 8069 2009 1 . 1
second 8069 2010 1 . .
second 8234 1999 0 . .
second 8234 2000 0 . .
second 8234 2001 0 ann1 .
;
RUN;
What I want to do is as follows. For each section and id, look at the "type" variable. If it is a nonmissing value, then look at the corresponding "action" variable in that year. If this action value is:
i) 0 and the following year's action value is 1, then set event = 1 for the following year;
ii) 0 and the following year's action value is 0, then set event = . for the following year;
iii) 1, then set event = 1 for the same year, regardless of what the value of action is in the following year.
All other event values are set to missing. Also, in cases where there is a nonmissing value for type and action = 0 for that year, but there is no data for the following year, then simply set event = . for the same year (see the last row in want).
Note: action only ever takes on the value 0 or 1. Type also only ever takes on the values ann1 or ann2, but this does not really matter, as long as it is a nonmissing value then the above rules apply.
What have you tried so far?
Try:
data want;
set have;
by section id;
retain _action;
if first.id then _action = .;
if not missing(_action) and action = 1 then do;
event = 1;
_action = .;
end;
if not missing(type) then do;
if action = 1 then do;
event = 1;
_action = .;
end;
else do;
_action = 0;
end;
end;
run;
What have you tried so far?
Try:
data want;
set have;
by section id;
retain _action;
if first.id then _action = .;
if not missing(_action) and action = 1 then do;
event = 1;
_action = .;
end;
if not missing(type) then do;
if action = 1 then do;
event = 1;
_action = .;
end;
else do;
_action = 0;
end;
end;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.