Dear All,
I am working on a dataset and am stuck at one point where I need to create a dummy variable. I have a set of information which includes the following information:
Firm Year INDEX
A 2001 0
A 2002 1
A 2003 1
A 2004 1
A 2005 1
A 2006 1
A 2007 1
A 2008 1
A 2009 0
B 2001 0
B 2002 0
B 2003 1
B 2004 1
B 2005 1
B 2006 1
B 2007 0
I want to create five dummy variables: ADD, OUT, ONGOING, ONGOING1, ONGOING2.
ADD is 1 for the first year when INDEX is 1 and 0 otherwise. OUT is 1 for the year following the last year of INDEX =1. ONGOING is 1 for years when INDEX is 1 excepte for the first year. ONGOING1 is 1 for the 2nd and the 3rd year of INDEX =1. ONGOING2 is 1 for the any year after the third year of INDEX =1.
In other words, I am trying to get the output to look like this:
Firm Year INDEX ADD OUT ONGOING ONGOING1 ONGOING2
A 2001 0 0 0 0 0 0
A 2002 1 1 0 0 0 0
A 2003 1 0 0 1 1 0
A 2004 1 0 0 1 1 0
A 2005 1 0 0 1 0 1
A 2006 1 0 0 1 0 1
A 2007 1 0 0 1 0 1
A 2008 1 0 0 1 0 1
A 2009 0 0 1 0 0 0
B 2001 0 0 0 0 0 0
B 2002 0 0 0 0 0 0
B 2003 1 1 0 0 0 0
B 2004 1 0 0 1 1 0
B 2005 1 0 0 1 1 0
B 2006 1 0 0 1 0 1
B 2007 0
Although I did have some success in creating the ADD, OUT, and ONGOING, I am not sure how to construct ONGOING1 and ONGOING2. I would appreciate if someone would help me with them.
Thank you.
data have;
input Firm $ Year INDEX ;
cards;
A 2001 0
A 2002 1
A 2003 1
A 2004 1
A 2005 1
A 2006 1
A 2007 1
A 2008 1
A 2009 0
B 2001 0
B 2002 0
B 2003 1
B 2004 1
B 2005 1
B 2006 1
B 2007 0
;
proc sort data=have out=index0;
by firm year;
where index=0;
run;
proc sort data=have out=index1;
by firm year;
where index=1;
run;
data index1;
set index1;
retain seq;
by firm year;
if first.firm then seq=1;
else seq+1;
if seq>1 then ongoing=1;
else ongoing=0;
if 2<=seq<=3 then ongoing1=1;
else ongoing1=0;
if seq>3 then ongoing2=1;
else ongoing2=0;
run;
data final;
set index0(in=a) index1;
if a then do;
ongoing=0;
ongoing1=0;
ongoing2=0;
end;
run;
hope this helps...
Regards
data have;
input Firm $ Year INDEX ;
cards;
A 2001 0
A 2002 1
A 2003 1
A 2004 1
A 2005 1
A 2006 1
A 2007 1
A 2008 1
A 2009 0
B 2001 0
B 2002 0
B 2003 1
B 2004 1
B 2005 1
B 2006 1
B 2007 0
;
data want;
set have;
add=0; out=0;ongoing=0;ongoing1=0;ongoing2=0;
by firm notsorted;
if (first.firm and index=1) or (not first.firm and index=1 and lag(index)=0) then add=1;
if index=0 and lag(index)=1 then out=1;
if index=1 and lag(index)=1 then ongoing=1;
ongoing1=ifn(lag(add)=1 or lag2(add)=1,1,0);
if index=1 and lag3(index)=1 then ongoing2=1;
run;
proc print;run;
Message was edited by: Linlin
data have;
input Firm $ Year INDEX ;
cards;
A 2001 0
A 2002 1
A 2003 1
A 2004 1
A 2005 1
A 2006 1
A 2007 1
A 2008 1
A 2009 0
B 2001 0
B 2002 0
B 2003 1
B 2004 1
B 2005 1
B 2006 1
B 2007 0
;
proc sort data=have out=index0;
by firm year;
where index=0;
run;
proc sort data=have out=index1;
by firm year;
where index=1;
run;
data index1;
set index1;
retain seq;
by firm year;
if first.firm then seq=1;
else seq+1;
if seq>1 then ongoing=1;
else ongoing=0;
if 2<=seq<=3 then ongoing1=1;
else ongoing1=0;
if seq>3 then ongoing2=1;
else ongoing2=0;
run;
data final;
set index0(in=a) index1;
if a then do;
ongoing=0;
ongoing1=0;
ongoing2=0;
end;
run;
hope this helps...
Regards
You can do it this way :
data have;
input Firm $ Year INDEX;
datalines;
A 2001 0
A 2002 1
A 2003 1
A 2004 1
A 2005 1
A 2006 1
A 2007 1
A 2008 1
A 2009 0
B 2001 0
B 2002 0
B 2003 1
B 2004 1
B 2005 1
B 2006 1
B 2007 0
;
/*
ADD is 1 for the first year when INDEX is 1 and 0 otherwise.
OUT is 1 for the year following the last year of INDEX =1.
ONGOING is 1 for years when INDEX is 1 excepte for the first year.
ONGOING1 is 1 for the 2nd and the 3rd year of INDEX =1.
ONGOING2 is 1 for the any year after the third year of INDEX =1.
*/
data want(drop=counti);
set have;
by firm index notsorted;
if first.index then counti=0;
counti + 1;
ADD = counti=1 and index;
OUT = counti=1 and not index and not first.firm;
ONGOING = INDEX and counti>1;
ONGOING1 = INDEX and counti in (2, 3);
ONGOING2 = INDEX and counti > 3;
run;
PG
Thank you all for your prompt replies.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.