Hello SAS experts!
I have a dataset as below which records the year that a firm is founded and the year that a firm end.
FirmID Begin End
24356W 1994 1999
25537H 1990 1996
I would like to create a panel data and also year dummies as below.
FirmID Year yd1990 yd1991 yd1992 yd1993 yd1994 yd1995 yd1996 yd1997 yd1998 yd1999
24356W 1994 0 0 0 0 1 0 0 0 0 0
24356W 1995 0 0 0 0 0 1 0 0 0 0
24356W 1996 0 0 0 0 0 0 1 0 0 0
24356W 1997 0 0 0 0 0 0 0 1 0 0
24356W 1998 0 0 0 0 0 0 0 0 1 0
24356W 1999 0 0 0 0 0 0 0 0 0 1
25537H 1990 1 0 0 0 0 0 0 0 0 0
25537H 1991 0 1 0 0 0 0 0 0 0 0
25537H 1992 0 0 1 0 0 0 0 0 0 0
25537H 1993 0 0 0 1 0 0 0 0 0 0
25537H 1994 0 0 0 0 1 0 0 0 0 0
25537H 1995 0 0 0 0 0 1 0 0 0 0
25537H 1996 0 0 0 0 0 0 1 0 0 0
what code do I need to use? Thanks!
Hi @dapenDaniel
You can try the code below:
data have;
input FirmID $ Begin End;
cards;
24356W 1994 1999
25537H 1990 1996
;
run;
proc sql noprint;
select min(Begin), max(End) into:min_date,:max_date trimmed from have;
quit;
data have2;
set have;
do Year = Begin to End;
output;
end;
run;
data want;
set have2;
array yd(&min_date:&max_date) %sysfunc(compress(yd&min_date)) - %sysfunc(compress(yd&max_date));
do i=&min_date to &max_date;
if i=year then yd(i)=1;
else yd(i)=0;
end;
drop i Begin End;
run;
proc print data=want;
run;
Hi @dapenDaniel
You can try the code below:
data have;
input FirmID $ Begin End;
cards;
24356W 1994 1999
25537H 1990 1996
;
run;
proc sql noprint;
select min(Begin), max(End) into:min_date,:max_date trimmed from have;
quit;
data have2;
set have;
do Year = Begin to End;
output;
end;
run;
data want;
set have2;
array yd(&min_date:&max_date) %sysfunc(compress(yd&min_date)) - %sysfunc(compress(yd&max_date));
do i=&min_date to &max_date;
if i=year then yd(i)=1;
else yd(i)=0;
end;
drop i Begin End;
run;
proc print data=want;
run;
data have;
input FirmID $ Begin End;
cards;
24356W 1994 1999
25537H 1990 1996
;
data temp;
set have;
retain value 1;
do year=begin to end;
output;
end;
keep FirmID year value;
run;
proc transpose data=temp out=want prefix=yr_;
by FirmID year;
var value;
id year;
run;
proc sql noprint;
select min(Begin) ,max(End) into : min,: max from have;
quit;
data want;
retain FirmID year yr_%left(&min) - yr_%left(&max);
set want;
drop _name_;
run;
proc stdize data=want out=want2 reponly missing=0;
var yr_:;
run;
For most purposes in SAS, you don't need your own dummy variables. Many analysis procedures in SAS create their own dummy variables internally, so you don't have to create them yourself.
Even if you absolutely have to create dummy variable for some reason, there are methods in PROC GLMMOD and PROC TRANSREG (and others) that exist so you don't have to write data step code to create dummy variables. Here is a long thread discussing how to create dummy variables: https://communities.sas.com/t5/SAS-Procedures/How-can-i-create-dummy-variables-How-can-i-change-my-c...
I agree with @PaigeMiller that you probably don't have to make the dummy variables for many SAS analysis procedures.
But in case you have to here:
data have;
input FirmID $ Begin End;
cards;
24356W 1994 1999
25537H 1990 1996
run;
data want (drop=y);
set have;
array dum {1990:1999} yd1990-yd1999 ;
do y=lbound(dum) to hbound(dum); ** Initialize the array**;
dum{y}=0;
end;
do y=begin to end; ** Write out 1 per year **;
dum{y}=1;
output;
dum{y}=0;
end;
run;
There are 2 features of the ARRAY statement particularly useful here:
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.