BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dapenDaniel
Obsidian | Level 7

Hello SAS experts!

 

I have a dataset as below which records the year that a firm is founded and the year that a firm end.

 

FirmID              Begin              End

24356W            1994              1999

25537H             1990              1996

 

I would like to create a panel data and also year dummies as below.

 

FirmID        Year     yd1990   yd1991    yd1992   yd1993  yd1994  yd1995  yd1996  yd1997  yd1998  yd1999  

24356W     1994        0             0               0            0            1             0         0             0           0            0    

24356W     1995        0             0               0            0            0             1         0             0           0            0 

24356W     1996        0             0               0            0            0             0         1             0           0            0 

24356W     1997        0             0               0            0            0             0         0             1           0            0 

24356W     1998        0             0               0            0            0             0         0             0           1            0 

24356W     1999        0             0               0            0            0             0         0             0           0            1 

25537H      1990        1             0               0            0            0             0         0             0           0            0  

25537H      1991        0             1               0            0            0             0         0             0           0            0

25537H      1992        0             0               1            0            0             0         0             0           0            0

25537H      1993        0             0               0            1            0             0         0             0           0            0

25537H      1994        0             0               0            0            1             0         0             0           0            0

25537H      1995        0             0               0            0            0             1         0             0           0            0

25537H      1996        0             0               0            0            0             0         1             0           0            0

 

what code do I need to use? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @dapenDaniel 

 

You can try the code below:

data have;
	input FirmID $ Begin End;
	cards;
24356W 1994 1999
25537H 1990 1996
;
run;

proc sql noprint;
	select min(Begin), max(End) into:min_date,:max_date trimmed from have;
quit;

data have2;
	set have;
		do Year = Begin to End;
		output;
	end;
run;

data want;
	set have2;
	array yd(&min_date:&max_date) %sysfunc(compress(yd&min_date)) - %sysfunc(compress(yd&max_date));
	do i=&min_date to &max_date;
		if i=year then yd(i)=1;
		else yd(i)=0;
	end;
	drop i Begin End;
run;

proc print data=want;
run;

Capture d’écran 2019-11-24 à 12.14.17.png

View solution in original post

4 REPLIES 4
ed_sas_member
Meteorite | Level 14

Hi @dapenDaniel 

 

You can try the code below:

data have;
	input FirmID $ Begin End;
	cards;
24356W 1994 1999
25537H 1990 1996
;
run;

proc sql noprint;
	select min(Begin), max(End) into:min_date,:max_date trimmed from have;
quit;

data have2;
	set have;
		do Year = Begin to End;
		output;
	end;
run;

data want;
	set have2;
	array yd(&min_date:&max_date) %sysfunc(compress(yd&min_date)) - %sysfunc(compress(yd&max_date));
	do i=&min_date to &max_date;
		if i=year then yd(i)=1;
		else yd(i)=0;
	end;
	drop i Begin End;
run;

proc print data=want;
run;

Capture d’écran 2019-11-24 à 12.14.17.png

Ksharp
Super User
data have;
input FirmID  $        Begin              End;
cards;
24356W            1994              1999
25537H             1990              1996
;
data temp;
 set have;
 retain value 1;
 do year=begin to end;
   output;
 end;
 keep FirmID year value;
run;
proc transpose data=temp out=want prefix=yr_;
by FirmID year;
var value;
id year;
run;
proc sql noprint;
select min(Begin) ,max(End) into : min,: max from have;
quit;
data want;
 retain FirmID year yr_%left(&min) - yr_%left(&max);
 set want;
 drop _name_;
run;
proc stdize data=want out=want2 reponly missing=0;
var yr_:;
run;
PaigeMiller
Diamond | Level 26

For most purposes in SAS, you don't need your own dummy variables. Many analysis procedures in SAS create their own dummy variables internally, so you don't have to create them yourself.


Even if you absolutely have to create dummy variable for some reason, there are methods in PROC GLMMOD and PROC TRANSREG (and others) that exist so you don't have to write data step code to create dummy variables. Here is a long thread discussing how to create dummy variables: https://communities.sas.com/t5/SAS-Procedures/How-can-i-create-dummy-variables-How-can-i-change-my-c...

--
Paige Miller
mkeintz
PROC Star

I agree with @PaigeMiller that you probably don't have to make the dummy variables for many SAS analysis procedures.

 

But in case you have to here:

data have;
	input FirmID $ Begin End;
	cards;
24356W 1994 1999
25537H 1990 1996
run;

data want (drop=y);
  set have;
  array dum {1990:1999} yd1990-yd1999 ;

  do y=lbound(dum) to hbound(dum);   ** Initialize the array**;
    dum{y}=0;
  end;

  do y=begin to end;  ** Write out 1 per year **;
    dum{y}=1;
    output;
    dum{y}=0;
  end;
run;

There are 2 features of the ARRAY statement particularly useful here:

  1. You can specify the upper and lower bound of the array index,  Here they are 1990 and 1999.  That way, for Y=1990,  DUM{Y}=1 assigns a 1 to the first element in the DUM array.
  2. You can name whatever variables you want for the elements of the array.  The names would default to  DUM1, DUM2,. … DUM10, but I replaced those default names with YD1990, YD1991, … YD1999.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 989 views
  • 0 likes
  • 5 in conversation