Dear Sir,
I wish to create year dummies from yearly financial data. Each firm has a variable 'fyear' that holds the fiscal year.
fyear ranges from 2001 to 2011. What is the fastest way to create the year dummies?
fyear 2001 = dummy yr2001;
fyear 2002 = dummy yr2002;
How do I write the code?
PROC TRANSREG or PROC GLMMOD
If I write it the long way, it looks like the code below. That is manageable for only 12 years, but with, say, 100 industries it becomes very tedious. How do I write the code using PROC TRANSREG or PROC GLMMOD?
data spi4;
set spi.spi3;
if fyear=2000 then yr2000=1;else yr2000=0;
if fyear=2001 then yr2001=1;else yr2001=0;
if fyear=2002 then yr2002=1;else yr2002=0;
if fyear=2003 then yr2003=1;else yr2003=0;
if fyear=2004 then yr2004=1;else yr2004=0;
if fyear=2005 then yr2005=1;else yr2005=0;
if fyear=2006 then yr2006=1;else yr2006=0;
if fyear=2007 then yr2007=1;else yr2007=0;
if fyear=2008 then yr2008=1;else yr2008=0;
if fyear=2009 then yr2009=1;else yr2009=0;
if fyear=2010 then yr2010=1;else yr2010=0;
if fyear=2011 then yr2011=1;else yr2011=0;
run;
Thanks
For TRANSREG look for an example in the documentation regarding using the DESIGN option.
All GLMMOD examples are DESIGN matrix related as that's all it does.
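For readers who want a starting point, here is a minimal PROC GLMMOD sketch. It is not from the documentation examples mentioned above. The toy data set HAVE and its variables firm_id and roe are made up for illustration, and GLMMOD needs some response variable on the MODEL statement even though only the design columns are of interest here.

```sas
/* Toy data standing in for the real file; firm_id and roe are hypothetical */
data have;
   input firm_id fyear roe;
   datalines;
1 2001 0.10
1 2002 0.12
2 2001 0.05
2 2004 0.07
3 2011 0.09
;
run;

/* CLASS plus NOINT asks for one 0/1 design column per distinct FYEAR */
proc glmmod data=have outdesign=design outparm=parm noprint;
   class fyear;
   model roe = fyear / noint;
run;

/* PARM lists which design column (Col1, Col2, ...) belongs to which FYEAR */
proc print data=parm;
run;

proc print data=design;
run;
```

The design columns come out with generic names (Col1, Col2, ...), so the PARM data set is what tells you which year each column stands for. Check the documentation for how observations with missing values are handled before relying on this for a real file.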
Array is a good friend.
data x;
 do fyear=2001 to 2011;
  output;
 end;
run;
data x;
 set x;
 array _y{*} yr2001-yr2011;
 do i=1 to dim(_y);
  _y{i}=0;
 end;
 _y{fyear-2000}=1;
 drop i;
run;
Ksharp
What happens if you don't know how many levels of FYEAR?
Ha, EASY.
/* test data with gaps in the years */
data x;
 do fyear=2000,2004,2006 to 2008,2010,2012;
  output;
 end;
 fyear=2014; output;
 fyear=2020; output;
run;
/* build the list of dummy names (yr2000 yr2004 ...) from the data */
proc sql noprint;
 select distinct cats('yr',fyear) into :list separated by ' ' from x;
quit;
data x;
 set x;
 array _y{*} &list;
 do i=1 to dim(_y);
  _y{i}=0;
 end;
 /* set the dummy whose name matches this row's fyear */
 do i=1 to dim(_y);
  if fyear=input(compress(vname(_y{i}),'yr'),best8.) then _y{i}=1;
 end;
 drop i;
run;
Ksharp
If you say so. I would use PROC TRANSREG but I like easy.
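As a rough illustration of the PROC TRANSREG route, the sketch below uses the DESIGN option with a CLASS expansion. The toy data set HAVE and the variables firm_id and roe are hypothetical. ZERO=NONE asks for a dummy for every level instead of dropping a reference level.

```sas
/* Toy data standing in for the real file; firm_id and roe are hypothetical */
data have;
   input firm_id fyear roe;
   datalines;
1 2001 0.10
1 2002 0.12
2 2001 0.05
2 2004 0.07
3 2011 0.09
;
run;

/* DESIGN codes the variables without fitting a model */
proc transreg data=have design;
   model class(fyear / zero=none);   /* one 0/1 variable per FYEAR level */
   id firm_id roe;                   /* carry these variables to the output */
   output out=dummies;
run;

proc print data=dummies;
run;
```

The number of FYEAR levels does not need to be known in advance. The dummy names are built from the variable name and the level (for example fyear2001), and the output also carries a few bookkeeping variables, so inspect the printed data set before using it further.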
Thanks for introducing proc transreg to us. I am still studying the doc. Always amazed by your knowledge on Procs and SAS overall.
Regards,
Haikuo
Well constructed! Thanks for sharing!
FWIW, using 'Do-Over' can save some typing:
data x;
do fyear=2000,2004,2006 to 2008,2010,2012;
output;
end;
fyear=2014;output;
fyear=2020;output;
run;
proc sql noprint;
select distinct cats('yr',fyear) into : list separated by ' ' from x;
quit;
data x;
set x;
array _y &list ;
do over _y;
_y=ifn(fyear=compress(vname(_y),,'kd'), 1,0);
end;
run;
Regards,
Haikuo
Hi Ksharp:
I am not sure why there are only 20 observations generated from this program,
data x;
do fyear=2001 to 2011;
output;
end;
run;
data x;
set x;
array _y{*} yr2001-yr2011 ;
do i=1 to dim(_y);
_y{i}=0;
end;
_y{fyear-2000}=1;
drop i;
run;
Here is the log:
ERROR: Array subscript out of range at line 40 column 2.
GVKEY=011907 FYEAR=2000 ROE_lag1=. ROE_lag2=. ROE_lag3=. ROE=0.1123235312 ROE1_lag1=.
ROE1_lag2=. ROE1_lag3=. ROE1=. ROE2_lag1=. ROE2_lag2=. ROE2_lag3=. ROE2=0.1123235312 ROE3_lag1=.
ROE3_lag2=. ROE3_lag3=. ROE3=. SEQ_lag1=. SEQ=16.2210 LPERMNO=10012 AT_lag1=. AT=21.7630
SALE_lag1=. SALE=35.8230 OANCF_lag1=. OANCF_lead1=3.4480 OANCF=3.4620 DATADATE=20010228
GGROUP=4530 GIND=453010 GSECTOR=45 GSUBIND=45301020 NAICS=334413 SIC=3674 SPCINDCD=235
SPCSECCD=940 STATE=OH LPERMCO=7969 CONSOL=C INDFMT=INDL DATAFMT=STD POPSRC=D CURCD=USD COSTAT=I
CONM=DPAC TECHNOLOGIES CORP TIC=DPAC CUSIP=233269109 CIK=0000784770 EXCHG=19 FYR=2 FIC=USA
ACT=10.3730 CH=5.3460 CHE=5.3460 CSTK=24.8710 CSTKCV=1.1880 DLC=0.4570 DLTT=0.7870 DPACT=3.6840
INVT=1.4440 LCT=4.7550 LT=5.5420 PPEGT=9.0650 PPENT=5.3810 RE=-8.6500 RECD=0.1200 RECT=3.3010
WCAP=5.6180 COGS=24.2490 DP=1.6560 DVC=0.0000 DVP=0.0000 DVPD=. EBIT=1.7460 EBITDA=3.4020
EPSFI=0.0900 EPSFX=0.0900 EPSPI=0.0900 EPSPX=0.0900 IB=1.8220 NI=1.8220 REVT=35.8230 TXC=1.2260
TXDI=-0.3260 XAD=. XAGO=. XAGT=. XRD=1.6410 XSGA=8.1720 APALCH=-1.3840 CAPX=0.7370 CAPXFI=.
CDVC=. CHECH=2.3970 DPC=1.6560 DV=0.0000 FOPT=. INVCH=0.7910 IVCH=0.0000 IVSTCH=0.0000
RECCH=0.8660 UAOLOCH=. UTFDOC=. WCAPCH=. CSHO=20.9360 CSHR=10.4000 EMP=0.1120 GOVTOWN=. OPTRFR=.
OPTVOL=. PNRSHO=. PRSHO=. MKVALT=41.8720 NAICSH=334413 PRCC_F=2.0000 SICH=3674 sic2=36
siccode=36 OANCF_TA=. OANCF_lag1_TA=. OANCF_lead1_TA=. Accrual=. Accrual_TA=. R_Dechow=.
Abs_R_Dechow=. _ASSET=. SALESCHG_AT=. PPE_AT=. PPE1_AT=. R_francis=. Abs_R_Francis=.
R_francis_net=. Abs_R_Francis_net=. firmage=17 firmage_sc=17 Mktcap=41.872 Mktcap_mkval=0
mktcap_lag1=. Mkvalt_lag1=. BKVLPS=0.7748 mkt2bk1=2.5813451698 mkt2bk2=2.5813113061
mkt2bk_diff=0.0000338637 Std_ROE=. Std_ROE1=. Std_ROE2=. Std_ROE3=. DIV=0 AUDITOR_FKEY=.
AUDITOR_NAME= Audit_spec=0 RET_StdDev=. RET_N=. yr2001=0 yr2002=0 yr2003=0 yr2004=0 yr2005=0
yr2006=0 yr2007=0 yr2008=0 yr2009=0 yr2010=0 yr2011=0 i=12 _ERROR_=1 _N_=21
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 21 observations read from the data set SPI.SPI3.
WARNING: The data set FYEAR.SPI may be incomplete. When this step was stopped there were 20
observations and 165 variables.
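A note on the log above: the failing observation has FYEAR=2000, so fyear-2000 evaluates to 0, which is outside the array's subscript range of 1 to 11. The sketch below is one possible way to avoid that. It indexes the array by year and sets each dummy with a comparison, so a year outside 2001-2011 gets all zeros instead of stopping the step. The data set names HAVE and WANT and the loop variable yr are made up for illustration.

```sas
/* Toy data including years outside 2001-2011 */
data have;
   input fyear @@;
   datalines;
2000 2001 2005 2011 2012
;
run;

data want;
   set have;
   /* subscripts run from 2001 to 2011 instead of 1 to 11 */
   array _y{2001:2011} yr2001-yr2011;
   do yr = lbound(_y) to hbound(_y);
      _y{yr} = (fyear = yr);   /* a comparison returns 1 or 0 */
   end;
   drop yr;
run;
```

Because the array is never indexed by FYEAR itself, out-of-range or missing years cannot trigger the subscript error.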
Hi, mei
I don't find any problem in my code. Here is the LOG.
1 data x;
2 do fyear=2001 to 2011;
3 output;
4 end;
5 run;
NOTE: The data set WORK.X has 11 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.45 seconds
cpu time 0.03 seconds
6 data x;
7 set x;
8 array _y{*} yr2001-yr2011 ;
9 do i=1 to dim(_y);
10 _y{i}=0;
11 end;
12 _y{fyear-2000}=1;
13 drop i;
14 run;
NOTE: There were 11 observations read from the data set WORK.X.
NOTE: The data set WORK.X has 11 observations and 12 variables.
NOTE: DATA statement used (Total process time):
real time 0.15 seconds
cpu time 0.02 seconds
Or you could try my second code or HaiKuo's code, which are more general.
Ksharp
Message was edited by: xia keshan
Thanks. Sorry, I should have mentioned that my original program is applied to the spi.spi3 file, which has 64602 observations for which the year dummies are to be created.
data fyear.spi;
set spi.spi3;
array _y{*} yr2001-yr2011 ;
do i=1 to dim(_y);
_y{i}=0;
end;
_y{fyear-2000}=1;
drop i;
run;
Anyway, I have used your second code and Hai Kuo's code, and they are marvellous! However, I am not sure about the PROC TRANSREG code.
Thanks