Dear Sir,
I wish to create year dummies from yearly financial data. Each firm has a variable 'fyear' that represents the fiscal year.
fyear ranges from 2001 to 2011. What is the fastest way to create the year dummies?
fyear 2001 = dummy yr2001;
fyear 2002 = dummy yr2002;
How do I write the code?
PROC TRANSREG or PROC GLMMOD
If I write it the long way, it looks like the code below. That is manageable with only 12 years, but with, say, 100 industries it becomes very tedious. How do I write the code using PROC TRANSREG or PROC GLMMOD?
data spi4;
set spi.spi3;
if fyear=2000 then yr2000=1;else yr2000=0;
if fyear=2001 then yr2001=1;else yr2001=0;
if fyear=2002 then yr2002=1;else yr2002=0;
if fyear=2003 then yr2003=1;else yr2003=0;
if fyear=2004 then yr2004=1;else yr2004=0;
if fyear=2005 then yr2005=1;else yr2005=0;
if fyear=2006 then yr2006=1;else yr2006=0;
if fyear=2007 then yr2007=1;else yr2007=0;
if fyear=2008 then yr2008=1;else yr2008=0;
if fyear=2009 then yr2009=1;else yr2009=0;
if fyear=2010 then yr2010=1;else yr2010=0;
if fyear=2011 then yr2011=1;else yr2011=0;
run;
Thanks
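One way to shorten the long form above, sketched under the assumption that fyear is numeric and that the years of interest are 2000 to 2011 (as in the IF statements): give the array explicit bounds so that the subscript is the year itself, and let a Boolean comparison return 1 or 0. The names _yr and _k are illustrative and not from the original post.

```sas
data spi4;
   set spi.spi3;
   /* Subscripts run 2000..2011, so _yr{2005} refers to yr2005 */
   array _yr{2000:2011} yr2000-yr2011;
   do _k = 2000 to 2011;
      /* A comparison evaluates to 1 when true and 0 when false */
      _yr{_k} = (fyear = _k);
   end;
   drop _k;   /* scratch loop counter, not needed in the output */
run;
```

Because the loop only compares fyear with each year, a value outside 2000 to 2011 (or a missing value) simply leaves every dummy at 0 rather than raising a subscript error.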
For TRANSREG, look for an example in the documentation that uses the DESIGN option.
All GLMMOD examples are design-matrix related, as that's all it does.
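A minimal sketch of what the two procedures might look like for this problem. The input data set spi.spi3 and the variables gvkey and roe are taken from elsewhere in this thread; the output data set names (yr_dummies, yr_design, yr_parm) are made up for illustration. Treat this as a starting point to check against the documentation, not a definitive recipe.

```sas
/* PROC TRANSREG: DESIGN codes the variables without fitting a model.
   ZERO=NONE keeps one dummy per level instead of dropping a reference level. */
proc transreg data=spi.spi3 design;
   model class(fyear / zero=none);
   id gvkey;                  /* carry an identifier into the output */
   output out=yr_dummies;
run;

/* PROC GLMMOD: the MODEL statement needs a response variable; roe is used
   here only as a placeholder. NOINT omits the intercept column. */
proc glmmod data=spi.spi3 outdesign=yr_design outparm=yr_parm noprint;
   class fyear;
   model roe = fyear / noint;
run;
```

With TRANSREG the dummy variables should be named from the variable and its level (for example fyear2001). With GLMMOD the design columns are generically named, and the OUTPARM= data set maps each column to its level. Rows with missing values in the model variables may be left out of the GLMMOD design data set, so it is worth checking the observation count.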
Array is a good friend.
data x;
do fyear=2001 to 2011;
output;
end;
run;
data x;
set x;
array _y{*} yr2001-yr2011 ;
do i=1 to dim(_y);
_y{i}=0;
end;
_y{fyear-2000}=1;
drop i;
run;
Ksharp
What happens if you don't know how many levels of FYEAR?
Ha, EASY.
data x;
do fyear=2000, 2004, 2006 to 2008, 2010, 2012;
output;
end;
fyear=2014; output;
fyear=2020; output;
run;
proc sql noprint;
select distinct cats('yr',fyear) into : list separated by ' ' from x;
quit;
data x;
set x;
array _y{*} &list ;
do i=1 to dim(_y);
_y{i}=0;
end;
do i=1 to dim(_y);
if fyear=input(compress(vname(_y{i}),'yr'),best8.) then _y{i}=1;
end;
drop i;
run;
Ksharp
If you say so. I would use PROC TRANSREG but I like easy.
Thanks for introducing proc transreg to us. I am still studying the doc. Always amazed by your knowledge on Procs and SAS overall.
Regards,
Haikuo
Well constructed! Thanks for sharing!
FWIW, using 'Do-Over' can save some typing:
data x;
do fyear=2000, 2004, 2006 to 2008, 2010, 2012;
output;
end;
fyear=2014;output;
fyear=2020;output;
run;
proc sql noprint;
select distinct cats('yr',fyear) into : list separated by ' ' from x;
quit;
data x;
set x;
array _y &list ;
do over _y;
_y=ifn(fyear=compress(vname(_y),,'kd'), 1,0);
end;
run;
Regards,
Haikuo
Hi Ksharp,
I am not sure why only 20 observations are generated from this program:
data x;
do fyear=2001 to 2011;
output;
end;
run;
data x;
set x;
array _y{*} yr2001-yr2011 ;
do i=1 to dim(_y);
_y{i}=0;
end;
_y{fyear-2000}=1;
drop i;
run;
Here is the log:
ERROR: Array subscript out of range at line 40 column 2.
GVKEY=011907 FYEAR=2000 ROE_lag1=. ROE_lag2=. ROE_lag3=. ROE=0.1123235312 ROE1_lag1=.
ROE1_lag2=. ROE1_lag3=. ROE1=. ROE2_lag1=. ROE2_lag2=. ROE2_lag3=. ROE2=0.1123235312 ROE3_lag1=.
ROE3_lag2=. ROE3_lag3=. ROE3=. SEQ_lag1=. SEQ=16.2210 LPERMNO=10012 AT_lag1=. AT=21.7630
SALE_lag1=. SALE=35.8230 OANCF_lag1=. OANCF_lead1=3.4480 OANCF=3.4620 DATADATE=20010228
GGROUP=4530 GIND=453010 GSECTOR=45 GSUBIND=45301020 NAICS=334413 SIC=3674 SPCINDCD=235
SPCSECCD=940 STATE=OH LPERMCO=7969 CONSOL=C INDFMT=INDL DATAFMT=STD POPSRC=D CURCD=USD COSTAT=I
CONM=DPAC TECHNOLOGIES CORP TIC=DPAC CUSIP=233269109 CIK=0000784770 EXCHG=19 FYR=2 FIC=USA
ACT=10.3730 CH=5.3460 CHE=5.3460 CSTK=24.8710 CSTKCV=1.1880 DLC=0.4570 DLTT=0.7870 DPACT=3.6840
INVT=1.4440 LCT=4.7550 LT=5.5420 PPEGT=9.0650 PPENT=5.3810 RE=-8.6500 RECD=0.1200 RECT=3.3010
WCAP=5.6180 COGS=24.2490 DP=1.6560 DVC=0.0000 DVP=0.0000 DVPD=. EBIT=1.7460 EBITDA=3.4020
EPSFI=0.0900 EPSFX=0.0900 EPSPI=0.0900 EPSPX=0.0900 IB=1.8220 NI=1.8220 REVT=35.8230 TXC=1.2260
TXDI=-0.3260 XAD=. XAGO=. XAGT=. XRD=1.6410 XSGA=8.1720 APALCH=-1.3840 CAPX=0.7370 CAPXFI=.
CDVC=. CHECH=2.3970 DPC=1.6560 DV=0.0000 FOPT=. INVCH=0.7910 IVCH=0.0000 IVSTCH=0.0000
RECCH=0.8660 UAOLOCH=. UTFDOC=. WCAPCH=. CSHO=20.9360 CSHR=10.4000 EMP=0.1120 GOVTOWN=. OPTRFR=.
OPTVOL=. PNRSHO=. PRSHO=. MKVALT=41.8720 NAICSH=334413 PRCC_F=2.0000 SICH=3674 sic2=36
siccode=36 OANCF_TA=. OANCF_lag1_TA=. OANCF_lead1_TA=. Accrual=. Accrual_TA=. R_Dechow=.
Abs_R_Dechow=. _ASSET=. SALESCHG_AT=. PPE_AT=. PPE1_AT=. R_francis=. Abs_R_Francis=.
R_francis_net=. Abs_R_Francis_net=. firmage=17 firmage_sc=17 Mktcap=41.872 Mktcap_mkval=0
mktcap_lag1=. Mkvalt_lag1=. BKVLPS=0.7748 mkt2bk1=2.5813451698 mkt2bk2=2.5813113061
mkt2bk_diff=0.0000338637 Std_ROE=. Std_ROE1=. Std_ROE2=. Std_ROE3=. DIV=0 AUDITOR_FKEY=.
AUDITOR_NAME= Audit_spec=0 RET_StdDev=. RET_N=. yr2001=0 yr2002=0 yr2003=0 yr2004=0 yr2005=0
yr2006=0 yr2007=0 yr2008=0 yr2009=0 yr2010=0 yr2011=0 i=12 _ERROR_=1 _N_=21
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 21 observations read from the data set SPI.SPI3.
WARNING: The data set FYEAR.SPI may be incomplete. When this step was stopped there were 20
observations and 165 variables.
Hi Mei,
I can't find any problem in my code. Here is the log.
1 data x;
2 do fyear=2001 to 2011;
3 output;
4 end;
5 run;
NOTE: The data set WORK.X has 11 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.45 seconds
cpu time 0.03 seconds
6 data x;
7 set x;
8 array _y{*} yr2001-yr2011 ;
9 do i=1 to dim(_y);
10 _y{i}=0;
11 end;
12 _y{fyear-2000}=1;
13 drop i;
14 run;
NOTE: There were 11 observations read from the data set WORK.X.
NOTE: The data set WORK.X has 11 observations and 12 variables.
NOTE: DATA statement used (Total process time):
real time 0.15 seconds
cpu time 0.02 seconds
Or you could try my second code or Haikuo's code, which are more general.
Ksharp
Message was edited by: xia keshan
Thanks. Sorry, I should have mentioned that my original code is applied to the spi.spi3 file, which has 64602 observations for which the year dummies are to be created.
data fyear.spi;
set spi.spi3;
array _y{*} yr2001-yr2011 ;
do i=1 to dim(_y);
_y{i}=0;
end;
_y{fyear-2000}=1;
drop i;
run;
Anyway, I have used your second code and Haikuo's code, and they are marvellous! However, I am still not sure about the PROC TRANSREG code.
Thanks
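For the record, the error log earlier in the thread points to the cause: the failing observation has FYEAR=2000, so the subscript fyear-2000 is 0, which falls outside the 11-element array yr2001-yr2011. A minimal sketch of a guarded version, assuming that years outside 2001 to 2011 should simply get all-zero dummies:

```sas
data fyear.spi;
   set spi.spi3;
   array _y{*} yr2001-yr2011;
   do i = 1 to dim(_y);
      _y{i} = 0;
   end;
   /* Only index the array when fyear maps to a valid subscript (1..11);
      a missing fyear also fails this test */
   if 2001 <= fyear <= 2011 then _y{fyear - 2000} = 1;
   drop i;
run;
```

If year 2000 should have its own dummy, another option is to widen the array to yr2000-yr2011 and use fyear - 1999 as the subscript.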