05-05-2014 11:48 AM
I am trying to join the following 4 datasets and form a new dataset reit_etf_panel
reit_etf_framel is a balanced frame, which is 24 firms (i.e. PERMNO) * 336months=8064 observations.
reit_etf is also about the same 24 firms but some of them do not have observations during the whole 336-month period, so it is not a balanced dataset.
ff and liq is only the monthly data for 324 months.
What I want is to let reit_etf, liq, and ff join reit_etf_framel, becausereit_etf_framel is a balanced frame and I want to generate a whole balanced panel.
the unique linking variable should be PERMNOYm (when joining reit_etf and reit_etf_framel) or Ym (when joining liq and ff to the other two).
The code is like below:
proc sql noprint;
create table reit_etf_panel as
DATEliq, PS_LEVEL, PS_INNOV, PS_VWF,
dateff, mktrf, smb, hml, rf, umd,
datereit, SHRCD, SICCD, TICKER, COMNAM, SHRCLS, PERMCO, CUSIP, PRC, VOL, RET, SHROUT, ALTPRC, SPREAD, RETX, vwretd, vwretx, ewretd, ewretx, sprtrn
reit_etf_framel natural left join
liq natural left join
ff left join
reit_etf on reit_etf_framel.PERMNOYm=reit_etf.PERMNOYm
order by PERMNOYm;
Everything is ok but with one problem that, as you can see, in reit_etf_panel, from line 325 to 336 there are missing values for those variables from both ff and liq. Actually, the problem exists for the last year (i.e. from 201301 to 201312) for each PERMNO.
Can anybody help me to check what is the reason and how to deal with it? Thanks.