Hi all,
I am having issues with an inner join by two variables, FIPS and year. When I join by year in addition to FIPS, I get no output (zero rows). The year variable in both datasets is type num, length 8, but the year in one of them has a year4. format. I have checked the data and both contain years in my desired range (1997-2014) so the issue is not that there are no matches. Most of the year variable entries in one dataset (soccap3) are the result of using proc expand to extrapolate a variable (sk) to missing years, and I suspect this may be part of the issue because the output from proc expand assigns the same VAR number to multiple years. I have attached a screenshot of the output after proc expand for reference and the relevant code is below. Some things I have tried: deleting the year4. format on the soccap3 year, adding a year4. format to the funda_4 year, deleting b.year from the merge statement. None of these have worked, and I suspect, again, that this is because there is a problem with the year variable from using proc expand. Any help would be appreciated! Thank you.
/* create social capital values for missing years*/
PROC EXPAND DATA=soccap2 out=soccap3 from=year;
by fips;
id year;
RUN;
PROC SORT DATA=soccap3;
BY fips year;
RUN;
/*intermediate steps of importing funda data, adding variables omitted*/
/* merge fundamentals and social capital data*/
PROC SQL;
create table funda_final as
SELECT a.*, b.sk, b.year from funda_4 as a, soccap3 as b
WHERE a.fips=b.fips & a.year=b.year;
quit;
