48229 is a moderate-size dataset; it shouldn't take forever to process. One problem might come from the combination of the left join and the ABS function. When there is no matching record on the right side of the join, i.e. when only one gvkey is present for a given sic2 and fyear, b.roa is missing, and computing ABS(a.roa-b.roa) writes a note to the log. You could try avoiding those notes with this change:

proc sql;
create table comp3 as
select
    a.gvkey, a.sic2, a.fyear, a.roa, a.r,
    case b.roa when . then . else abs(a.roa-b.roa) end as diff_roa,
    (a.r-b.r) as pamjone
from DAPAMJ_1 as a left join DAPAMJ_1 as b
    on a.sic2=b.sic2 and a.fyear=b.fyear and a.gvkey^=b.gvkey
group by a.gvkey, a.fyear, a.sic2
having (a.roa-b.roa)**2 = min((a.roa-b.roa)**2);
quit;

PG