It looks to me like you are merging on industry alone, so I'm guessing you are getting a Cartesian product of like-industry companies between the two files. Here are the assumptions I am using to figure out what you want:

1) Both files were created from the same source file, but the 5000 are the ones with suspect = 1 (or whatever the flag is) and totalcontrol is all the rest?
2) You don't want to match company to company; you want to find, for each company in the suspect pool, a similar company from the non-suspect pool?

If so, you will probably get multiple matches per suspect and will need another step to pick the one you want for each suspect based on some criterion.

I would try something like the following (untested):

proc sql;
  create table potentialmatch1 as
  /* alias the control columns so they do not clash with a.id and a.roa */
  select a.year, a.sic, a.id, a.roa,
         b.id as control_id, b.roa as control_roa
  from sample as a, totalcontrol as b
  where a.sic = b.sic
    and a.year = b.year
    and a.roa between (.7*b.roa) and (1.3*b.roa);
quit;

See if that helps.

EJ
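For the follow-up step of picking one control per suspect, here is a minimal sketch (also untested). It assumes the candidate table is potentialmatch1 with one row per suspect/control pair, and that the control columns are stored under the names control_id and control_roa, which are my own labels. The criterion, smallest absolute ROA difference, is only one possible choice, and the dataset names ranked and bestmatch are made up for illustration.

```sas
/* Compute the distance between each suspect and candidate control.
   control_roa is an assumed column name for the control firm's ROA. */
data ranked;
  set potentialmatch1;
  roa_diff = abs(roa - control_roa);
run;

/* Order the candidates so the closest control comes first
   within each suspect firm-year. */
proc sort data=ranked;
  by id year roa_diff;
run;

/* Keep only the first (closest) candidate per suspect firm-year. */
data bestmatch;
  set ranked;
  by id year;
  if first.year;
run;
```

Note that this matches with replacement: the same control can be chosen for more than one suspect, and ties on roa_diff are broken by whatever order the sort leaves them in. If you need each control used only once, that takes an additional step.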