Good days to all, I recently used SQL to generate possible pairs of geograhic zipcodes between one location to another. I have about 13,000 firms per year for 8 years. Therefore, number of firm-years is estimated to be 104,000. Then, I have to calculate the distance between every firm-year to all of the mutual fund companies (which is about 17,000 firms for each year). By the end, I have to calculate mutual fund density, which is defined as log (1+mutual fund companies with distance within 60 miles radius from the firm). The submitted program was run for 7 hours but I failed to generate output due to insufficient spaces. My notebook has free space about 365 GB but yet insufficient for storing the output. I am looking forward any response which can solve the program. My concerns: 1. How to submit the program in batches? 2. How can I estimate the processing time. I couldn't find any indicator on the estimated time and storage needed for running programs. 3. Any recommendations and suggestions...etc. The following is my program (with libname "geog"): proc sql; create table geog.fundcom as select a.fyear, tic, crsp_fundn, zipcitydistance(zipcode, zip)as mutualdist from geog.comp a , geog.Mutualcom b where a.fyear=b.year group by a.fyear,tic; having not missing(calculated mutualdist); quit; proc sql; create table geog.mutualnum as select fyear, tic, sum(0<=mutualdist<=60) as mutualnum from geog.fundcom group by tic, fyear; quit; run; data geog.mutualdensity; set geog.mutualnum; mutual_density = log (1+mutualnum); run; Thank you. Mspak
... View more