Yes, of course. We are trying to create a county profile of Kern and compare it to 6 other benchmark counties from 1990-2012. We have the NETS database, which includes over 200 variables on every business in California between 1990 and 2012, giving us over 6 million observations. Originally I just created county-specific datasets so that the information was more manageable and SAS took less time to run my commands. The issue with separate datasets is that every single command then needs to be run 7 times, once in each dataset. It also needs to be run for each year, because the data is in wide format, with one variable per year: NAICS90, NAICS91, ..., EMP90, EMP91, ..., FIPS90, FIPS91, ..., SALES90, SALES91, and so on.

Part of what we want to do is look at which companies, and how many, move in and out of different regions each year, by NAICS code. Then we want to see where companies are moving to and from for each of the counties. The other issue is that not all businesses started in 1990, so we also want to keep track of new businesses that did not move to a county but started there. We need to be able to distinguish between the two throughout all 23 years of the data.

We also want to be able to sum employment for every year by NAICS code and county, so we can see employment shifts in 4-digit NAICS industries as well as total employment shifts at the county level. The same issues come up with everything else we want to look at, from PayDex score to sales.

I don't mind having to run a program 7 times, once for each county dataset; the problem is running it for all 7 counties and all 23 years. With the code below, for example, I am going to have to run it for each year, since I need to PROC SORT the data by that year's NAICS variable. I would need to do that for NAICS90, NAICS91, NAICS92, ..., since new companies are added and companies sometimes change NAICS codes from year to year.
proc sort data=kern.county;
  by region naics90;
run;

/* Total 1990 employment for each region and 1990 NAICS code */
proc means data=kern.county;
  by region naics90;
  var emp90;
  output out=summary sum(emp90)=TotEmp;
run;

At least the code above wouldn't have to be run for each region, so I would only need to run it once per year. But I need to be able to distinguish movements and FIPS code changes, and as you said, the region variable doesn't give me that variation.

Does all of that information make sense? Sometimes it is difficult to describe data with words. I would attempt to send you the main file, but it is too big. Thank you so much for your help.