Extracting subset of observations from merging large datasets

dshills — Thu, 17 Nov 2016 19:50:18 GMT

Hello there:

I have FILE A, which consists of 500,000 firms in a country. For each firm, I have FIRMID and STATENAME. I want to get annual employment data (4th quarter) for these firms for 2000-2010.

The employment data is available as individual files for each state and each quarter from 2000 to 2010. There are several million firms overall across all the states. What is the most efficient way of extracting the 4th quarter employment data each year from these files for just the 500,000 firms in FILE A?

Any help with the code would be much appreciated.

Thanks

Dana

Re: Extracting subset of observations from merging large datasets

Shmuel — Thu, 17 Nov 2016 21:13:51 GMT

Efficiency depends on the input (which you described), on the target (what you want to do, analyze, or report), and on the resources you have (such as available disk space and memory).

Another issue: is your data already in SAS tables, or is it external data (CSV, text, or a database, and if a database, what kind)?

All of these affect the programming code.

Even with the information above, there may be several ways to do the work efficiently.

If possible, describe what variables are in each kind of file.

Meanwhile, I understand that you have two kinds of data:

1) a table of firms in each state (variables: FIRMID, STATENAME)

2) employment data in (11 years X 4 quarters X number of states) files.

What are the names of those files? Are they in a common format?
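
Until those details are known, here is only a rough sketch of one common approach: load the FIRMID values from FILE A into a hash object and keep only the matching rows while reading the 4th quarter files. Everything specific in it is an assumption, not something stated in this thread: that the data are already SAS datasets, that FILE A is WORK.FILEA, that the quarterly files sit in a libref EMP with names like CA_2000Q4, and that each one holds FIRMID plus an employment variable.

```sas
/* Sketch only. Hypothetical names: WORK.FILEA, libref EMP,       */
/* datasets such as EMP.CA_2000Q4. Replace with the real ones.    */
data work.q4_employment;
   /* Define FIRMID in the PDV without reading any rows */
   if 0 then set work.filea(keep=firmid);

   /* Load the 500,000 firm IDs into memory once */
   if _n_ = 1 then do;
      declare hash firms(dataset: 'work.filea(keep=firmid)');
      firms.defineKey('firmid');
      firms.defineDone();
   end;

   /* List every 4th quarter file here; INDSNAME records the source */
   set emp.ca_2000q4
       emp.ca_2001q4
       indsname=source;

   /* Keep only firms that appear in FILE A */
   if firms.check() = 0;

   /* Save the source dataset name so year and state can be derived */
   length source_file $41;
   source_file = source;
run;
```

Each employment file is read once and no sorting is needed, which is why a hash lookup is often a reasonable choice when the lookup table fits in memory. If FIRMID is unique only within a state, STATENAME would have to be part of the hash key as well, provided the employment files carry it.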
