Extracting subset of observations from merging large datasets

Occasional Contributor
Posts: 10

Hello there:

I have FILE A, which contains 500,000 firms in one country. For each firm, I have FIRMID and STATENAME. I want to get annual employment data (4th quarter) for these firms for 2000-2010.

The employment data is available as individual files for each state and each quarter from 2000 to 2010. There are several million firms overall across all the states. What is the most efficient way of extracting the 4th quarter employment data each year from these files for just the 500,000 firms in FILE A?

Any help with the code would be much appreciated.

Thanks

Dana

Trusted Advisor
Posts: 1,564

Re: Extracting subset of observations from merging large datasets

Efficiency depends on the input (which you described), on the target (what you want to do, analyze, or report), and on the resources you have (such as available disk space and memory).

Another issue: is your data already in SAS tables, or is it external (CSV, text, or a database, and if a database, what kind)?

All of these affect the code.

Even with the information above, there may be several ways to do the work efficiently.

If possible, describe which variables are in each kind of file.

Meanwhile, I understand that you have two kinds of data:

1) a table of firms in each state (variables: FIRMID, STATENAME)

2) employment data in (11 years x 4 quarters x number of states) files.

    What are the names of those files? Are they in a common format?
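
Until those details are known, here is only a rough sketch of one common pattern for this kind of job: read the firm list once into an in-memory lookup, then stream each 4th-quarter file and keep just the matching rows. It is written in Python purely for illustration and assumes CSV-style input. The EMPLOYMENT column name, the sample values, and the helper names load_firm_ids and extract_q4 are all hypothetical, since the real file layout has not been described yet. In SAS, a DATA step hash-object lookup would likely play the same role as the set below.

```python
import csv
import io


def load_firm_ids(file_a):
    """Read FILE A once and keep its FIRMIDs in a set for fast membership tests."""
    return {row["FIRMID"] for row in csv.DictReader(file_a)}


def extract_q4(employment_file, firm_ids, year):
    """Yield rows from one 4th-quarter file whose FIRMID appears in FILE A."""
    for row in csv.DictReader(employment_file):
        if row["FIRMID"] in firm_ids:
            yield {
                "FIRMID": row["FIRMID"],
                "YEAR": year,
                "EMPLOYMENT": row["EMPLOYMENT"],  # hypothetical column name
            }


# Tiny in-memory stand-ins for the real files (assumed layout, made-up values).
file_a = io.StringIO("FIRMID,STATENAME\n101,Ohio\n102,Texas\n")
ohio_2000_q4 = io.StringIO("FIRMID,EMPLOYMENT\n101,25\n999,40\n")

firm_ids = load_firm_ids(file_a)
subset = list(extract_q4(ohio_2000_q4, firm_ids, 2000))
print(subset)
```

With this approach only the 4th-quarter files need to be opened (11 per state instead of 44), and each one is read a single time without sorting or a full merge. Whether the firm list fits comfortably in memory depends on the resources mentioned above.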
