Hi there
I have quarterly data on firms in each state in a country. The data consists of individual files for each state and each quarter from 1980 to 2005. There are 6 variables of interest in each file - STATENAME, FIRMID, YR_QTR, EMPL1, EMPL2, EMPL3 and several 100 variables that I am not interested in. I want to sum the employment in each quarter (EMPL1 EMPL2 EMPL3) and find the first date of non-zero employment for each firm. I am using PROC APPEND for now to create a combined dataset for each state and then identifying the first non-zero employment for each firm.
Is there a more efficient way to program this without creating a large dataset? Any help with the code would be much appreciated.c
Thanks
Dana
Yes! Summarise each table, and *then* append the summaries where employment is non-zero.
Something like
%macro loop;
%do year=1980 %to 2005;
%do qtr=1 %to 4;
proc summary data=DATA_&year.Q&qtr. nway;
id YR_QTR;
class STATENAME FIRMID;
var EMPL1 EMPL2 EMPL3 ;
output out=SUM_&year.Q&qtr. sum=;
run;
%end;
%end;
data WANT;
set
%do year=1980 %to 2005;
%do qtr=1 %to 4;
&SUM_&year.Q&qtr.
%end;
%end;
;
where EMPL1 & EMPL2 & EMPL3;
by YR_QTR STATENAME FIRMID;
... more logic
run;
%mend;
%loop;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.