I'm trying to do a reshape to long a dataset that preserves info about firm ids and establishment ids. A person (id) can have multiple jobs at a time. I can transpose everything to long, but what I want looks more like this:
id_1 firmid1 estabid1 q1_value1 firmid2 estabid2 q1_value2 firmid3 estabid3 q1_value3
id_1 firmid1 estabid1 q2_value1 firmid2 estabid2 q2_value2 firmid3 estabid3 q2_value3
id_1 firmid1 estabid1 q3_value1 firmid2 estabid2 q3_value2 firmid3 estabid3 q3_value3
id_1 firmid1 estabid1 q4_value1 firmid2 estabid2 q4_value2 firmid3 estabid3 q4_value3
id_2 firmid1 estabid1 q1_value1 . . . . . .
id_2 firmid1 estabid1 q2_value1 . . . . . .
id_2 firmid1 estabid1 q3_value1 . . . . . .
id_2 firmid1 estabid1 q4_value1 . . . . . .
(In the example above, id_1 has three jobs, id_2 had one, etc.)
This dataset is HUGE... think q1-q124 and millions of ids, so trying to do it as efficiently as possible.
Is there a way to loop using first/last statements to make the code below work? If there is a better way to do this, I'm open to any ideas.
data have;
input id $10. firmid $5. estabid $4. q1 q2 q3 q4;
cards;
123456789 1001 000 100 125 125 150
123456789 1002 000 25 25 . .
123456789 1003 000 5 . . .
223456789 1002 000 25 25 25 25
323456789 1002 001 50 50 . .
423456789 1004 000 100 125 125 100
423456789 1005 001 5 . . .
523456789 1006 001 100 . 100 .
523456789 1006 002 . 100 . 100
;
run;
proc transpose data=have out=want2 prefix=earn name=qtr;
by id firmid estabid;
run;
data want2(drop=q q2);
set want2(rename=(qtr=q));
q2 = compress(q, ' ', 'A');
qtr=input(q2,8.);
run;
proc sort data=want2;
by id;
run;
data want3 (keep=id qtr firmid1 estabid1 firmid2 estabid2 firmid3 estabid3 earn_job1 earn_job2 earn_job3);
set want2;
by id;
length firmid1-firmid3 $5. estabid1-estabid3 $4. earn_job1-earn_job3 8;
retain firmid1 estabid1 firmid2 estabid2 firmid3 estabid3 earn_job1 earn_job2 earn_job3;
array afirmid{1:3} firmid1-firmid3;
array aestabid{1:3} estabid1-estabid3;
array aearn{1:3} earn_job1-earn_job3;
if first.id then do;
do i=1 to 3;
afirmid{i}=" ";
aestabid{i}=" ";
aearn{i}=.;
end;
end;
/*I am stuck here! Something like:
if qtr ge 1 then do;
afirmid{qtr}=firmid;
aestabid{qtr}=estabid;
aearn{qtr}=earn1;
end;
output;
This obviously doesn't work */
run;
Thanks for taking a look!
The tutorials below cover all that you need, in this case, you may want to explore the last option - the double transpose for an extra wide data set or you may also want to consider the custom macro, A Better Way to Flip documented here: https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...
Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/
Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/
And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.