I'm trying to do a reshape to long a dataset that preserves info about firm ids and establishment ids. A person (id) can have multiple jobs at a time. I can transpose everything to long, but what I want looks more like this:
id_1 firmid1 estabid1 q1_value1 firmid2 estabid2 q1_value2 firmid3 estabid3 q1_value3
id_1 firmid1 estabid1 q2_value1 firmid2 estabid2 q2_value2 firmid3 estabid3 q2_value3
id_1 firmid1 estabid1 q3_value1 firmid2 estabid2 q3_value2 firmid3 estabid3 q3_value3
id_1 firmid1 estabid1 q4_value1 firmid2 estabid2 q4_value2 firmid3 estabid3 q4_value3
id_2 firmid1 estabid1 q1_value1 . . . . . .
id_2 firmid1 estabid1 q2_value1 . . . . . .
id_2 firmid1 estabid1 q3_value1 . . . . . .
id_2 firmid1 estabid1 q4_value1 . . . . . .
(In the example above, id_1 has three jobs, id_2 had one, etc.)
This dataset is HUGE... think q1-q124 and millions of ids, so trying to do it as efficiently as possible.
Is there a way to loop using first/last statements to make the code below work? If there is a better way to do this, I'm open to any ideas.
data have;
input id $10. firmid $5. estabid $4. q1 q2 q3 q4;
cards;
123456789 1001 000 100 125 125 150
123456789 1002 000 25 25 . .
123456789 1003 000 5 . . .
223456789 1002 000 25 25 25 25
323456789 1002 001 50 50 . .
423456789 1004 000 100 125 125 100
423456789 1005 001 5 . . .
523456789 1006 001 100 . 100 .
523456789 1006 002 . 100 . 100
;
run;
proc transpose data=have out=want2 prefix=earn name=qtr;
by id firmid estabid;
run;
data want2(drop=q q2);
set want2(rename=(qtr=q));
q2 = compress(q, ' ', 'A');
qtr=input(q2,8.);
run;
proc sort data=want2;
by id;
run;
data want3 (keep=id qtr firmid1 estabid1 firmid2 estabid2 firmid3 estabid3 earn_job1 earn_job2 earn_job3);
set want2;
by id;
length firmid1-firmid3 $5. estabid1-estabid3 $4. earn_job1-earn_job3 8;
retain firmid1 estabid1 firmid2 estabid2 firmid3 estabid3 earn_job1 earn_job2 earn_job3;
array afirmid{1:3} firmid1-firmid3;
array aestabid{1:3} estabid1-estabid3;
array aearn{1:3} earn_job1-earn_job3;
if first.id then do;
do i=1 to 3;
afirmid{i}=" ";
aestabid{i}=" ";
aearn{i}=.;
end;
end;
/*I am stuck here! Something like:
if qtr ge 1 then do;
afirmid{qtr}=firmid;
aestabid{qtr}=estabid;
aearn{qtr}=earn1;
end;
output;
This obviously doesn't work */
run;
Thanks for taking a look!
The tutorials below cover all that you need, in this case, you may want to explore the last option - the double transpose for an extra wide data set or you may also want to consider the custom macro, A Better Way to Flip documented here: https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...
Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/
Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/
And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.