Solved: Re: Proc Transpose from long to wide

SASLearner7 · Posted 06-12-2019 04:02 PM

I would like to transpose a long dataset to a fully multi-variate one while adding suffix of "_X" (where X is the trial number) according to the values contained in the trial variable the new variables it will create (ie. create variable trial_1, trial_2, trial_3, etc). The trial variable is always going to be in a sequence and will not skip and will always have the same number of trials per id. The original dataset will have a few hundred variables, so the resulting transposed dataset is going to be really "wide".

ID	trial	time	score	location
111	1	123	10	a
111	2	456	10	b
111	3	789	11	c
222	1	234	10	a
222	2	567	10	c
222	3	890	12	c
333	1	123	9	x
333	2	234	8	y
333	3	345	7	z

and would like the output to like:

ID	trial_1	trial_2	trial_3	time_1	time_2	time_3	score_1	score_2	score_3	location_1	location_2	location_3
111	1	2	3	123	456	789	10	10	11	a	b	c
222	1	2	3	234	567	890	10	10	12	a	c	c
333	1	2	3	123	234	345	9	8	7	x	y	z

Thank you.

Reeza · Posted 06-12-2019 04:10 PM

This is example 5 here - do three transposes and merge.

https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/

If you want a single step solution, there's a macro here that does what you need:

https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...

Or a third approach, a double transpose - first to a more long format and then to the wide format - not 100% sure this would work directly:

https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd

If you're still having trouble please post the code and the log.

@SASLearner7 wrote:

I would like to transpose a long dataset to a fully multi-variate one while adding suffix of "_X" (where X is the trial number) according to the values contained in the trial variable the new variables it will create (ie. create variable trial_1, trial_2, trial_3, etc). The trial variable is always going to be in a sequence and will not skip and will always have the same number of trials per id. The original dataset will have a few hundred variables, so the resulting transposed dataset is going to be really "wide".

ID trial time score location

111 1 123 10 a

111 2 456 10 b

111 3 789 11 c

222 1 234 10 a

222 2 567 10 c

222 3 890 12 c

333 1 123 9 x

333 2 234 8 y

333 3 345 7 z

and would like the output to like:

ID trial_1 trial_2 trial_3 time_1 time_2 time_3 score_1 score_2 score_3 location_1 location_2 location_3

111 1 2 3 123 456 789 10 10 11 a b c

222 1 2 3 234 567 890 10 10 12 a c c

333 1 2 3 123 234 345 9 8 7 x y z

Thank you.

View solution in original post

Reeza · Posted 06-12-2019 04:10 PM

This is example 5 here - do three transposes and merge.

https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/

If you want a single step solution, there's a macro here that does what you need:

https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...

Or a third approach, a double transpose - first to a more long format and then to the wide format - not 100% sure this would work directly:

https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd

If you're still having trouble please post the code and the log.

@SASLearner7 wrote:

I would like to transpose a long dataset to a fully multi-variate one while adding suffix of "_X" (where X is the trial number) according to the values contained in the trial variable the new variables it will create (ie. create variable trial_1, trial_2, trial_3, etc). The trial variable is always going to be in a sequence and will not skip and will always have the same number of trials per id. The original dataset will have a few hundred variables, so the resulting transposed dataset is going to be really "wide".

ID trial time score location

111 1 123 10 a

111 2 456 10 b

111 3 789 11 c

222 1 234 10 a

222 2 567 10 c

222 3 890 12 c

333 1 123 9 x

333 2 234 8 y

333 3 345 7 z

and would like the output to like:

ID trial_1 trial_2 trial_3 time_1 time_2 time_3 score_1 score_2 score_3 location_1 location_2 location_3

111 1 2 3 123 456 789 10 10 11 a b c

222 1 2 3 234 567 890 10 10 12 a c c

333 1 2 3 123 234 345 9 8 7 x y z

Thank you.

ballardw · Posted 06-12-2019 04:28 PM

@SASLearner7 wrote:

I would like to transpose a long dataset to a fully multi-variate one while adding suffix of "_X" (where X is the trial number) according to the values contained in the trial variable the new variables it will create (ie. create variable trial_1, trial_2, trial_3, etc). The trial variable is always going to be in a sequence and will not skip and will always have the same number of trials per id. The original dataset will have a few hundred variables, so the resulting transposed dataset is going to be really "wide".

I would question the need for this if you already have "a few hundred variables" as you are likely to end up with "several thousands of variables".

Which can make writing any sort of code to work more than a little challenging.

Can you describe what you will be doing with the transposed version of this data? Often there are other approaches.

For example, we have seen requests to make a wide table like this for a report when one of the report procedures like Proc Report or Tabulate will create a "wide" table and provide nice formatting at the same time.

SASLearner7 · Posted 06-13-2019 08:58 AM

Thank you for all your help. This was a data request for a customer and is what they have asked for to be able to do their analysis.

ballardw · Posted 06-13-2019 12:05 PM

@SASLearner7 wrote:

Thank you for all your help. This was a data request for a customer and is what they have asked for to be able to do their analysis.

'Nuff said, this follows the Golden Rule: He who has the gold makes the rules.

But I have had some pretty silly customers in the past...

Ksharp · Posted 06-13-2019 08:12 AM

1) proc summary

data have;
infile cards expandtabs;
input ID	trial	time	score	location $;
cards;
111	1	123	10	a
111	2	456	10	b
111	3	789	11	c
222	1	234	10	a
222	2	567	10	c
222	3	890	12	c
333	1	123	9	x
333	2	234	8	y
333	3	345	7	z
;
run;

proc sql noprint;
select max(n) into : n
 from (select count(*) as n from have group by id);
quit;
proc summary data=have ;
by id;
output out=want idgroup(out[&n] (trial	time	score	location)=);
run;

Ksharp · Posted 06-13-2019 08:12 AM

"Transpose Datset by MERGE"
http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

SAS Innovate 2025: Call for Content

Classroom Training Available!