BookmarkSubscribeRSS Feed
wisej
Calcite | Level 5

Hi,

 

I have a dataset with 169 variables and multiple observations per subject. The dataset is in the long format, but I need to transpose it to the wide format. Each subject has a different number of observations. I am familiar with arrays and do loops, and proc transpose, but I have never worked with a dataset this large. Any suggestions?

 

I am unable to post a snapshot of the data as it is medical data and would violate privacy rules. 

 

Thanks so much!

3 REPLIES 3
Tom
Super User Tom
Super User

Make a simple example with just one or two subjects and a few variables to reflect what you have.

Are the variables all numeric (or all character) or mixed?

If you are going from tall to wide do you have a variable that can be used to generate the name of the variable to be created in the wide dataset?

Sometimes people actually have a intermediate structure that first needs to be made even taller and then can be converted to wide.  Basically you run two PROC TRANSPOSE steps with perhaps some steps in the middle to help generate variable names or set order of variable creation.

Reeza
Super User

Someone else posted a very similar question recently, maybe you can leverage their sample data?

 

In general, you may need to do several transposes/merges but it depends really on your data. The posts below cover transposing fairly thoroughly, the fastest programming option is likely splitting them into multiple data sets and merging, while the fastest from a processing standpoint will likely be a data step transpose in a single step. If you can pre-calculate the size of the arrays needed this is definitely the way to go. 

 

Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/

https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/

Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/

https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/

And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd

Ksharp
Super User
data Have;
  input Month $ Policy_Number $ IRDA_Amt OpsRefNo $10.;
  datalines;
Nov-2020 XYZ 1000 20201005-M
Nov-2020 XYZ 500 20201015-A
Nov-2020 XYZ 100 20201222-A
;
run;
data have;
 set have;
 by Policy_Number;
 if first.Policy_Number  then n=0;
 n+1;
run;
proc sql noprint nowarn;
select distinct catt('have(where=(n=',n,' ) rename=(IRDA_Amt=IRDA_Amt_',n,' OpsRefNo=OpsRefNo_',n,'))') 
 into : merge separated by ' '
  from have
   order by n;
quit;
data want;
 merge &merge ;
 by Policy_Number;
 drop Month n;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 578 views
  • 0 likes
  • 4 in conversation