Why is proc transpose (wide to long) creating extra columns?

miriamhartig · Posted 05-08-2019 02:49 PM

I am trying to perform a transpose from wide to long. There are 27 variables that I am transposing. The data looks essentially like this:

id var1 var2 var3 ... var27

1 a b c

In the past when I have done this transpose, the result looks like this:

id _name_ COL1

1 var1 a

1 var2 b

1 var3 c

...

1 var27

However, today the result I am getting looks like this:

id _name_ COL1 COL2 COL3 ... COL25

1 var1 a a a

1 var2 b b b

1 var3 c c c

...

1 var27

This is the code I used:

proc transpose data = data out = data_long ;
by id;
var var1 - var27;
run;

What am I doing wrong here? Why and I getting so many columns?

My variables are all consecutive in the data set and all are the same format- $8.

Using SAS 9.4 on Windows

Tom · Posted 05-08-2019 02:54 PM

The BY variable values are not unique. The number of variables ("columns") created is equal to the largest number of observations ("rows") for any single BY group.

miriamhartig · Posted 05-08-2019 03:02 PM

That makes sense Tom, thank you. So removing duplicate IDs should solve this then right?

ballardw · Posted 05-08-2019 03:59 PM

@miriamhartig wrote:

That makes sense Tom, thank you. So removing duplicate IDs should solve this then right?

Yes, no and maybe.

The issue may then become which of the value set you want for the original combination of values for var1 - varn. If all of the Id have the same var1 - varn then selection of which observation (unique id value) to keep and transform is easy.

But what if, for some reason we can't tell at this point you want the var1 value from the first record with Id=1 and the var2 value from the 4th record with Id=1 (and so on for other vars)? Then how to get a single ID for transpose may be a bit tricky.

And if you really need all the values perhaps a manual transpose is in order:

data example;
   input id var1 $ var2 $ var3$;
datalines;
1 a b c
2 b c d
3 c d e
1 d e c
2 p d q
;
run;

data manualtrans (keep= id _name_ value);
   set example;
   array v var1-var3;
   length _name_ $10.;
   do i = 1 to dim(v);
      _name_ = vname(v[i]);
      value = v[i];
      output;
   end;
run;

Tom · Posted 05-08-2019 05:53 PM

@miriamhartig wrote:

That makes sense Tom, thank you. So removing duplicate IDs should solve this then right?

Or adding another variable (or more) to the BY statement so that each row is identified uniquely.

data fix; 
  set have ;
  by idvar ;
  if first.idvar then row_num=0;
  row_num+1;
run;
proc transpose data=fix .... ;
  by idvar row_num;
  ....
run;

DrAbhijeetSafai · Posted 12-12-2023 11:53 PM

Yes, it has at least helped me.

Thank you.

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

DrAbhijeetSafai · Posted 12-12-2023 07:40 AM

Thank you. Still my problem is not solved on which I am working, but this answer has given me a good hint. Thanks once again.

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

DrAbhijeetSafai · Posted 12-12-2023 11:53 PM

Now I understood it. Yes, you rightly pointed out that by variable is not unique. It helped me to solve my issue. Thanks once again.

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

DrAbhijeetSafai · Posted 02-22-2024 11:57 PM

I would like to post it again that the solution given by @Tom is wonderful. He said that the records are not unique and that gave me hint and I was able to find out a variable which was not previously included in the by statement of proc transpose and this problem was solved. I am posting here again because again I faced the same problem and it got solved by addition of a variable in by statement which was not added previously. So the solution to this kind of problem is mostly - to find out the variable which is not added in by statement of proc transpose and to add it.

Thank you.

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

Re: Why is proc transpose (wide to long) creating extra columns?

SAS Innovate 2025: Save the Date