BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lady8506
Quartz | Level 8

I am fairly new to using PROC TRANSPOSE and DO loops and am having some problems getting them to do exactly what I want them to do. Neither feature is working quite the way I need it to. 

 

I have a dataset that contains repeat measures (many variables are repeated(i.e. SurgProcedure1 SurgProcedure2 SurgProcedure3), and sometimes observations, too, like repeat patients who are in the data set more than once due to two different cancers). This dataset is in wide format and I'd like to transpose it into long format.

 

When I use PROC TRANSPOSE such as:

 

PROC TRANSPOSE DATA = have OUT = want; 

BY Fname Lname;

VAR SurgProcedure1 SurgProcedure2 SurgProcedure3;

RUN;


PROC TRANSPOSE deletes every single non-transposed variable in the dataset. That's not what I want. I want it to keep all my other variables and just transpose the ones I request to be transposed.

 

So, I switched to trying a Do loop.

 

DATA want; SET have;

array Side[3] Side1-Side3;
DO I=1 to 3;
Sides=Side(i);
output;
end;
RUN;

 

But the DO loop will create duplicate observations of ALL patients regardless of whether or not they actually had more than one side of their body presenting symptoms (Side1 may contain a value, but Side2 is empty, yet the patient still gets duplicated). That's not what I want, either. I don't want it to duplicate a patient if an observation for a variable is missing.  

 

If anyone could help me fix my problem, I'd appreciate it. Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Almost there on the do loop:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   DO I=1 to 3;
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;

Proc transpose has the COPY statement to bring in the values of other variables but may not always do what you want depending on your actual data have and want.

 

View solution in original post

7 REPLIES 7
ballardw
Super User

Almost there on the do loop:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   DO I=1 to 3;
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;

Proc transpose has the COPY statement to bring in the values of other variables but may not always do what you want depending on your actual data have and want.

 

lady8506
Quartz | Level 8

Thank you for the suggestion.

 

I got it to work with a modifaction of what you wrote: IF sides ne "." then output;

 

However, an unintended mistake happened. SAS deleted patients who did not have observations for any of the Side(i) variables. =( Definitely didn't want them deleted, just not uneccessarily duplicated. Trying to figure out how to fix this now.

ballardw
Super User

Or something like:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   /* output if all Side variables missing*/
   if max( of side(*)) = . then output;
   else DO I=1 to 3;
      /* only output when Sides not missing*/
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;
lady8506
Quartz | Level 8

This worked perfect. Thanks so much. Sorry I didn't get back to you yesterday. I'd already gone home from work.

lady8506
Quartz | Level 8

Do you know if there is a less tedious way to do the same procedure, maybe with a DO OVER loop, on over 50 variables?

 

I was practicing and this is as far as I got:

 

DATA loops; SET Felibertiloop;
array aloops Mets1 Mets2 Mets3 Location1 Location2 ClinSize1 ClinSize2 RadSize1 RadSize2 RadSize3;    /*This is just a small sampling of more than 50 vars and for practice only. Want to make sure it works first.*/
IF max( of aloops(*)) = . then output;
ELSE DO OVER aloops;

 /*Now how do I make sure that Mets1-3 gets put into a variable called Mets, and Location1-2 get put in a variable called Location...and so on and so forth?*/

 

 

 

ballardw
Super User

@lady8506 wrote:

Do you know if there is a less tedious way to do the same procedure, maybe with a DO OVER loop, on over 50 variables?

 

 

 


You are probably at the point where you will have to provide example input and what what the output looks like.

If the number of the groups of variables as indicated with

Mets1 Mets2 Mets3 Location1 Location2 ClinSize1 ClinSize2 RadSize1 RadSize2 RadSize3

where Mets and Radsize have 3 elements and Location and ClinSize have 2 then it gets pretty tricky to determine what you may need without a concrete example.

The general approach is each related variable that should end up in the same output(transposed) variable should be in a separate array.

Then when all of the first elements of a array are set then output. BUT that is where the different numbers of elements cause a big complication as to what the output should look like.

 

Some dummy code to transpose the Mets and Radsize:

array m Mets1-Mets3;

array r  Radsize1-Radsize3;

do i = 1 to dim(m);

   Mets = m[i];

   Radsize= r[i];

   /* and this would be where the output goes*/

end;

BUT without knowing what to do with Location and Clinsize since there are only two shown they complicate things. Should the first Location go with Mets1, Mets2 or Mets3?

lady8506
Quartz | Level 8

 

That's a good question and one I asked myself. In the dataset I was given, it is unfortunately not specified which goes with which. So that means if a patient has multiple surgical procedures listed and multiple tumor sizes listed, those procedures could have been done on either tumor at the same time or different times.

 

I think, at least for the purposes of anaylsis, it doesn't matter. I will be doing things like PROC FREQ and CHISQ and PROC LOGISTIC.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2559 views
  • 1 like
  • 2 in conversation