BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lady8506
Quartz | Level 8

I am fairly new to using PROC TRANSPOSE and DO loops and am having some problems getting them to do exactly what I want them to do. Neither feature is working quite the way I need it to. 

 

I have a dataset that contains repeat measures (many variables are repeated(i.e. SurgProcedure1 SurgProcedure2 SurgProcedure3), and sometimes observations, too, like repeat patients who are in the data set more than once due to two different cancers). This dataset is in wide format and I'd like to transpose it into long format.

 

When I use PROC TRANSPOSE such as:

 

PROC TRANSPOSE DATA = have OUT = want; 

BY Fname Lname;

VAR SurgProcedure1 SurgProcedure2 SurgProcedure3;

RUN;


PROC TRANSPOSE deletes every single non-transposed variable in the dataset. That's not what I want. I want it to keep all my other variables and just transpose the ones I request to be transposed.

 

So, I switched to trying a Do loop.

 

DATA want; SET have;

array Side[3] Side1-Side3;
DO I=1 to 3;
Sides=Side(i);
output;
end;
RUN;

 

But the DO loop will create duplicate observations of ALL patients regardless of whether or not they actually had more than one side of their body presenting symptoms (Side1 may contain a value, but Side2 is empty, yet the patient still gets duplicated). That's not what I want, either. I don't want it to duplicate a patient if an observation for a variable is missing.  

 

If anyone could help me fix my problem, I'd appreciate it. Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Almost there on the do loop:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   DO I=1 to 3;
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;

Proc transpose has the COPY statement to bring in the values of other variables but may not always do what you want depending on your actual data have and want.

 

View solution in original post

7 REPLIES 7
ballardw
Super User

Almost there on the do loop:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   DO I=1 to 3;
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;

Proc transpose has the COPY statement to bring in the values of other variables but may not always do what you want depending on your actual data have and want.

 

lady8506
Quartz | Level 8

Thank you for the suggestion.

 

I got it to work with a modifaction of what you wrote: IF sides ne "." then output;

 

However, an unintended mistake happened. SAS deleted patients who did not have observations for any of the Side(i) variables. =( Definitely didn't want them deleted, just not uneccessarily duplicated. Trying to figure out how to fix this now.

ballardw
Super User

Or something like:

DATA want; 
   SET have;
   array Side[3] Side1-Side3;
   /* output if all Side variables missing*/
   if max( of side(*)) = . then output;
   else DO I=1 to 3;
      /* only output when Sides not missing*/
      Sides=Side(i);
      if not missing(sides) then output;
   end;
RUN;
lady8506
Quartz | Level 8

This worked perfect. Thanks so much. Sorry I didn't get back to you yesterday. I'd already gone home from work.

lady8506
Quartz | Level 8

Do you know if there is a less tedious way to do the same procedure, maybe with a DO OVER loop, on over 50 variables?

 

I was practicing and this is as far as I got:

 

DATA loops; SET Felibertiloop;
array aloops Mets1 Mets2 Mets3 Location1 Location2 ClinSize1 ClinSize2 RadSize1 RadSize2 RadSize3;    /*This is just a small sampling of more than 50 vars and for practice only. Want to make sure it works first.*/
IF max( of aloops(*)) = . then output;
ELSE DO OVER aloops;

 /*Now how do I make sure that Mets1-3 gets put into a variable called Mets, and Location1-2 get put in a variable called Location...and so on and so forth?*/

 

 

 

ballardw
Super User

@lady8506 wrote:

Do you know if there is a less tedious way to do the same procedure, maybe with a DO OVER loop, on over 50 variables?

 

 

 


You are probably at the point where you will have to provide example input and what what the output looks like.

If the number of the groups of variables as indicated with

Mets1 Mets2 Mets3 Location1 Location2 ClinSize1 ClinSize2 RadSize1 RadSize2 RadSize3

where Mets and Radsize have 3 elements and Location and ClinSize have 2 then it gets pretty tricky to determine what you may need without a concrete example.

The general approach is each related variable that should end up in the same output(transposed) variable should be in a separate array.

Then when all of the first elements of a array are set then output. BUT that is where the different numbers of elements cause a big complication as to what the output should look like.

 

Some dummy code to transpose the Mets and Radsize:

array m Mets1-Mets3;

array r  Radsize1-Radsize3;

do i = 1 to dim(m);

   Mets = m[i];

   Radsize= r[i];

   /* and this would be where the output goes*/

end;

BUT without knowing what to do with Location and Clinsize since there are only two shown they complicate things. Should the first Location go with Mets1, Mets2 or Mets3?

lady8506
Quartz | Level 8

 

That's a good question and one I asked myself. In the dataset I was given, it is unfortunately not specified which goes with which. So that means if a patient has multiple surgical procedures listed and multiple tumor sizes listed, those procedures could have been done on either tumor at the same time or different times.

 

I think, at least for the purposes of anaylsis, it doesn't matter. I will be doing things like PROC FREQ and CHISQ and PROC LOGISTIC.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1977 views
  • 1 like
  • 2 in conversation