BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
epigrad123
Calcite | Level 5

This question might be fairly straight forward but I am new to SAS so I do not know how to execute this.

 

I have a dataset based on encounter level data for clinic visits. At each encounter, a person had various information collected. There are people with multiple encounters in the dataset and I would like to transpose the long encouter-based data to wide individual-based data (each individual has an spefific ID). However, I have about 50 variables per encounter (again, a lot of information was collected). From my research, proc transpose seems to only handle one variable at a time? Also, I would like a way to label the transposed information by visit number, based off visit date. Thus I would want to see question1_visit1 rather than question1_11-24-2010 and also have a way to see the date of the visit (i.e. have a date_visit1 column with the date). I imagine I will have to do this all in several steps, since the visit number wont be known until the data is transposed. If anyone can atleast point me in the right direction I would be thankful.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You asked two different questions. 

First to generate a VISIT number you could just use a simple data step with a BY statement.  Reset the counter to zero when you start a new patient and increment it when you see a new date to indiciate a new visit.  

data new ;
   set have ;
   by id date ;
   if first.id then visit=0;
   if first.date then visit+1;
run;

You can find hundreds of posts on transposing data.  One way is the double transpose.  First to convert your data in name/value pairs and then another to convert it back into columns.

 

proc tranpose data=new out=middle ;
   by id visit ;
   var date field1 field2 .... ;
run;

proc transpose data=middle out=want ;
   by id ;
   id _name_ visit ;
   var COL1 ;
run;

This might not work that well if you have mixed character and numeric variables.

View solution in original post

5 REPLIES 5
Astounding
PROC Star

Here is what I expect you will see.

 

You will get 10 responses, all urging you not to do this and explaining why it is a bad idea.

 

You will get 1 response containing a complex macro that might do the job, but which few people will understand.

 

I'm in the first group.  There are many programming techniques that will let you program with the data as is without transposing.  If you list a few programming objectives, I (and many others) would be happy to show the matching programming techniques.

 

Good luck.

epigrad123
Calcite | Level 5

Each encouter was a patient coming in for a test. I am looking at retesting and repositivity rates.

 

For restesting, I want to assess associations with risk of not coming back to get retested after  a positive test results (patients should come back within 3-6 months). I wanted to transpose the data because I planed on having these variables 1) Postive test date 2)Re-tested within a year? (yes or no) 3) if yes revisit date 4) if yes, time between positive test and revisit. In addition to these, I want the behavior and demographic questions from the first visit because I will be building a  Cox proportional hazard regression model with these variables.

 

Next I will be building a logisitc regression model focused on only those who came back for a visit within a year after their positive test result. I will be using demographic/behavior variables from both first postive visit and the revisit to investigate associations with repositivity (so whether or not when the person returns for their second test, if that test is also positive). This means I really just need retest results for this one, but only from those whose last test was a positive.

 

If there is a way to get to these analyses with long data, then suggestions on how to do so are very welcome!!

 

 

Reeza
Super User

You should post sample data. The way I'm reading it you do need to split your data into multiple data sets for easier analysis or a wide data set. 

 

 

Tom
Super User Tom
Super User

You asked two different questions. 

First to generate a VISIT number you could just use a simple data step with a BY statement.  Reset the counter to zero when you start a new patient and increment it when you see a new date to indiciate a new visit.  

data new ;
   set have ;
   by id date ;
   if first.id then visit=0;
   if first.date then visit+1;
run;

You can find hundreds of posts on transposing data.  One way is the double transpose.  First to convert your data in name/value pairs and then another to convert it back into columns.

 

proc tranpose data=new out=middle ;
   by id visit ;
   var date field1 field2 .... ;
run;

proc transpose data=middle out=want ;
   by id ;
   id _name_ visit ;
   var COL1 ;
run;

This might not work that well if you have mixed character and numeric variables.

epigrad123
Calcite | Level 5

Ahhhh thank you so much!!! This was exaclty what I needed and not complicated at all.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2272 views
  • 0 likes
  • 4 in conversation