BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
EMc
Calcite | Level 5 EMc
Calcite | Level 5

I have 180 columns containing the same character data reported each year over many years (from a longitudinal data set).   I am looking to combine these columns into one singular column so that I can compute two way frequencies etc.   I have looked into "append" however I understand this is for datasets not variables.  I have also tried a transpose, however I end up with 9000 columns (number of patients).  Other similar posts suggest an array for this type of data, however they all require you to enter datelines.  My dataset is much too large to record  each individual dateline!   Hope someone has insight as to how to do this, or possibly a better method! 

 

Current Data: 

Patient ID      Year 1.    Year 2.    Year 3.    Year 4.   Year 5

1.                    ABC         ABB        ABA        ABC      BAC  

2                     ACA         ABC        ABB        ABA      AAC  

3                     ABB         ABA        ACA        ABC      AAC  

 

What I want: 

Patient ID   Health Condition

1.                    ABC        

1.                    ABB      

1.                    ABA        

1.                    ABC      

1.                    BAC  

2.                    ACA        

2.                    ABC        

2.                    ABB        

2.                    ABA      

2.                    AAC  

3.                    ABB

3.                    ABA        

3.                    ACA        

3.                    ABC      

3.                    AAC  

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

It works for me

data have;
  infile datalines truncover dsd dlm=' ';
  input (Patient_ID Year_1 Year_2 Year_3 Year_4 Year_5) ($);
  datalines;
1 ABC ABB ABA ABC BAC
2 ACA ABC ABB ABA AAC
3 ABB ABA ACA ABC AAC
;

PROC TRANSPOSE data=have out=want name=Year prefix=Health_Condition;
  var Year_1 Year_2 Year_3 Year_4 Year_5;
  by Patient_ID;
run;

Patrick_0-1618711331326.png

 

Or here a data step approach which I would expect to perform better.

data want2;
  set have;
  array _y {*} year_:;
  do _i=1 to dim(_y);
    Year=input(vname(_y[_i]),$6.);
    Health_Condition=_y[_i];
    output;
  end;
  keep Patient_ID Year Health_Condition;
run;

 

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26
Use PROC TRANSPOSE with a BY PATIENTID; statement.
--
Paige Miller
EMc
Calcite | Level 5 EMc
Calcite | Level 5

Hi @PaigeMiller,
Thanks for your advice! This is the right idea, however I am looking for it to be the actual data (health condition e.g., ABC) in each cell that is listed rather than the column name. I am sure there is a way to do this, but can't seem to figure it out. My code and output using PROC TRANSPOSE is shown below, please refer to my original post for what I would like my data to look like!

PROC TRANSPOSE data=file out=file_transposed;
         var Year_1 Year_2 Year_3 Year_4 Year_5; 
         by PatientID;
run;


PatientID. NAME
1. Year 1
1. Year 2
1. Year 3
1. Year 4
1. Year 5
2. Year 1
2. Year 2
2. Year 3
2. Year 4
2. Year 5
3. Year 1
3. Year 2
3. Year 3
3. Year 4
3. Year 5

Patrick
Opal | Level 21

It works for me

data have;
  infile datalines truncover dsd dlm=' ';
  input (Patient_ID Year_1 Year_2 Year_3 Year_4 Year_5) ($);
  datalines;
1 ABC ABB ABA ABC BAC
2 ACA ABC ABB ABA AAC
3 ABB ABA ACA ABC AAC
;

PROC TRANSPOSE data=have out=want name=Year prefix=Health_Condition;
  var Year_1 Year_2 Year_3 Year_4 Year_5;
  by Patient_ID;
run;

Patrick_0-1618711331326.png

 

Or here a data step approach which I would expect to perform better.

data want2;
  set have;
  array _y {*} year_:;
  do _i=1 to dim(_y);
    Year=input(vname(_y[_i]),$6.);
    Health_Condition=_y[_i];
    output;
  end;
  keep Patient_ID Year Health_Condition;
run;

 

EMc
Calcite | Level 5 EMc
Calcite | Level 5
Thanks a lot! Data step was exactly what I needed.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 780 views
  • 1 like
  • 3 in conversation