hello--
I have a wide dataset (100 columns) that I would like to transpose by patientID to long. I have 30 columns that all have the same prefix diag_1,diag_2, etc.
Is there a way to write macro within a proc sql loop that would select union through each of the diag_ variables?
tia!
What are the other 69 columns? Or do you not want them included in the long set?
Does PatientId repeat in the data, such as on different dates or different facilities? Is the date/facility/other information to be in the result?
Assuming that you have a date along with the patient Id this is how I would start this:
Proc sort data=have; by PatientId Visitdate; run; Proc transpose data=have out=long; by patientid visitdate; var diag_: ; run;
You want to include enough variables in the BY statement in both Proc Sort and transpose to uniquely identify the source record. The Diag_: is a list that tells SAS to use all the variables whose names start with Diag_ . If there are some that are different than your numbered variables that you would not want then use the list Diag_1 - Diag_30 (assumes they are sequentially numbered).
Basically there's a few ways to transpose data and SQL is the last option in the list of options by preference because you have to manually do everything. Using a data step and array, proc transpose are the best options.
Transposing data tutorials:
Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.