Hi! I'm a novice SAS user and looking for help with a dataset. It is survey data where each row corresponds to a respondents' answer to one of 35 questions. There are close to 9,000 respondents and questions could be skipped so this is quite a mess. I'd like to transform the dataset into something useable for analysis such as one row for each respondent and one column for each question variable. I'm not sure how to do that in SAS Studio. I've read about merging, but that seems to apply more to merging two datasets, not one jumbled up mess of a dataset. Below are examples of the current dataset format and what I'm trying to transform it into. Any suggestions or guidance would be greatly appreciated. Thank you so much!
Jen
Example of current dataset:
Respondent ID | Question | Answer |
A | 1 | Yes |
B | 1 | No |
C | 1 | No |
E | 1 | Yes |
A | 2 | No |
B | 2 | Yes |
D | 2 | Yes |
B | 3 | Yes |
D | 3 | No |
E | 3 | No |
Example of what I'd like to create:
Respondent ID | Question 1 | Question 2 | Question 3 |
A | Yes | No | missing |
B | No | Yes | Yes |
C | No | missing | missing |
D | missing | Yes | No |
E | Yes | missing | No |
Hey @JosieB! PROC TRANSPOSE can take care of this for you to get it in the format you're looking for. First sort your data by Respondent_ID, then use PROC TRANSPOSE.
proc sort data=have;
by Respondent_ID;
run;
proc transpose data = have
out = want(drop=_NAME_)
prefix = Question_;
by Respondent_ID;
id Question;
var Answer;
run;
Here are the three key things that make this transpose work the:
1. By-group processing for each Respondent_ID
2. Identifying the question number with the Question variable
3. Giving each new variable a prefix of "Question_"
If we did not give it a prefix, each variable's name would simply be "1", "2", and "3". That's why this option is there - it's really handy!
Most analyses can be done without transforming this data set. What analysis/plots/tables are you going to create?
Hi Paige! I need to do one-way and two-way proc freq tables. Thanks!
PROC FREQ will do this without transposing the data.
proc freq data=have;
tables question*answer;
run;
Hey @JosieB! PROC TRANSPOSE can take care of this for you to get it in the format you're looking for. First sort your data by Respondent_ID, then use PROC TRANSPOSE.
proc sort data=have;
by Respondent_ID;
run;
proc transpose data = have
out = want(drop=_NAME_)
prefix = Question_;
by Respondent_ID;
id Question;
var Answer;
run;
Here are the three key things that make this transpose work the:
1. By-group processing for each Respondent_ID
2. Identifying the question number with the Question variable
3. Giving each new variable a prefix of "Question_"
If we did not give it a prefix, each variable's name would simply be "1", "2", and "3". That's why this option is there - it's really handy!
Thanks @Stu_SAS! That's super helpful! I'll check out PROC TRANSPOSE. There are so many things to learn with SAS. One baby step at a time! 😀
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.