Here is what my data looks like currently,
ID | timepoint | age | education | survey1 | survey2 | survey3 |
1 | 1 | 26 | 3 | 4 | 5 | |
1 | 2 | BS | 4 | 5 | 6 | |
1 | 2 | 5 | 6 | 7 | ||
2 | 1 | 25 | 3 | 4 | 5 | |
2 | 2 | BA | 4 | 5 | 6 | |
2 | 3 | 5 | 6 | 7 |
I want the data to by wide format by ID.
ID | age | education | survey1_1 | survey1_2 | survey1_3 | survey2_1 | survey2_2 | survey2_3 | survey3_1 | survey3_2 | survey3_3 |
1 | 26 | BS | 3 | 4 | 5 | 4 | 5 | 6 | 5 | 6 | 7 |
2 | 25 | BA | 3 | 4 | 5 | 4 | 5 | 6 | 5 | 6 | 7 |
Please post data step to display your data .
data have;
infile cards expandtabs truncover;
input ID timepoint age education $ survey1 survey2 survey3;
cards;
1 1 26 . 3 4 5
1 2 . BS 4 5 6
1 3 . . 5 6 7
2 1 25 . 3 4 5
2 2 . BA 4 5 6
2 3 . . 5 6 7
;
proc sql;
create table temp as
select id,timepoint,max(age) as age,max(education) as education,survey1,survey2,survey3
from have
group by id
order by 1,2,3,4;
select max(_n) into : n
from (select id,count(*) as _n from have group by id);
quit;
proc summary data=temp;
by id age education;
output out=want idgroup(out[&n] (survey1 survey2 survey3)=);
run;
Please post data step to display your data .
data have;
infile cards expandtabs truncover;
input ID timepoint age education $ survey1 survey2 survey3;
cards;
1 1 26 . 3 4 5
1 2 . BS 4 5 6
1 3 . . 5 6 7
2 1 25 . 3 4 5
2 2 . BA 4 5 6
2 3 . . 5 6 7
;
proc sql;
create table temp as
select id,timepoint,max(age) as age,max(education) as education,survey1,survey2,survey3
from have
group by id
order by 1,2,3,4;
select max(_n) into : n
from (select id,count(*) as _n from have group by id);
quit;
proc summary data=temp;
by id age education;
output out=want idgroup(out[&n] (survey1 survey2 survey3)=);
run;
@Ksharp thank you for your help. it worked.
Quick question: If the table had up to survey10, how would the code change?
What does the order by 1,2,3,4 statement does in this code?
Thank you so much!!
@Nrjn7 wrote:
Quick question: If the table had up to survey10, how would the code change?
What does the order by 1,2,3,4 statement does in this code?
A good reason not to convert a long data set to wide. SAS coding is much easier when you keep long data sets long instead of turning them into wide. Using a wide format data set, if your number of variables changes, then your code has to change. With a long data set, this is not true, SAS coding doesn't care if you have variables up to survey3 or survey10.
Just because you can convert it to long doesn't mean you should. See Maxim 19 "Long beats Wide".
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.