I have a dataset where each participant received three tests, and the results of each test are in individual rows such that there are three rows for each participant. I want to collapse the rows so that each participant is represented by only one row. The data looks like this:
ID | Test1 | Test2 | Test3 |
V04001 | | | 0 |
V04001 | | 1 | |
V04001 | 0 | | |
V04002 | | | 0 |
V04002 | | 0 | |
V04002 | 0 | | |
V04003 | | | 0 |
V04003 | | 1 | |
V04003 | 1 | | |
And this is what I want:
ID | Test1 | Test2 | Test3 |
V04001 | 0 | 1 | 0 |
V04002 | 0 | 0 | 0 |
V04003 | 1 | 1 | 0 |
Most of the searching I've done for a solution ends up with PROC TRANSPOSE, but I can't seem to figure out how to make that work with what I need. Any help would be much appreciated!
data have;
input ID $ Test1 Test2 Test3;
cards;
V04001 . . 0
V04001 . 1 .
V04001 0 . .
V04002 . . 0
V04002 . 0 .
V04002 0 . .
V04003 . . 0
V04003 . 1 .
V04003 1 . .
;
data want;
update have(obs=0) have;
by id;
run;
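For readers wondering why this works, here is a commented sketch of the same step. It assumes the default UPDATEMODE=MISSINGCHECK behavior and adds a PROC SORT, since UPDATE with a BY statement expects the data in BY order; the data set name have_sorted is only an illustrative name, not something from the original post.

```sas
/* Sort first: UPDATE with BY expects both data sets ordered by the BY variable. */
proc sort data=have out=have_sorted;  /* have_sorted is a hypothetical name */
by id;
run;

data want;
/* Master: have_sorted(obs=0) supplies the variable layout but no rows.      */
/* Transaction: every row of have_sorted is applied to its own ID group.     */
/* Under the default UPDATEMODE=MISSINGCHECK, a missing transaction value    */
/* does not overwrite a value already collected for that ID.                 */
update have_sorted(obs=0) have_sorted;
by id;
/* One observation is written at the end of each BY group. */
run;
```

With the sample data above, this should leave one row per ID holding the single non-missing value of each test.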
Hi @lh50
Please try this
proc sql;
create table want as
select id, sum(test1) as test1,
sum(test2) as test2,
sum(test3) as test3
from have
group by id;
quit;
If your TEST variables are all numeric, then an approach with PROC SUMMARY (or PROC MEANS) and the MAX statistic may be appropriate. Because each ID has only one non-missing value per variable, MAX simply returns that value:
proc summary data=have nway;
class id;
var test1 test2 test3;
output out=want (drop=_type_ _freq_) max=;
run;
If the values are not numeric, this will not work, because VAR variables in PROC SUMMARY must be numeric.
Other considerations arise if your data contains other variables as well. The step above drops them, so you would need to specify which of their values should be kept when collapsing before a different solution can be suggested.
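As a rough sketch of the "other variables" case, assume a hypothetical character variable Site that is constant within each ID. Site, have2 and want2 are made up for illustration and do not appear in the original post. Under that assumption, the ID statement of PROC SUMMARY can carry the extra variable into the output:

```sas
/* Hypothetical data: Site is an invented extra variable, constant per ID. */
data have2;
input ID $ Site $ Test1 Test2 Test3;
cards;
V04001 A . . 0
V04001 A . 1 .
V04001 A 0 . .
V04002 B . . 0
V04002 B . 0 .
V04002 B 0 . .
;

proc summary data=have2 nway;
class id;
id site;            /* carry Site through without analyzing it */
var test1-test3;
output out=want2 (drop=_type_ _freq_) max=;
run;
```

By default the ID statement keeps the maximum value of the ID variable within each group, so this only makes sense when the extra variable does not vary within an ID, or when the maximum is the value you want.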
That is very easy, using the UPDATE statement:
data have;
input ID $ test1-test3;
cards;
V04001 . . 0
V04001 . 1 .
V04001 0 . .
V04002 . . 0
V04002 . 0 .
V04002 0 . .
V04003 . . 0
V04003 . 1 .
V04003 1 . .
;run;
data want;
update have(obs=0) have;
by id;
run;