I have multiple observations for people and want to end up with one observation per person. I also want to retain any YES responses and drop any NO responses from any of the observations for about 20 variables with very different names.
The sample data is not in the form of a working data step, so here is untested code, but at least I know the general layout now:
data want;
set have;
by name notsorted;
array vars took_medication -- low_bp;
array tmp {20} _temporary_;
if first.name then call missing(of tmp{*});
do over vars;
if vars='YES' then tmp{_i_}=1;
end;
if last.name;
do over vars;
if tmp{_i_}=1 then vars='YES';
else vars=' ';
end;
run;
Make sure the size of the TMP array is at least as many as there are variables to be examined. TMP can exceed that number with no ill effects.
I don't follow what your input, or desired output looks like. Could you please provide the input data in the form of a working data step? That would facilitate generating test code. And then please show what the desired output should look like.
The dataset I'm starting with is like this:
OBS | NAME | DOB | TOOK_MEDICATION | HIGH_BP | LOW_BP |
1 | DOE, JOHN | 01/01/1950 | NO | NO | NO |
2 | DOE, JOHN | 01/01/1950 | NO | NO | YES |
3 | DOE, JOHN | 01/01/1950 | NO | YES | NO |
4 | DOE, JOHN | 01/01/1950 | YES | YES | NO |
5 | DOE, JANE | 12/31/1955 | NO | NO | NO |
6 | DOE, JANE | 12/31/1955 |
| YES | NO |
I want to end up with this:
OBS | NAME | DOB | TOOK_MEDICATION | HIGH_BP | LOW_BP |
1 | DOE, JOHN | 01/01/1950 | YES | YES | YES |
2 | DOE, JANE | 12/31/1955 |
| YES |
|
The sample data is not in the form of a working data step, so here is untested code, but at least I know the general layout now:
data want;
set have;
by name notsorted;
array vars took_medication -- low_bp;
array tmp {20} _temporary_;
if first.name then call missing(of tmp{*});
do over vars;
if vars='YES' then tmp{_i_}=1;
end;
if last.name;
do over vars;
if tmp{_i_}=1 then vars='YES';
else vars=' ';
end;
run;
Make sure the size of the TMP array is at least as many as there are variables to be examined. TMP can exceed that number with no ill effects.
That problem would be really easy to solve if the ccc-vars would have 1 and 0 instead of yes/no as values.
proc format;
value YesNo
1 = 'Yes'
0 = 'No'
;
run;
proc summary data=work.have nway;
class Name;
var took_medication -- low_bp;
format took_medication -- low_bp YesNo.;
output out= work.want max=;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.