I have patients with multiple rows and variables diabetes (1=yes and 2=no) and hypertension (1=yes and 2=no). For each patient, if it is 1 in any row, it should be considered as 1. Please see the initial data set:
data have;
input pt_id diabetes hypertension;
datalines;
1 0 0
1 1 0
1 1 0
1 0 0
2 0 0
2 0 0
2 0 0
2 0 0
3 1 1
3 0 1
3 0 1
run;
Final data set should look like this:
data have;
input pt_id diabetes hypertension;
datalines;
1 1 0
2 0 0
3 1 1
run;
Thank you for your help!
The question subject is misstated as "Long to wide". The question is really one of "Max over Group", or more generically "Result over Group"
The result over group for this question "presence within group".
A "DOW" loop is very effective for processing data sorted into BY groups, and outputting one row per group. (The technique can also be used to apply the computed result to each row in the group).
Example:
data want(keep=pt_id diabetes hypertension); do until (last.pt_id); SET have (rename=(diabetes=_dia hypertension=_htn)); BY pt_id; diabetes = _dia OR diabetes; hypertension = _htn OR hypertension; end; run;
The secret to the DOW loop is placing the SET and BY statements inside a DO LOOP. An implicit OUTPUT occurs at the end of the step with the desired computed results. The rename= is necessary because the programmer wants the aggregate result variables (diabetes/hypertension) to be the same name as the individual flag variables.
Try proc sql -- you desire the max of each flag for each pt_id:
data have;
input pt_id diabetes hypertension;
datalines;
1 0 0
1 1 0
1 1 0
1 0 0
2 0 0
2 0 0
2 0 0
2 0 0
3 1 1
3 0 1
3 0 1
;
run;
data desired;
input pt_id diabetes hypertension;
datalines;
1 1 0
2 0 0
3 1 1
;
run;
*Solution;
proc sql;
create table want as
select
pt_id
, max(diabetes) as diabetes
, max(hypertension) as hypertension
from have
group by pt_id
order by pt_id
;
quit;
*Compare want to desired;
proc compare data=desired comp=want;
run;
Data and description don't match, please clarify: do you have 1=yes and 0=no or 1=yes and 2=no?
The follow step works if 0=no:
data want;
set have;
by pt_id;
retain _diabetes _hypertension 0;
if first.pt_id then do;
_diabetes = diabetes;
_hypertension = hypertension;
end;
else do;
_diabetes = diabetes or _diabetes;
_hypertension = hypertension or _hypertension;
end;
if last.pt_id then do;
output;
end;
drop diabetes hypertension;
rename
_diabetes = diabetes
_hypertension = hypertension
;
run;
The simple code below will work
proc sort data=have ;
by pt_id descending diabetes descending hypertension;
run;
data want;
set have;
by pt_id descending diabetes descending hypertension;
if first.pt_id;
run;
The question subject is misstated as "Long to wide". The question is really one of "Max over Group", or more generically "Result over Group"
The result over group for this question "presence within group".
A "DOW" loop is very effective for processing data sorted into BY groups, and outputting one row per group. (The technique can also be used to apply the computed result to each row in the group).
Example:
data want(keep=pt_id diabetes hypertension); do until (last.pt_id); SET have (rename=(diabetes=_dia hypertension=_htn)); BY pt_id; diabetes = _dia OR diabetes; hypertension = _htn OR hypertension; end; run;
The secret to the DOW loop is placing the SET and BY statements inside a DO LOOP. An implicit OUTPUT occurs at the end of the step with the desired computed results. The rename= is necessary because the programmer wants the aggregate result variables (diabetes/hypertension) to be the same name as the individual flag variables.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.