Hi everyone.
I have this sample data.
proc = procedures (1 = did the procedure, 0 otherwise)
length = length of artery (if proc = 0 then length = blank)
data hosp;
input AdmissionDate date9. patientname $ 1-20 gender proc1 proc2 proc3 length1 length2 length3;
datalines;
03May2011 Maria F 1 1 0 8 2 .
10Jun2012 Maria F 0 1 1 . 5 3
31Dec2015 Jim M 0 0 1 . . 7
19Apr2002 Luna F 1 0 1 1 . 4
run;
i would like to have the following report.
Name of patient Gender No. of procedures Length<5 Length>=5
Maria F 4 2 2
Jim M 1 0 1
Luna F 2 2 0
Thanks heaps!
Yoyong
data hosp ;
input patientname :$20. gender :$1. proc1-proc3 length1-length3 ;
cards ;
Maria F 1 1 0 8 2 .
Maria F 0 1 1 . 5 3
Jim M 0 0 1 . . 7
Luna F 1 0 1 1 . 4
;
run ;
proc sql;
create table want as
select patientname,gender,
sum(proc1)+sum(proc2)+sum(proc3) as n_proc,
sum(.<length1<5)+sum(.<length2<5)+sum(.<length3<5) as len_lt_5,
sum(length1>=5)+sum(length2>=5)+sum(length3>=5) as len_ge_5
from hosp
group by patientname,gender;
quit;
First, your input file DATA step cannot compile since you've missed $ after gender. Second, the variable AdmissionDate is irrelevant in the context of your question, so there's no need to show it. That having been said, if your input file is grouped by (patientname, gender), as in your sample data, then it's a simple job for the DoW loop (note that below I've changed your input step accordingly):
data hosp ;
input patientname :$20. gender :$1. proc1-proc3 length1-length3 ;
cards ;
Maria F 1 1 0 8 2 .
Maria F 0 1 1 . 5 3
Jim M 0 0 1 . . 7
Luna F 1 0 1 1 . 4
run ;
data want (drop = proc: length:) ;
do until (last.gender) ;
set hosp ;
by patientname gender notsorted ;
n_procs = sum (n_procs, sum (of proc:)) ;
array ls length: ;
do over ls ;
if cmiss (ls) then continue ;
if ls < 5 then len_lt5 = sum (len_lt5, 1) ;
else len_ge5 = sum (len_ge5, 1) ;
end ;
len_lt5 = sum (len_lt5, 0) ;
len_ge5 = sum (len_ge5, 0) ;
end ;
label patientname = "Name of patient"
n_procs = "No. of procedures"
len_lt5 = "Length<5"
len_ge5 = "Length>=5"
;
run ;
If HOSP isn't really grouped by (patientname, gender), sort it by these keys beforehand.
If HOSP is neither grouped nor sorted and you don't want to sort it, the job still can be done by using the aggregation power of the hash object:
data _null_ ;
if _n_ = 1 then do ;
dcl hash h (ordered:"A") ;
h.definekey ("patientname", "gender") ;
h.definedata ("patientname", "gender", "n_procs", "len_lt5", "len_ge5") ;
h.definedone () ;
end ;
set hosp end = z ;
if h.find() ne 0 then do ;
n_procs = 0 ;
len_lt5 = 0 ;
len_ge5 = 0 ;
end ;
n_procs + sum (of proc:) ;
array ls length: ;
do over ls ;
if cmiss (ls) then continue ;
if ls < 5 then len_lt5 + 1 ;
else len_ge5 + 1 ;
end ;
h.replace() ;
if z then h.output (dataset: "want") ;
label patientname = "Name of patient"
n_procs = "No. of procedures"
len_lt5 = "Length<5"
len_ge5 = "Length>=5"
;
run ;
Kind regards
Paul D.
data hosp ;
input patientname :$20. gender :$1. proc1-proc3 length1-length3 ;
cards ;
Maria F 1 1 0 8 2 .
Maria F 0 1 1 . 5 3
Jim M 0 0 1 . . 7
Luna F 1 0 1 1 . 4
;
run ;
proc sql;
create table want as
select patientname,gender,
sum(proc1)+sum(proc2)+sum(proc3) as n_proc,
sum(.<length1<5)+sum(.<length2<5)+sum(.<length3<5) as len_lt_5,
sum(length1>=5)+sum(length2>=5)+sum(length3>=5) as len_ge_5
from hosp
group by patientname,gender;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.