Hi all,

I am trying to do a couple of things with the sample below. I have searched online a lot, but nothing I tried works. Is it possible to do this using a data step?

1. I want to find the top two diagnoses (diag), with a table showing the count across the entire data set (all of DIAG1 to DIAG4 for every patient), or simply from DIAG1 if that is not possible. Something like AA|3, BB|4, instead of a full list of all the diagnoses, which would run to thousands in a large data set. I tried various freq and count commands, but they didn't work.

2. I couldn't figure this one out: is it possible to find associated diagnoses per patient? For example, patients diagnosed with 'AA' also have 'JI', and patients with 'BB' also have 'GE'.

3. In a large data set, is it possible to link agegroup to DIAG1, to see which diagnosis is most common in each age group?

Thank you

data sample;
  input MYDOB ADMIN$ SEX$ ID DIAG1$ DIAG2$ DIAG3$ DIAG4$;
  datalines;
111924 KL M 1 AA JI GE .
121926 OH M 2 BB AA . GE
121982 KL F 3 BB . . AA
101980 KL M 4 AA . . .
111979 OH F 5 JI . AA .
101959 KL M 6 GE . . .
121999 OH F 7 BB . . .
102008 KL M 8 AA JI . .
112001 KL F 9 JI . . AA
102016 OH M 10 BB . GE .
;
run;

/* Add the reference date used for the age calculation */
data age2;
  set SAMPLE;
  newdate1 = mdy(12,31,2016);
  format newdate1 mmyyn.;
run;

/* Two-digit years are read into the window 1917-2016 */
options yearcutoff=1917;

/* Calculate age */
data cleaned (keep=ADMIN SEX ID DIAG1 DIAG2 DIAG3 DIAG4 AGE agegroup);
  retain ID AGE ADMIN SEX DIAG1 DIAG2 DIAG3 DIAG4 agegroup; /* rearrange */
  set age2;
  length agegroup $12; /* otherwise 'Young' fixes the length at 5 and truncates the rest */
  a = put(MYDOB, 6.);
  NEWDOB = input(a, mmddyy6.);
  format NEWDOB mmyyn.;
  age = int(yrdif(NEWDOB, newdate1, 'ACTUAL'));
  /* create age group; >=64 so that age 64 is not left unassigned */
  if age <= 18 then agegroup = 'Young';
  else if 19 <= age < 64 then agegroup = 'Intermediate';
  else if age >= 64 then agegroup = 'Retired';
run;
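For items 1 and 3, here is a rough sketch of one possible approach, not a definitive solution. It assumes the SAMPLE and CLEANED data sets created above. The data set names LONG, DIAG_COUNTS and TOP2 and the variable DIAG are made up for illustration. The reshaping is done in a data step, but the counting itself is left to PROC FREQ, which is an assumption about what is acceptable here.

```sas
/* Item 1: reshape to one row per patient/diagnosis pair */
data long (keep=ID diag);
  set sample;
  array d{4} DIAG1-DIAG4;      /* the four existing character columns */
  length diag $8;
  do i = 1 to 4;
    if not missing(d{i}) then do;
      diag = d{i};
      output;                  /* one output row per non-missing diagnosis */
    end;
  end;
run;

/* Count how often each diagnosis occurs across DIAG1-DIAG4 */
proc freq data=long noprint;
  tables diag / out=diag_counts (drop=percent);
run;

/* Most frequent first */
proc sort data=diag_counts;
  by descending count;
run;

/* Keep only the first two rows; ties at the cut-off are not handled */
data top2;
  set diag_counts (obs=2);
run;

proc print data=top2 noobs;
run;

/* Item 3: cross-tabulate age group against the primary diagnosis */
proc freq data=cleaned;
  tables agegroup*diag1 / norow nocol nopercent;
run;
```

If only DIAG1 matters for item 1, the first data step can be skipped and PROC FREQ run on DIAG1 directly. Item 2 (associated diagnoses) would probably need the long table joined to itself on ID, which is not shown here.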