I have a dataset that includes multiple diagnosis codes in one ";" delimited variable sorted by patient and year.
For example.
Patient Year Codes
1 18 0123;4444;2346;0912
2. 17 1234;4444
3 17 0912
4 16 0123;2346;0912
5 15 4444
I would like to know the frequency of the individual codes grouped by year without having to manually enter all of the codes.
Example output
2015 2016 2017 2018 Total
0123 n n n n n
4444
2346
0912
I have tried with proc freq but it will not look within the strings. Proc tabulate will work but only if I manually enter each code.
Any help with a more efficient way of doing this?
Any help would be appreciated.
Simon
transpose using scan of codes one by one
proc summary/freq using yearcodes
then transpose again
data have;
input Patient Year Codes $40.;
cards4;
1 18 0123;4444;2346;0912
2 17 1234;4444
3 17 0912
4 16 0123;2346;0912
5 15 4444
;;;;
run;
data temp;
set have;
do i=1 to countw(Codes,';');
code=scan(Codes,i,';');output;
end;
run;
proc tabulate data=temp;
class year code;
table code=' ',year=' '*n all='Total';
keylabel n=' ';
run;
Also note, the form in which you have stored your data is relatively useless. Take a good look at the form that comes out of @Ksharp's DATA step. That would be much more flexible for both calculating and reporting.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.