I have a dataset that includes multiple diagnosis codes in one ";" delimited variable sorted by patient and year.
For example.
Patient Year Codes
1 18 0123;4444;2346;0912
2. 17 1234;4444
3 17 0912
4 16 0123;2346;0912
5 15 4444
I would like to know the frequency of the individual codes grouped by year without having to manually enter all of the codes.
Example output
2015 2016 2017 2018 Total
0123 n n n n n
4444
2346
0912
I have tried with proc freq but it will not look within the strings. Proc tabulate will work but only if I manually enter each code.
Any help with a more efficient way of doing this?
Any help would be appreciated.
Simon
transpose using scan of codes one by one
proc summary/freq using yearcodes
then transpose again
data have;
input Patient Year Codes $40.;
cards4;
1 18 0123;4444;2346;0912
2 17 1234;4444
3 17 0912
4 16 0123;2346;0912
5 15 4444
;;;;
run;
data temp;
set have;
do i=1 to countw(Codes,';');
code=scan(Codes,i,';');output;
end;
run;
proc tabulate data=temp;
class year code;
table code=' ',year=' '*n all='Total';
keylabel n=' ';
run;
Also note, the form in which you have stored your data is relatively useless. Take a good look at the form that comes out of @Ksharp's DATA step. That would be much more flexible for both calculating and reporting.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.