Hi everyone,
I was wondering if it is possible to use a DATA step instead of a PROC to count the number of non-missing categorical variables on each row, as shown in the 'count' example below. This would allow me to use the result further, e.g. COUNT=1 or COUNT > 1, to check morbidity.
Also, would it be possible to then count how often each diagnosis occurs in the entire data set, counting it once per patient so that duplicates are accounted for, if there are any? For example, there are 3 CB's and 2 AA's in this data set, but CB should be 2 because Patient 2 had it recorded twice.
Thank you for your time and have a lovely new year.
You could do something like:
data want;
    set your_inputdataset;
    array g(*) diag1-diag4;
    /* variables in the array minus the missing ones = diagnoses recorded */
    count = dim(g) - cmiss(of g(*));
run;
Thank you, that worked perfectly. Could you help with the other query? Grouping the same type of diagnosis?
Thanks
@exbalterate Sure, the community is here to help you and SAS users across the world. If it is a new query, I would suggest opening a new thread so that you get a variety of responses and can pick the best one. Also, if you are satisfied with the answer you got here and are not waiting for more responses, kindly mark the question as answered. Thank you.
"Also will it be possible to then count the number of each diagnosis in the entire data set per patient while accounting for duplicates"
Yes, that's certainly possible, but please provide sample data in the form of a SAS data step and not as a screenshot. Please also provide the desired result. This allows us to post code that has actually been tested.
Please also make sure that your sample data is representative, i.e. if there can be multiple rows for a patient, then include such a case in your sample data. If there can't be multiple rows, then tell us.
It also always helps if you post some of your code, whether it is already working or not. This makes it much easier for us to understand where you're coming from and to gauge your level of SAS proficiency, so that we can tailor our answers to it.
You could use the NMISS function: 4 - nmiss(of diag1-diag4)?
@pau13rown Sir, since the OP's example has character values in the variables Diag1-Diag4, I'm afraid the NMISS function will trigger an unwanted note in the log: "Character values have been converted to numeric values at the places given by: (Line):(Column). 125:10 125:12".
CMISS handles both character and numeric values. Also, using dim(array) makes the count dynamic, as opposed to hard-coding the constant 4. Just my 2 cents.
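To make the CMISS point concrete, here is a minimal sketch. The data set name demo and its single row of values are made up for illustration and are not the OP's data.

```sas
/* Hypothetical one-row example: two diagnoses recorded, two left blank */
data demo;
    length diag1-diag4 $3;
    diag1 = 'CB';
    diag2 = 'CB';
    diag3 = ' ';
    diag4 = ' ';
    array g(*) diag1-diag4;
    /* CMISS counts missing values among character or numeric arguments, */
    /* so no character-to-numeric conversion note is written to the log  */
    count = dim(g) - cmiss(of g(*));
    put count=;   /* writes count=2 to the log */
run;
```

Because dim(g) is taken from the array definition, the same statement keeps working if the array is later widened, for example to diag1-diag6.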
@exbalterate You can also do it in the following way:
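data test;
    /* values below are separated by spaces, so the default delimiter is used; */
    /* use dlm='09'x instead if your data lines are tab-delimited              */
    infile cards missover;
    input patient diag1 $ diag2 $ diag3 $ diag4 $;
    cards;
1 AA J9 HH6
2 CB CB
3 J10 AA CB J10
4 BB K90
5 J10
;
run;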
data want;
    set test;
    array a(*) diag1-diag4;
    count = 0;
    /* add 1 for every diagnosis variable that is not blank */
    do i = 1 to dim(a);
        if a(i) ne " " then count + 1;
    end;
    drop i;
run;
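For the second question (counting each diagnosis once per patient), one possible approach is sketched below. It uses the sample rows posted above and assumes one row per patient. The data set names long and long_unique and the variable diag are hypothetical, and this version relies on PROC SORT and PROC FREQ rather than a pure DATA step, so treat it as a starting point rather than a tested solution for the real data.

```sas
/* Sample rows copied from the post above, read as space-delimited values */
data test;
    infile cards missover;
    input patient diag1 $ diag2 $ diag3 $ diag4 $;
    cards;
1 AA J9 HH6
2 CB CB
3 J10 AA CB J10
4 BB K90
5 J10
;
run;

/* Reshape to one row per patient and recorded (non-blank) diagnosis */
data long;
    set test;
    array a(*) diag1-diag4;
    length diag $8;
    do i = 1 to dim(a);
        if not missing(a(i)) then do;
            diag = a(i);
            output;
        end;
    end;
    keep patient diag;
run;

/* Keep each diagnosis only once per patient (drops the repeated CB and J10) */
proc sort data=long out=long_unique nodupkey;
    by patient diag;
run;

/* Number of patients with each diagnosis */
proc freq data=long_unique;
    tables diag / nocum nopercent;
run;
```

With these five rows, CB is counted twice (patients 2 and 3) rather than three times, which matches the result the OP asked for.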