DATA Step, Macro, Functions and more

Counting categorical variables using data command

New Contributor
Posts: 4
Hi everyone,

 

I was wondering if it is possible to use a DATA step instead of a PROC to count the number of populated categorical variables in each row, as shown in the 'count' example below. This would allow me to use the result further, e.g. COUNT=1 or COUNT>1, to check morbidity.

 

Also, would it be possible to then count how many patients have each diagnosis across the entire data set, counting duplicates within a patient only once? For example, there are 3 CBs and 2 AAs in this data set, but the count for CB should be 2 because Patient 2 had it recorded twice.

 

Thank you for your time and have a lovely new year.

 

[Attached screenshot: Untitledsas.png]


Accepted Solutions
Solution
2 weeks ago
PROC Star
Posts: 603

Re: Counting categorical variables using data command

[ Edited ]
Posted in reply to exbalterate

You could do something like:

 

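data want;
  set your_inputdataset;
  /* group the diagnosis columns so they can be handled together */
  array g(*) diag1-diag4;
  /* populated columns = number of columns minus number of missing values */
  count = dim(g) - cmiss(of g(*));
run;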

 



All Replies
New Contributor
Posts: 4

Re: Counting categorical variables using data command

Posted in reply to novinosrin

Thank you, that worked perfectly. Could you help with the other query? Grouping the same type of diagnosis?

 

 

 

Thanks

PROC Star
Posts: 603

Re: Counting categorical variables using data command

Posted in reply to exbalterate

@exbalterate Sure, the community is here to help you and SAS users across the world. If it is a new query, I would suggest opening a new thread so that you get a range of responses and can pick the best one. Also, if you are satisfied with the answer you got here and are not waiting for more responses, kindly mark the question as answered. Thank you.

Respected Advisor
Posts: 4,236

Re: Counting categorical variables using data command

[ Edited ]
Posted in reply to exbalterate

@exbalterate

"Also will it be possible to then count the number of each diagnosis in the entire data set per patient while accounting for duplicates"

 

Yes, that's certainly possible, but please provide sample data in the form of a SAS data step rather than a screenshot, along with the desired result. This allows us to post code that has actually been tested.

Please also make sure that your sample data is representative, i.e. if there can be multiple rows for a patient, include such a case in your sample data. If there can't be multiple rows, tell us.

 

It also always helps if you post some of your code whether already working or not. This makes it much easier for us to understand where you're coming from and also helps us to understand your level of SAS proficiency so that we can better tailor our answers to your current SAS expertise.  

Contributor
Posts: 27

Re: Counting categorical variables using data command

[ Edited ]
Posted in reply to exbalterate

You could use the NMISS function: 4 - nmiss(of diag1-diag4)?

PROC Star
Posts: 603

Re: Counting categorical variables using data command

Posted in reply to PaulBrownPhD

@PaulBrownPhD Sir, since the OP's example has character values for the variables diag1-diag4, I'm afraid the NMISS function will unnecessarily produce a note: Character values have been converted to numeric values at the places given by: (Line):(Column).
125:10 125:12

Cmiss handles both char and numeric values. Also using dim(array) makes it dynamic as opposed to using constant 4. Just my 2 cents
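
To illustrate the point about CMISS and dim(), here is a minimal sketch. The values are made up for illustration and are not the OP's data; the step only writes the result to the log.

```sas
data _null_;
  /* hypothetical character diagnosis columns, two of them left blank */
  length diag1-diag4 $3;
  diag1 = 'AA';
  diag2 = 'J9';
  diag3 = ' ';
  diag4 = ' ';
  array g(*) diag1-diag4;
  /* cmiss counts missing values for character (and numeric) arguments */
  count = dim(g) - cmiss(of g(*));
  put count=;   /* writes count=2 to the log */
run;
```

Because the array size comes from dim(g), adding a diag5 column only requires changing the array definition, not the arithmetic.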

Contributor
Posts: 21

Re: Counting categorical variables using data command

Posted in reply to exbalterate

@exbalterate you can also do it in the following way:

 

data test;
infile cards missover; /* values below are blank-separated; add dlm='09'x only if your data lines use tabs */
input patient diag1$ diag2$ diag3$ diag4$;
cards;
1 AA J9 HH6
2 CB CB
3 J10 AA CB J10
4 BB K90
5 J10
;
RUN;

data want;
set test ;
array a (*) diag1-diag4;
count = 0;
do i = 1 to dim(a);
if a(i) ne " " then count+1;
end;
drop i;
run;
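
For the second part of the original question (counting each diagnosis once per patient), one possible approach is sketched below. It assumes the `test` data set created above with one row per patient; the names `long`, `diag` and `diag_counts` are hypothetical and chosen only for this illustration. It has not been checked against the OP's real data.

```sas
/* reshape to one row per patient and recorded diagnosis */
data long;
  set test;
  array a(*) diag1-diag4;
  length diag $8;
  do i = 1 to dim(a);
    if not missing(a(i)) then do;
      diag = a(i);
      output;
    end;
  end;
  keep patient diag;
run;

/* keep each diagnosis only once per patient */
proc sort data=long nodupkey;
  by patient diag;
run;

/* count how many patients have each diagnosis */
proc freq data=long;
  tables diag / out=diag_counts(drop=percent);
run;
```

With the sample rows above, CB comes out as 2 (patients 2 and 3) rather than 3, because the duplicate recorded for patient 2 is dropped by NODUPKEY before counting.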

 
