BookmarkSubscribeRSS Feed
Xyp
Calcite | Level 5 Xyp
Calcite | Level 5
How do I count all the non missing observations of a bunch of categorical variables then present these counts on a bar chart?

I have tried proc freq, output to a data set then use proc means sum option.

This is very time consuming because I need to do that to like 10 variables and I am sure there’s a smarter way to do this

Also need to compare the counts

Thank you!!
4 REPLIES 4
PaigeMiller
Diamond | Level 26

See Re: proc freq one way table for multiple vars- export to data set - SAS Support Communities

 

The data set named WANT seems to be what you are looking for. From this, plotting bar charts should be relatively simple, you would do the bar charts in PROC SGPLOT with a BY VARIABLE; statement.

 

In that example, the problem shows how to do what you want on numeric variables, but it works just as well on character variables. Instead of tables _numeric_; you could use tables _character_; or just list the character variables you want to work on.

 

Also need to compare the counts

This is too vague for me to understand what you want. Please be specific.

--
Paige Miller
Stu_SAS
SAS Employee

Hey @Xyp! I think this will get you what you need if I understand your question correctly. You can use PROC FREQ to do this by using the underlying table that is created for each character column. Let's say we have data like this:

data have;
    array category[10] $;

    do i = 1 to 10;
        do j = 1 to dim(category);
            select(mod(i, 3));
                when(1)   category[j] = 'a';
                when(2)   category[j] = 'b';
                otherwise category[j] = 'c';
            end;

            if(i < j) then category[j] = '';
        end;
        output;
    end;

    drop i j;
run;
category1 category2 category3 ... category10
a                             
b         b                   
c         c         c        
...       ...       ...       ...
a         a         a             a

We're expecting  a bar chart with 10 non-missings for values for category1, 9 non-missings for category2, etc.

 

If we run PROC FREQ, we'll see that it knows how many values there are for each level of each category, as well as the total missings:

proc freq data=have;
   tables _CHARACTER_;
run;

Stu_SAS_0-1746456755267.png

 

We want the underlying table that this output is based off of, so we're going to snag it with ODS OUTPUT. Note that if you wanted to figure out the name of this table, you can turn ods trace on to get it.

ods select none;

proc freq data=have;
   tables _CHARACTER_;
   ods output OneWayFreqs=OneWayFreqs(keep=table frequency);
run;

ods select all;

 

Now we have a table that looks like this:

Table	            Frequency
Table category1	    4
Table category1	    3
Table category1	    3
Table category2	    3
Table category2	    3
Table category2	    3
Table category3	    3
Table category3	    2
Table category3	    3

 

These are our non-missing counts for each category. Let's clean up the name and plot it. sgplot will do all of the summing for us, so we don't need to do any additional post-processing.

data want;
    set OneWayFreqs;
    table = scan(table, -1);
    rename table = category;
run;

proc sgplot data=want;
    vbar category / freq=frequency categoryorder=respdesc;
run;

Stu_SAS_1-1746456907171.png

 

Hope this helps!

Kathryn_SAS
SAS Employee

If you want the total non-missing count for each variable, the sample below may also provide an alternative:

https://support.sas.com/kb/44/124.html

 

ballardw
Super User

@Xyp wrote:
How do I count all the non missing observations of a bunch of categorical variables then present these counts on a bar chart?



The above statement can be interpreted at least two ways: counts of each level of the categorical variable or total count of non-missing observations.

It is always a good idea to provide example data, or use one of the training data set supplied by SAS such as SASHELP.CLASS, and then show the expected results. Since SASHELP.CLASS only has 19 observations it is usually pretty easy to do the calculations by hand.

 

Second you talk about using Proc Freq then Proc Means but do not provide any details of exactly how you are using them.

Then the actual bar chart code you use. Some of this may very well be possible with just the graph procedure (and possibly a custom format to assign all non-missing values to a single value of "not missing" or similar).

Are your categorical variables numeric, character or a mix?

 

Here's an example with a custom format and summary using Proc Freq using ODS OUTPUT and plotting one of the variables. Without an example of what your plots are supposed to look like can't go much further.

 

data work.class;
   set sashelp.class;
   if name in ('Alfred' 'Henry') then call missing(sex);
   if name in ('Barbara' 'Jane' 'John') then call missing(age);
run;

proc format;
value num_notmiss 
.='Missing'
other= 'Not missing'
;
value $char_notmiss
' '='Missing'
other= 'Not missing'
;
run;

proc freq data=work.class;
   ods output onewayfreqs= work.classcount;
   table sex age;
   format sex $char_notmiss. age num_notmiss.;
run;

proc sgplot data=work.classcount;
  vbar f_age / freq=frequency datalabel;
run;

Note the way the output data set from Proc Freq looks and the use of the F_ named variable to have the custom formatted text as the only category reported.

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 830 views
  • 0 likes
  • 5 in conversation