BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sean100
Calcite | Level 5

Hi all,

I have a dataset which has a variable that by default is null, but is changed to one of 6 or 7 category names should an incident occur.
I want to count the number of records in each category, including the default null category. Here's my code.

proc sql;

create table no_in_cats as

select

var1,

count(var1) as no_in_cat

from dataset

group by var1;

quit;

 

I get output like this:
var1    no_in_cat
           0
A        65
B        46596
C        3229
D        26
E        104
F        102317

I know that the number of nulls should be the largest number by far, so these results are concerning!
Any idea about where I'm going wrong?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its hard to say, can't see any data.  Your sql:

proc sql;
  create table no_in_cats as
  select var1,count(*) as no_in_cat
  from dataset
  group by var1;
quit;

Seems fine (though wouldn't call a dataset "dataset").  Perhaps post some test data in the form of a datastep so we have something to see.

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its hard to say, can't see any data.  Your sql:

proc sql;
  create table no_in_cats as
  select var1,count(*) as no_in_cat
  from dataset
  group by var1;
quit;

Seems fine (though wouldn't call a dataset "dataset").  Perhaps post some test data in the form of a datastep so we have something to see.

Sean100
Calcite | Level 5
Hi,
The * you have in your count function is not in my code, I changed that and as a result my code works! Why is this?
Thanks.

P.S. I agree with your point about datasets, I changed the names in my code for the post 🙂
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, I believe it is because nulls are not counted:

https://www.techonthenet.com/sql/count.php

 

I always tend to use * - or check all variables - to do a count of rows in case blanks are present.

Tom
Super User Tom
Super User

COUNT(varname) will count the number of non-missing values for a variable. To just count observations use COUNT(*).  I have seen some people put a constant there instead of the *.  So COUNT(1).  Although COUNT(5) would work just the same as COUNT(1).

RW9
Diamond | Level 26 RW9
Diamond | Level 26

I believe that when you put a number in like that it noramlly reflects the logical position of the variable in the dataset, so:

group by 1,3,4;

Groups by variable which is found in position 1 of the data, and then by position 3 variable, then by position 4 variable.  I would avoid such notation as a change to the data can drastically alter code output.

Tom
Super User Tom
Super User

You can use position numbers in GROUP BY or ORDER BY clauses. Not in aggregate function calls.

 

Test code

data test;
 set sashelp.class ;
 if _n_ in (1,3) then call missing(age);
 if _n_ in (2,3,4) then call missing(height);
run;

proc sql;
 select
  count(*) as count_star
, count(1) as count_1
, count(1000) as count_1000
, count(.) as count_missing
, count(' ') as count_blank
, count(name) as count_name
, count(age) as count_age
, count(height) as count_height
from test
;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 5290 views
  • 1 like
  • 3 in conversation