SAS Programming

Sean100 · Posted 05-03-2018 07:35 AM

Hi all,

I have a dataset which has a variable that by default is null, but is changed to one of 6 or 7 category names should an incident occur.
I want to count the number of records in each category, including the default null category. Here's my code.

proc sql;

create table no_in_cats as

select

var1,

count(var1) as no_in_cat

from dataset

group by var1;

quit;

I get output like this:
var1   no_in_cat
           0
A       65
B        46596
C        3229
D        26
E        104
F        102317

I know that the number of nulls should be the largest number by far, so these results are concerning!
Any idea about where I'm going wrong?

Thanks

RW9 · Posted 05-03-2018 08:29 AM

Its hard to say, can't see any data. Your sql:

proc sql;
  create table no_in_cats as
  select var1,count(*) as no_in_cat
  from dataset
  group by var1;
quit;

Seems fine (though wouldn't call a dataset "dataset"). Perhaps post some test data in the form of a datastep so we have something to see.

View solution in original post

RW9 · Posted 05-03-2018 08:29 AM

Its hard to say, can't see any data. Your sql:

proc sql;
  create table no_in_cats as
  select var1,count(*) as no_in_cat
  from dataset
  group by var1;
quit;

Seems fine (though wouldn't call a dataset "dataset"). Perhaps post some test data in the form of a datastep so we have something to see.

Sean100 · Posted 05-03-2018 09:27 AM

Hi,
The * you have in your count function is not in my code, I changed that and as a result my code works! Why is this?
Thanks.

P.S. I agree with your point about datasets, I changed the names in my code for the post 🙂

RW9 · Posted 05-03-2018 09:38 AM

Well, I believe it is because nulls are not counted:

https://www.techonthenet.com/sql/count.php

I always tend to use * - or check all variables - to do a count of rows in case blanks are present.

Tom · Posted 05-03-2018 10:05 AM

COUNT(varname) will count the number of non-missing values for a variable. To just count observations use COUNT(*). I have seen some people put a constant there instead of the *. So COUNT(1). Although COUNT(5) would work just the same as COUNT(1).

RW9 · Posted 05-03-2018 10:13 AM

I believe that when you put a number in like that it noramlly reflects the logical position of the variable in the dataset, so:

group by 1,3,4;

Groups by variable which is found in position 1 of the data, and then by position 3 variable, then by position 4 variable. I would avoid such notation as a change to the data can drastically alter code output.

Tom · Posted 05-03-2018 10:22 AM

You can use position numbers in GROUP BY or ORDER BY clauses. Not in aggregate function calls.

Test code

data test;
 set sashelp.class ;
 if _n_ in (1,3) then call missing(age);
 if _n_ in (2,3,4) then call missing(height);
run;

proc sql;
 select
  count(*) as count_star
, count(1) as count_1
, count(1000) as count_1000
, count(.) as count_missing
, count(' ') as count_blank
, count(name) as count_name
, count(age) as count_age
, count(height) as count_height
from test
;
quit;

SAS Programming

SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Re: SAS sql not counting null values as expected.

Follow Us

What is...

SAS Programming

Special offer for SAS Communities members

SAS Training: Just a Click Away

Follow Us

What is...