## count number of observations with certain values

Solved
Super Contributor
Posts: 328

# count number of observations with certain values

proc sort data=mydata;
by age;
run;

data count;
set mydata;
by age;
zero=0;
hundred=0;
if age <0 then zero+1;
else if age >100 then hundred+1;
run;

with above code, I can get zero and hundred as two new variables added in the data set. I actually only want to have a simple number of how many cases with age <0 and how many with age>100. How to do that?

Thanks!

Accepted Solutions
Solution
‎01-08-2016 04:26 PM
Super Contributor
Posts: 328

## Re: count number of observations with certain values

proc sql;
select count(*) as N_obs
from mydata
where age >100;
quit;

this works

All Replies
Solution
‎01-08-2016 04:26 PM
Super Contributor
Posts: 328

## Re: count number of observations with certain values

proc sql;
select count(*) as N_obs
from mydata
where age >100;
quit;

this works
Posts: 1,252

## Re: count number of observations with certain values

Great that you found a good solution using PROC SQL.

• You don't actually perform BY group processing. Hence, the BY statement in the data step and, most importantly, the (possibly resource-intensive) PROC SORT step could be omitted.
• The initialization of ZERO and HUNDRED to 0 is redundant. It is implicitly done if you use the "variable+increment" syntax (sum statement).
• If you like your code very short, you could make use of the fact that logical expressions evaluate to 0 (if they are false) or 1 (if they are true).

With these suggestions your initial code could have looked like this:

data count;
set mydata;
zero+(age<0);
hundred+(age>100);
run;

You should be aware that not only negative values, but also missing values of AGE would satisfy the condition AGE<0. If you want to restrict your count to negative values, the third line should read zero+(.<age<0) or, to exclude special missing values as well, zero+(.z<age<0).

Of course, you don't need to create a new (possibly large) dataset COUNT to count those observations. You could avoid this with a data step as the following:

data _null_;
set mydata end=last;
zero+(age<0);
hundred+(age>100);
if last then put zero= hundred=;
run;

By default, the PUT statement would write the desired numbers to the log (not to the output window like PROC SQL).

🔒 This topic is solved and locked.

Discussion stats
• 2 replies
• 3125 views
• 1 like
• 2 in conversation