Hi all,
I am trying to use count in proc sql in order to calculate some averages, please see my code below.
The issue seems to be that proc sql returns numeric value, if the count is larger than zero and a missing value if the count is zero, so I tried to add some if statements to ensure that the averate value is zero, if the count values are missing.
Any ideas what is wrong with my approach. Code no longer throws errors due to missing values, but neither does it calculate any averages, so all my averages are currently showing as zeros.
Many thanks for your help in advance,
Olli
proc sql /* noprint */;
SELECT COUNT(complaint_id)
into :all_complaints
FROM pg.complaint
WHERE complaint_date >= "&r_start"d
AND complaint_date < "&r_end"d;
quit;
proc sql /* noprint */;
SELECT count(distinct(officer_id))
into :all_officers
FROM pg.complaint
WHERE complaint_date >= "&r_start"d
AND complaint_date < "&r_end"d;
quit;
%let complaint_avg=0;
data _NULL_;
if not missing(&all_officers) and not missing(&all_complaints) then call symput('complaint_avg',&all_complaints/&all_officers);
run;
This method of computing an average involves a lot of unnecessary work. And, there's absolutely no reason to store counts in macro variables so then you can create an average from those macro variable values.
UNTESTED CODE
proc sql;
create table want as count(complaint_id)/count(distinct officer_id) as avg
from pg.complaint
where ...
;
quit;
I am also concerned that your average is biased, since officers with zero complaints will not be included in the denominator.
This method of computing an average involves a lot of unnecessary work. And, there's absolutely no reason to store counts in macro variables so then you can create an average from those macro variable values.
UNTESTED CODE
proc sql;
create table want as count(complaint_id)/count(distinct officer_id) as avg
from pg.complaint
where ...
;
quit;
I am also concerned that your average is biased, since officers with zero complaints will not be included in the denominator.
You're welcome, @ojaro.
Please consider my comment earlier about how the average computed this way is biased because officers with zero complaints are not included in the denominator.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.