Howdy,
This is probably common when coding but I find myself switching between using a sql or a sas/data step depending on what I am most comfortable with. However, one of my pet-peeves is wanting to understand how to accomplish the same code output via both avenues. An example would be aggregation of a metric based on criteria via "group by" with SQL and "retain" with SAS.
Currently I'm using sql trying to select multiple variables, sum a couple of them on the fly, and deal with nulls, within my select statement. Using the example below, "var_d" contains both numeric values and null (.) values. Summing var_d without addressing the nulls produces nulls, numeric + null = null.
proc sql;
create table demo_table as
select t1.var_a, t1.var_b, t1.var_c, sum(t1.var_d) as var_sumd
from work.another_table t1
group by t1.var_a, t1.var_b;
quit;
Wondering what best pratice would be. I'll re-iterate, desired result. I'd like to be able to take variable "var_d", replace nulls with 0's, then sum "var_d" based at the desired level.
within a data step I could simply redefine a variable, and then proceed with summation
if var_d = . then var_d = 0;
within proc sql I become a little confused, as it seems like I'd have to accomplish everything within my select statement. I was thinking Icould redefine via case/when or ifnull or maybe coalesce...(I think)
sum(ifnull(var_d),0)) as var_sum
sum(coalesce(var_d,0)) as var_sum
sum(case when ??
Thoughts? If I've been unclear I can rectify
TS
Are you working on SAS data sets or another database?
The only way that SAS generates a missing value using a native SAS dataset for sum in code like your example is for all values of the summed variable to be missing for the group.
data junk;
input group $ x;
datalines;
a .
a 1
a 1
b 2
b 2
b 2
;
run;
proc sql;
create table summary as
select group, sum(x) as total
from junk
group by group;
quit;
The sum for group A is 2 not missing.
The SUM for Proc Report, Tabulate, Means and Summary will likewise only give a missing for Sum statistice if all values are missing for the group when grouping occurs in the procedure.
per the previous question, I am working within SAS, on a SAS dataset. I'm wanting to execute proc sql against it.
I'm having issues with the ifnull(t1.disc_dollars, 0) as disc_sales... after reviewing the syntax in the SAS documentation this should work....
proc sql;
create table sql_construct_b as
select t1.user_id, t1.store_number, t1.transaction_date, t1.gross_sales, ifnull(t1.disc_dollars,0) as disc_sales
from sql_construct t1;
quit;
It's not recognizing the function ifnull.... which seems odd to me.
TS
SAS Proc SQL is not exactly anyone else's SQL. IFNULL doesn't exist in SAS Proc SQL. You might try the COALESCE function.
BUT you have changed topic from your initial question which involved SUM.
I would use the SQL aggregation function SUM() first and then deal with the missing values.
coalesce(sum(var_d),0)
Note that IFNULL() is not a valid SAS function.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.