Write and run SAS programs in your web browser

PROC SQL vs. DATA step aggregating w/ null values

Reply
Contributor
Posts: 47

PROC SQL vs. DATA step aggregating w/ null values

Howdy,

 

This is probably common when coding but I find myself switching between using a sql or a sas/data step depending on what I am most comfortable with. However, one of my pet-peeves is wanting to understand how to accomplish the same code output via both avenues. An example would be aggregation of a metric based on criteria via "group by" with SQL and "retain" with SAS.

 

Currently I'm using sql trying to select multiple variables, sum a couple of them on the fly, and deal with nulls, within my select statement. Using the example below, "var_d" contains both numeric values and null (.) values. Summing var_d without addressing the nulls produces nulls, numeric + null = null.

 

proc sql;

create table demo_table as

select t1.var_a, t1.var_b, t1.var_c, sum(t1.var_d) as var_sumd

from work.another_table t1

group by t1.var_a, t1.var_b;

quit;

 

Wondering what best pratice would be. I'll re-iterate, desired result. I'd like to be able to take variable "var_d", replace nulls with 0's, then sum "var_d" based at the desired level.

 

within a data step I could simply redefine a variable, and then proceed with summation

if var_d = . then var_d = 0;

 

within proc sql I become a little confused, as it seems like I'd have to accomplish everything within my select statement. I was thinking Icould redefine via case/when or ifnull or maybe coalesce...(I think)

sum(ifnull(var_d),0)) as var_sum

sum(coalesce(var_d,0)) as var_sum

sum(case when ??

 

Thoughts? If I've been unclear I can rectify

 

TS

 

 

Grand Advisor
Posts: 10,210

Re: PROC SQL vs. DATA step aggregating w/ null values

Are you working on SAS data sets or another database?

 

The only way that SAS generates a missing value using a native SAS dataset for sum in code like your example is for all values of the summed variable to be missing for the group.

 

data junk;
   input group $ x;
datalines;
a   .
a   1
a   1
b   2
b   2
b   2
;
run;

proc sql;
   create table summary as
   select group, sum(x) as total
   from junk
   group by group;
quit;

The sum for group A is 2 not missing.

The SUM for Proc Report, Tabulate, Means and Summary will likewise only give a missing for Sum statistice if all values are missing for the group when grouping occurs in the procedure.

Contributor
Posts: 47

Re: PROC SQL vs. DATA step aggregating w/ null values

per the previous question, I am working within SAS, on a SAS dataset. I'm wanting to execute proc sql against it.

 

I'm having issues with the ifnull(t1.disc_dollars, 0) as disc_sales... after reviewing the syntax in the SAS documentation this should work....

 

proc sql;
create table sql_construct_b as
select t1.user_id, t1.store_number, t1.transaction_date, t1.gross_sales, ifnull(t1.disc_dollars,0) as disc_sales
from sql_construct t1;
quit;

 

It's not recognizing the function ifnull.... which seems odd to me.

 

TS

Grand Advisor
Posts: 10,210

Re: PROC SQL vs. DATA step aggregating w/ null values

SAS Proc SQL is not exactly anyone else's SQL. IFNULL doesn't exist in SAS Proc SQL. You might try the COALESCE function.

 

BUT you have changed topic from your initial question which involved SUM.

Super User
Super User
Posts: 6,330

Re: PROC SQL vs. DATA step aggregating w/ null values

I would use the SQL aggregation function SUM() first and then deal with the missing values.

 

 

coalesce(sum(var_d),0)

 

Note that IFNULL() is not a valid SAS function.

 

Ask a Question
Discussion stats
  • 4 replies
  • 1614 views
  • 2 likes
  • 3 in conversation