<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROC SQL vs. DATA step aggregating w/ null values in SAS Studio</title>
    <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276368#M628</link>
    <description>&lt;P&gt;Howdy,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is probably common when coding but&amp;nbsp;I find myself switching between using a sql or a sas/data step depending on what I am most comfortable with. However, one of my pet-peeves is wanting to understand how to accomplish the same code output via both avenues. An example would be aggregation of a metric based on criteria via "group by" with SQL and "retain" with SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Currently I'm using sql trying to select multiple variables, sum a couple of them on the fly, and deal with nulls, within my select statement. Using the example below, "var_d" contains both numeric values and null (.) values. Summing var_d without addressing the nulls produces nulls, numeric + null = null.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;/P&gt;
&lt;P&gt;create table demo_table as&lt;/P&gt;
&lt;P&gt;select t1.var_a, t1.var_b, t1.var_c, sum(t1.var_d) as var_sumd&lt;/P&gt;
&lt;P&gt;from work.another_table t1&lt;/P&gt;
&lt;P&gt;group by t1.var_a, t1.var_b;&lt;/P&gt;
&lt;P&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;Wondering what best practice would be. To reiterate the desired result: I'd like to take variable "var_d", replace nulls with 0's, then sum "var_d" at the desired level.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;within a data step I could simply redefine a variable, and then proceed with summation&lt;/P&gt;
&lt;P&gt;if var_d = . then var_d = 0;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;within proc sql I become a little confused, as it seems like I'd have to accomplish everything within my select statement. I was thinking I could redefine via case/when or ifnull or maybe coalesce...(I think)&lt;/P&gt;
&lt;P&gt;sum(ifnull(var_d,0)) as var_sum&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pln"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;(&lt;/SPAN&gt;&lt;SPAN class="kwd"&gt;coalesce&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;(&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;var_d&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;,&lt;/SPAN&gt;&lt;SPAN class="lit"&gt;0&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;)) as var_sum&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;sum(case when ??&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;Thoughts? If I've been unclear I can rectify&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;TS&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 09 Jun 2016 21:03:06 GMT</pubDate>
    <dc:creator>GalacticAbacus</dc:creator>
    <dc:date>2016-06-09T21:03:06Z</dc:date>
    <item>
      <title>PROC SQL vs. DATA step aggregating w/ null values</title>
      <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276368#M628</link>
      <description>&lt;P&gt;Howdy,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is probably common when coding but&amp;nbsp;I find myself switching between using a sql or a sas/data step depending on what I am most comfortable with. However, one of my pet-peeves is wanting to understand how to accomplish the same code output via both avenues. An example would be aggregation of a metric based on criteria via "group by" with SQL and "retain" with SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Currently I'm using sql trying to select multiple variables, sum a couple of them on the fly, and deal with nulls, within my select statement. Using the example below, "var_d" contains both numeric values and null (.) values. Summing var_d without addressing the nulls produces nulls, numeric + null = null.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;/P&gt;
&lt;P&gt;create table demo_table as&lt;/P&gt;
&lt;P&gt;select t1.var_a, t1.var_b, t1.var_c, sum(t1.var_d) as var_sumd&lt;/P&gt;
&lt;P&gt;from work.another_table t1&lt;/P&gt;
&lt;P&gt;group by t1.var_a, t1.var_b;&lt;/P&gt;
&lt;P&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;Wondering what best practice would be. To reiterate the desired result: I'd like to take variable "var_d", replace nulls with 0's, then sum "var_d" at the desired level.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;within a data step I could simply redefine a variable, and then proceed with summation&lt;/P&gt;
&lt;P&gt;if var_d = . then var_d = 0;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;within proc sql I become a little confused, as it seems like I'd have to accomplish everything within my select statement. I was thinking I could redefine via case/when or ifnull or maybe coalesce...(I think)&lt;/P&gt;
&lt;P&gt;sum(ifnull(var_d,0)) as var_sum&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pln"&gt;sum&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;(&lt;/SPAN&gt;&lt;SPAN class="kwd"&gt;coalesce&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;(&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;var_d&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;,&lt;/SPAN&gt;&lt;SPAN class="lit"&gt;0&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;)) as var_sum&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;sum(case when ??&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;Thoughts? If I've been unclear I can rectify&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="pun"&gt;TS&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2016 21:03:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276368#M628</guid>
      <dc:creator>GalacticAbacus</dc:creator>
      <dc:date>2016-06-09T21:03:06Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL vs. DATA step aggregating w/ null values</title>
      <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276397#M629</link>
      <description>&lt;P&gt;Are you working on SAS data sets or another database?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;With a native SAS data set, the only way SUM in code like your example returns a missing value is when every value of the summed variable is missing for the group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data junk;
   input group $ x;
datalines;
a   .
a   1
a   1
b   2
b   2
b   2
;
run;

proc sql;
   create table summary as
   select group, sum(x) as total
   from junk
   group by group;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The sum for group a is 2, not missing.&lt;/P&gt;
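&lt;P&gt;To see the all-missing case as well, here is a small sketch along the same lines. The data set junk2, the variable grp, the table summary2 and the extra group c are made up for illustration; group c has only missing values, so it should be the only group whose total comes back missing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* junk2, grp and group c are hypothetical, for illustration only */
data junk2;
   input grp $ x;
datalines;
a   .
a   1
a   1
c   .
c   .
;
run;

proc sql;
   create table summary2 as
   select grp, sum(x) as total   /* SUM skips missing x within each group */
   from junk2
   group by grp;
quit;&lt;/CODE&gt;&lt;/PRE&gt;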
&lt;P&gt;The SUM for Proc Report, Tabulate, Means and Summary will likewise only give a missing for the Sum statistic if all values are missing for the group when grouping occurs in the procedure.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jun 2016 02:01:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276397#M629</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-06-10T02:01:03Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL vs. DATA step aggregating w/ null values</title>
      <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276496#M632</link>
      <description>&lt;P&gt;Per the previous question, I am working within SAS, on a SAS dataset. I want to run proc sql against it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm having issues with the ifnull(t1.disc_dollars, 0) as disc_sales... after reviewing the syntax in the SAS documentation this should work....&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;BR /&gt;create table sql_construct_b as&lt;BR /&gt;select t1.user_id, t1.store_number, t1.transaction_date, t1.gross_sales, ifnull(t1.disc_dollars,0) as disc_sales&lt;BR /&gt;from sql_construct t1;&lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's not recognizing the function ifnull.... which seems odd to me.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;TS&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jun 2016 13:44:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276496#M632</guid>
      <dc:creator>GalacticAbacus</dc:creator>
      <dc:date>2016-06-10T13:44:30Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL vs. DATA step aggregating w/ null values</title>
      <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276526#M633</link>
      <description>&lt;P&gt;SAS Proc SQL is not exactly anyone else's SQL. IFNULL doesn't exist in SAS Proc SQL. You might try the COALESCE function.&lt;/P&gt;
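&lt;P&gt;A minimal sketch of the query from the previous post with COALESCE in place of IFNULL. The two-row sql_construct data set is a made-up stand-in so the example runs on its own; the column names are taken from that post.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* sql_construct here is a tiny hypothetical stand-in for the real table */
data sql_construct;
   input user_id store_number transaction_date :date9. gross_sales disc_dollars;
   format transaction_date date9.;
datalines;
1 10 09JUN2016 25.00 5.00
2 10 09JUN2016 40.00 .
;
run;

proc sql;
   create table sql_construct_b as
   select t1.user_id, t1.store_number, t1.transaction_date, t1.gross_sales,
          coalesce(t1.disc_dollars, 0) as disc_sales   /* 0 where disc_dollars is missing */
   from sql_construct t1;
quit;&lt;/CODE&gt;&lt;/PRE&gt;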
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BUT you have changed topic from your initial question, which involved SUM.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jun 2016 15:25:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276526#M633</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2016-06-10T15:25:45Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL vs. DATA step aggregating w/ null values</title>
      <link>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276548#M634</link>
      <description>&lt;P&gt;I would use the SQL aggregation function SUM() first and then deal with the missing values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;coalesce(sum(var_d),0)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note that IFNULL() is not a valid SAS function.&lt;/P&gt;
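&lt;P&gt;As a rough sketch of how coalesce(sum(var_d),0) could look in a full query, next to a DATA step that should give the same totals. The small another_table below is a made-up stand-in for the table in the question, and demo_sql, sorted and demo_ds are hypothetical names.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* another_table is a hypothetical stand-in for the table in the question */
data another_table;
   input var_a $ var_b $ var_d;
datalines;
x p 1
x p .
x q .
y p 3
;
run;

/* PROC SQL: aggregate first, then turn an all-missing total into 0 */
proc sql;
   create table demo_sql as
   select var_a, var_b, coalesce(sum(var_d), 0) as var_sumd
   from another_table
   group by var_a, var_b;
quit;

/* DATA step: sort, then accumulate within each BY group */
proc sort data=another_table out=sorted;
   by var_a var_b;
run;

data demo_ds (keep=var_a var_b var_sumd);
   set sorted;
   by var_a var_b;
   if first.var_b then var_sumd = 0;   /* reset at the start of each var_a/var_b group */
   var_sumd + var_d;                   /* sum statement: retained, skips missing var_d */
   if last.var_b then output;          /* one row per group */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With this sample data both demo_sql and demo_ds should hold 1 for x/p, 0 for x/q (all missing) and 3 for y/p. Because SUM() already skips missing values, sum(coalesce(var_d,0)) would give the same totals; the coalesce only matters for a group where every value is missing.&lt;/P&gt;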
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jun 2016 16:42:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/PROC-SQL-vs-DATA-step-aggregating-w-null-values/m-p/276548#M634</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2016-06-10T16:42:39Z</dc:date>
    </item>
  </channel>
</rss>

