<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc sql summary report in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917891#M361574</link>
    <description>&lt;P&gt;Try PROC SUMMARY. I would strongly advise you abandon the idea of doing lots of statistics across lots of groups using PROC SQL, especially for HUGE data sets. PROC SUMMARY ought to run faster than PROC SQL in this task.&amp;nbsp;Just because you can do it in PROC SQL does not mean you should do it in PROC SQL.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class z1 z2 yyyymm customer_id;
    var amount;
    output out=sums sum= n=n;
run;

proc summary data=sums nway;
    class z1 z2 yyyymm;
    var amount n;
    output out=want sum(amount n)=amount nr_rows;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the output data set WANT, the value of _FREQ_ is the number of unique customers for each grouping of Z1 Z2 YYYYMM.&lt;/P&gt;</description>
    <pubDate>Mon, 26 Feb 2024 12:14:17 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2024-02-26T12:14:17Z</dc:date>
    <item>
      <title>proc sql summary report</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917867#M361560</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;
&lt;P&gt;I have a data set with 25 million rows and 15 columns.&lt;/P&gt;
&lt;P&gt;I am using proc sql to calculate aggregation statistics .&lt;/P&gt;
&lt;P&gt;The query is running very slowly (It takes long time to complete run the query).&lt;/P&gt;
&lt;P&gt;My question- May anyone suggest another code that run quickly?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc sql;
create table want as
select  Z1,Z2,YYYYMM,
       count(distinct customer_ID) as nr_customers,&lt;BR /&gt;       count(*) as nr_rows,
	   sum(amount)/1000000 as amount_in_Millions
from  have
group by Z1,Z2,YYYYMM
;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2024 06:19:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917867#M361560</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-02-26T06:19:18Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql summary report</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917874#M361564</link>
      <description>&lt;P&gt;When dealing with optimization, a string point is to add&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options msglevel=i stimer;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And provide the log.&lt;/P&gt;
&lt;P&gt;To your code, to better understand what the potential bottlenecks.&lt;/P&gt;
&lt;P&gt;When ti comes to you code, it's not much you can about it.&lt;/P&gt;
&lt;P&gt;You could try PROC SUMMARY instead, it's quicker in some instances.&lt;/P&gt;
&lt;P&gt;Assuming your data is in Base, state if this is not the case.&lt;/P&gt;
&lt;P&gt;Make sure you can use as much RAM as possible, and potentially increasing MEMSIZE snd SORTSIZE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2024 07:43:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917874#M361564</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2024-02-26T07:43:10Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql summary report</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917891#M361574</link>
      <description>&lt;P&gt;Try PROC SUMMARY. I would strongly advise you abandon the idea of doing lots of statistics across lots of groups using PROC SQL, especially for HUGE data sets. PROC SUMMARY ought to run faster than PROC SQL in this task.&amp;nbsp;Just because you can do it in PROC SQL does not mean you should do it in PROC SQL.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class z1 z2 yyyymm customer_id;
    var amount;
    output out=sums sum= n=n;
run;

proc summary data=sums nway;
    class z1 z2 yyyymm;
    var amount n;
    output out=want sum(amount n)=amount nr_rows;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the output data set WANT, the value of _FREQ_ is the number of unique customers for each grouping of Z1 Z2 YYYYMM.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2024 12:14:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917891#M361574</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2024-02-26T12:14:17Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql summary report</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917915#M361584</link>
      <description>&lt;P&gt;Try a DATA step method:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort
  data=have (keep=z1 z2 yyyymm customer_id amount)
  out=int
;
by z1 z2 yyyymm customer_id;
run;

data want;
set int;
by z1 z2 yyyymm customer_id;
retain nr_rows amount_in_millions;
if first.yyyymm
then do;
  nr_rows = 0;
  amount_in_millions = 0;
end;
if first.customer_id then nr_rows + 1;
amount_in_millions + amount;
if last.yyyymm
then do;
  amount_in_millions = amount_in_millions / 1000000;
  output;
end;
drop amount customer_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 26 Feb 2024 15:19:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-summary-report/m-p/917915#M361584</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2024-02-26T15:19:58Z</dc:date>
    </item>
  </channel>
</rss>

