<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: query performance in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235205#M43060</link>
    <description>&lt;P&gt;You can do a pass through to the oracle database:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001044413.htm" target="_self"&gt;https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001044413.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then sql the statements will be executed directly on oracle.&lt;/P&gt;</description>
    <pubDate>Wed, 18 Nov 2015 12:19:59 GMT</pubDate>
    <dc:creator>Eva</dc:creator>
    <dc:date>2015-11-18T12:19:59Z</dc:date>
    <item>
      <title>query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235204#M43059</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I am running one query in oracle that takes 20-25 minutes to execute but when I execute the same in SAS it takes 4 hours.&lt;BR /&gt;Can anyone please suggest how to check SAS longops and what are the ways to increase query performance.&lt;BR /&gt;Below is Oracle and SAS code&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Oracle Code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;create table total_db nologging as&lt;BR /&gt;select * from Sales where CAMP='FPP';&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;select product&lt;BR /&gt;,count(unique case when INITCAP(age_ranges)=INITCAP('GenY') then id end) "GenY and Younger[18 thru 35)"&lt;BR /&gt;,count(unique case when INITCAP(age_ranges)=INITCAP('GenX') then id end) "GenX [35 thru 50)"&lt;BR /&gt;,count(unique case when INITCAP(age_ranges)=INITCAP('Boomers') then id end) "Boomers [50 thru 68)"&lt;BR /&gt;,count(unique case when INITCAP(age_ranges)=INITCAP('Older') then id end) "Older [68+)"&lt;BR /&gt;,count(unique case when INITCAP(age_ranges)=INITCAP('UNKNOWN') then id end) "Unknown or &amp;lt; 18"&lt;BR /&gt;,count(unique case when PRES_OF_CHILDREN='Y' then id end) "Yes"&lt;BR /&gt;,count(unique case when PRES_OF_CHILDREN='N' then id end) "No"&lt;BR /&gt;from total_db&lt;BR /&gt;group by product;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SAS Code;&lt;/P&gt;
&lt;P&gt;libname MYDB 'L:\REP';&lt;BR /&gt;libname mydblib oracle user='ABCD' password='ABCD123' path='prod2' schema='CIRC';&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;BR /&gt;create table MYDB.Total_DB as&lt;BR /&gt;select * from Sales where CAMP='FPP';&lt;BR /&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;libname MYDB 'L:\REP';&lt;BR /&gt;ods CSV file="L:\BIA\brand_age.csv";&lt;/P&gt;
&lt;P&gt;proc sql;&lt;BR /&gt;select product,&lt;BR /&gt;count(distinct case when PROPCASE(age_ranges)=PROPCASE('GenY') then id end) as GenY&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('GenX') then id end) as GenX&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('Boomers') then id end) as Boomers&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('Older') then id end) as Older&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('UNKNOWN') then id end) as Unknown_age&lt;BR /&gt;,count(distinct case when PRES_OF_CHILDREN='Y' then id end)as Pres_Yes&lt;BR /&gt;,count(distinct case when PRES_OF_CHILDREN='N' then id end) as pres_No&lt;BR /&gt;from MYDB.total_db&lt;BR /&gt;group by product;&lt;BR /&gt;quit;&lt;BR /&gt;ods CSV close;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please suggest where is the performance issue and how to monitor SAS query status&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 12:09:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235204#M43059</guid>
      <dc:creator>CG1</dc:creator>
      <dc:date>2015-11-18T12:09:56Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235205#M43060</link>
      <description>&lt;P&gt;You can do a pass through to the oracle database:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001044413.htm" target="_self"&gt;https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001044413.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then sql the statements will be executed directly on oracle.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 12:19:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235205#M43060</guid>
      <dc:creator>Eva</dc:creator>
      <dc:date>2015-11-18T12:19:59Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235206#M43061</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;libname MYDB 'L:\REP';&lt;BR /&gt;ods CSV file="L:\BIA\brand_age.csv";&lt;/P&gt;
&lt;P&gt;proc sql;&lt;BR /&gt;select product,&lt;BR /&gt;count(distinct case when PROPCASE(age_ranges)=PROPCASE('GenY') then id end) as GenY&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('GenX') then id end) as GenX&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('Boomers') then id end) as Boomers&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('Older') then id end) as Older&lt;BR /&gt;,count(distinct case when PROPCASE(age_ranges)=PROPCASE('UNKNOWN') then id end) as Unknown_age&lt;BR /&gt;,count(distinct case when PRES_OF_CHILDREN='Y' then id end)as Pres_Yes&lt;BR /&gt;,count(distinct case when PRES_OF_CHILDREN='N' then id end) as pres_No&lt;BR /&gt;from MYDB.total_db&lt;BR /&gt;group by product;&lt;BR /&gt;quit;&lt;BR /&gt;ods CSV close;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="line-height: 20px;"&gt;above is the query that is taking too much time to execute.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;MYDB.total_db is dataset in SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Problem is not table creation when I am fetching data from&amp;nbsp;&lt;SPAN&gt;MYDB.total_db then it takes 4 hours&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 12:26:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235206#M43061</guid>
      <dc:creator>CG1</dc:creator>
      <dc:date>2015-11-18T12:26:37Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235226#M43062</link>
      <description>&lt;P&gt;It would be better if you posted a sample of your data in the form of a datastep so that we could run it and see. &amp;nbsp;At a guess I would say that the reason its taking so long is all the sub nesting. &amp;nbsp;I would suggest separting the logic out into two blocks. &amp;nbsp;Block one consists of adding flags to the data, based on the logic you supplied, in a datastep. &amp;nbsp;This should be pretty quick, something like:&lt;/P&gt;
&lt;P&gt;data inter;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; set have;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; if upcase(age_ranges)="GENY" then flag1=1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; if upcase(age_ranges)="GENX" then flag2=1;&lt;/P&gt;
&lt;P&gt;...&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then run one of the&amp;nbsp;stats methods on the data, for example from the documentation:&lt;/P&gt;
&lt;P&gt;ods output nlevels=nlevels;&lt;BR /&gt;proc freq data=sashelp.class nlevels;&lt;BR /&gt;&amp;nbsp; tables name age sex;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One final thing you might want to try is to do the aggregates separately and then join them back to a distinct list of products. &amp;nbsp;I.e. In your code each part of the code - the count, the distinct, the propcase() are all done on every row, do them in a subquery and they should only be done on the relevant data.&lt;/P&gt;
&lt;PRE&gt;proc sql;
  create table WANT as
  select  A.PRODUCT
          ,B.GENX
          ,C.GENY
          ...
  from    (select distinct PRODUCT from HAVE) A
  left join (select distinct PRODUCT,count(distinct ID) as GENX from HAVE where upcase(AGE_RANGES)="GENX") B
  on      A.PRODUCT=B.PRODUCT
  left join (select distinct PRODUCT,count(distinct ID) as GENY from HAVE where upcase(AGE_RANGES)="GENY") C
  on      A.PRODUCT=C.PRODUCT
  ...;
quit;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 13:58:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235226#M43062</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2015-11-18T13:58:47Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235239#M43064</link>
      <description>&lt;P&gt;Why are you copying the data to SAS if that takes longer time? Do the query in Oracle, and direct the output&amp;nbsp;directly to your ODS destination.&lt;/P&gt;
&lt;P&gt;Not sure if PROPCASE can be translated automatically to PL SQL, but you should be able to use UPCASE() instead.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 14:29:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235239#M43064</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2015-11-18T14:29:30Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235268#M43071</link>
      <description>&lt;P&gt;As MYDB.total_db is a local SAS File I suggest writing that to the oracle DB (create table) and then submit the sql statements with oracle pass through. Then everything is done on the database which is much faster.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 15:33:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235268#M43071</guid>
      <dc:creator>Eva</dc:creator>
      <dc:date>2015-11-18T15:33:50Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235273#M43073</link>
      <description>&lt;P&gt;The problem there though, is you then have the overhead of writing the data to the database, and then sending the data back out from the database. &amp;nbsp;So savings on code execution would be lost in read/write. &amp;nbsp;Of course if the data is in the database already, then your right it does make sense to o the processing there and export directly to CSV from the database.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 15:42:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235273#M43073</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2015-11-18T15:42:08Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235276#M43075</link>
      <description>&lt;P&gt;You're right it depends on read/write time. Yet often it is worth trying that's what my experience is. Also depends on size of table. Hopefully the local one is smaller &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 15:52:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235276#M43075</guid>
      <dc:creator>Eva</dc:creator>
      <dc:date>2015-11-18T15:52:31Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235310#M43088</link>
      <description>&lt;P&gt;Ok, there are&amp;nbsp;some things that might be checked.&lt;/P&gt;
&lt;P&gt;First, the performance of the SAS server.&lt;/P&gt;
&lt;P&gt;options fullstimer msglevel=i;&lt;/P&gt;
&lt;P&gt;will give you&amp;nbsp;information on what resources are consumed.&lt;/P&gt;
&lt;P&gt;Compare that with your SORTSIZE and MEMSIZE setting for the SAS session, and of course, physical memory available.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To understand how PROC SQL is planning the query, add the _method option to the PROC SQL statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The count(distinct...) construct is quite&amp;nbsp;expensive to execute, so I'm pretty sure that there are things to trim here.&lt;/P&gt;
&lt;P&gt;You got an example from &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13521"&gt;@RW﻿&lt;/a&gt; already. Here's another example (not SAS specific):&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.periscopedata.com/blog/use-subqueries-to-count-distinct-50x-faster.html" target="_blank"&gt;https://www.periscopedata.com/blog/use-subqueries-to-count-distinct-50x-faster.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Nov 2015 17:11:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235310#M43088</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2015-11-18T17:11:32Z</dc:date>
    </item>
    <item>
      <title>Re: query performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235402#M43098</link>
      <description>&lt;P&gt;Remove the ODS CSV if it is a large file. Writing large CSV files via ODS is not recommended as it is very resource hungry. An alternative is PROC EXPORT.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Nov 2015 00:38:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/query-performance/m-p/235402#M43098</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2015-11-19T00:38:54Z</dc:date>
    </item>
  </channel>
</rss>

