<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC SQL update always slow? Why? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507701#M136299</link>
    <description>&lt;P&gt;But then that is using the SQL Server parser, and pass through is it not?&amp;nbsp; &amp;nbsp;My impression of the question was that he was using SQL within SAS alone.&amp;nbsp; If its running on a DB then yes, it should be a lot quicker.&lt;/P&gt;</description>
    <pubDate>Fri, 26 Oct 2018 10:49:44 GMT</pubDate>
    <dc:creator>RW9</dc:creator>
    <dc:date>2018-10-26T10:49:44Z</dc:date>
    <item>
      <title>PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507657#M136281</link>
      <description>&lt;P&gt;I am running some proc sql statements which first uses alters the table a new column and then updates the column.&lt;/P&gt;&lt;P&gt;This is taking forever, it is faster to do a select all, even though it contains 50+ columns and create a new column in the select statement. This does not make sense.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried finding an answer to this and the only thing I found was this comment&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;"&lt;SPAN class="comment-copy"&gt;FYI - SQL Update unfortunately as supported in SAS is almost always terrible. SQL has many uses in SAS, but update should almost never be one, unless it's a single value update that's not sourced from a table. For whatever reason, that&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;I&gt;always&lt;/I&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;seems to be incredibly slow, even for not-all-that-large datasets, and even for updates that in SQL Server are quite quick.&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;–&amp;nbsp;&lt;/SPAN&gt;&lt;A title="55,090 reputation" href="https://stackoverflow.com/users/1623007/joe" target="_blank"&gt;Joe&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="comment-date"&gt;&lt;SPAN class="relativetime-clean"&gt;&lt;A href="https://stackoverflow.com/questions/30708324/proc-sql-update-efficiency-for-large-datasets#comment49571438_30708324" target="_blank"&gt;Jun 10 '15 at 14:27&lt;/A&gt;"&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="comment-date"&gt;&lt;SPAN class="relativetime-clean"&gt;From&amp;nbsp;&lt;A href="https://stackoverflow.com/questions/30708324/proc-sql-update-efficiency-for-large-datasets" target="_blank"&gt;https://stackoverflow.com/questions/30708324/proc-sql-update-efficiency-for-large-datasets&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="comment-date"&gt;&lt;SPAN class="relativetime-clean"&gt;How come is update so inefficient?&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 07:16:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507657#M136281</guid>
      <dc:creator>Jeg123</dc:creator>
      <dc:date>2018-10-26T07:16:11Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507669#M136284</link>
      <description>&lt;P&gt;Proc SQL can be slow all the time, especially with large data.&amp;nbsp; The reason, its not the primary language, it is a sub component.&amp;nbsp; Its provide as a procedure with a basic SQL compiler.&amp;nbsp; Base SAS is the programming language, and is optimised for data processing.&amp;nbsp; It is always recommended to use Base SAS above SQL, unless there is any specific benefits to using SQL.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for why, put _method and _tree on and see what the SQL compiler is doing;&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www2.sas.com/proceedings/sugi30/101-30.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/sugi30/101-30.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;I suspect its likely reading/writing each observation individually as there is no PDV.&amp;nbsp; Using a datastep reads each observation into the PDV then writes it out, being more efficient.&amp;nbsp; SQL does not have that, and has none of the tricks that databases use to speed up code running.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another point, if your aiming to use SQL a lot, then look at normalising your data.&amp;nbsp; SQL works best with few columns and many rows.&amp;nbsp; 50+ columns is a lot of columns.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 08:08:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507669#M136284</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-10-26T08:08:28Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507696#M136295</link>
      <description>&lt;P&gt;Please post your full log including elapsed time and row numbers.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've had good success getting reasonable performance with SQL Server updates by choosing an appropriate method along with tweaking database update options.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 10:37:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507696#M136295</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2018-10-26T10:37:22Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507701#M136299</link>
      <description>&lt;P&gt;But then that is using the SQL Server parser, and pass through is it not?&amp;nbsp; &amp;nbsp;My impression of the question was that he was using SQL within SAS alone.&amp;nbsp; If its running on a DB then yes, it should be a lot quicker.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 10:49:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507701#M136299</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-10-26T10:49:44Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507764#M136317</link>
      <description>&lt;P&gt;Here is a example&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Create random data set&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data A;
call streaminit(123);     
do i = 1 to 100000000;
   u = rand("Uniform");    
   output;
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Some random query which creates a new column and also writes a whole new data set&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table B AS
	SELECT
		*
	,	1 AS id
FROM A
;quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Perform same as above, but using alter table&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
alter table A
	add id num
;quit;

proc sql;
update A
	set id = monotonic()
;quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Solution 1: 10.2 seconds&lt;/P&gt;&lt;P&gt;Solution 2: 44 seconds&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It takes 4x longer, even though the first solution also has to copy all the additional columns. This is also exponentially increasing, increasing to 100 million rows it takes 40 seconds vs 7.5 minutes.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 14:44:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507764#M136317</guid>
      <dc:creator>Jeg123</dc:creator>
      <dc:date>2018-10-26T14:44:30Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507917#M136357</link>
      <description>&lt;P&gt;Monotonic is an unsupported SAS function. Does it perform similarly with a supported function like rand?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I confess I have only used your solution 1 so I've never struck the problem in solution 2. Using SET in SAS SQL is seriously limiting as a separate sub-query is required to populate each SET from another dataset. There are better ways to update in SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I update mainly with DATA steps and that certainly provides best performance with SAS data.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 26 Oct 2018 22:54:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507917#M136357</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2018-10-26T22:54:01Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507952#M136371</link>
      <description>&lt;P&gt;I see that my example was wrong, as I am using a constant 1 in one example and then monotonic() in another. But you can replace the&lt;/P&gt;&lt;P&gt;monotonic with 1 and experience the same slowness.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do you mind showing a short example how you do when you update a column in data step?&lt;/P&gt;</description>
      <pubDate>Sat, 27 Oct 2018 06:12:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/507952#M136371</guid>
      <dc:creator>Jeg123</dc:creator>
      <dc:date>2018-10-27T06:12:32Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508264#M136480</link>
      <description>&lt;P&gt;The equivalent of the monotonic() example would be:&lt;/P&gt;
&lt;PRE&gt;data a;&lt;BR /&gt;  set a;&lt;BR /&gt;  id=_n_;&lt;BR /&gt;run;&lt;/PRE&gt;
&lt;P&gt;_n_ is an automatic internal variable which is observation number.&lt;/P&gt;
&lt;P&gt;It is very easy to update, alter, manipulate, data using datastep language - it is at the end of the day&amp;nbsp;&lt;U&gt;&lt;STRONG&gt;the basis of the whole SAS system.&lt;/STRONG&gt;&lt;/U&gt;&amp;nbsp; The SQL is a simple ANSI parser component accessed through Base SAS and hence is never going to be a good as the base language.&amp;nbsp; Think of it like a VB script to generate a C++ program.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Oct 2018 08:32:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508264#M136480</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-10-29T08:32:39Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508276#M136489</link>
      <description>&lt;P&gt;Aware of the _n_. But is this really the "proper" way to add a column using&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data a;
  set a;
  id=_n_;
run;&lt;/PRE&gt;&lt;P&gt;From my understanding, the above will overwrite a total new file, which seems inefficient when only adding one column. Or is this the "correct" way to add a new column?&lt;/P&gt;</description>
      <pubDate>Mon, 29 Oct 2018 09:15:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508276#M136489</guid>
      <dc:creator>Jeg123</dc:creator>
      <dc:date>2018-10-29T09:15:39Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL update always slow? Why?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508281#M136494</link>
      <description>&lt;P&gt;That depends on scenario, however in general, yes, that is the proper way to add variables.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In certain circumstances the datastep update function can be applied:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202975.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And in big data scenarios then the whole programming part tends to be quite different.&amp;nbsp; However in most programming, the datastep is the way to go.&amp;nbsp; It will also be the most efficient as its a one directional stream.&amp;nbsp; SQL creates intermediary steps where the data is sorted, merged and otherwise processed, which can lead to lots of reads and writes.&amp;nbsp; On a proper database there are processes in place to handle this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can find a whole raft of videos about most SAS related items here:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://video.sas.com/category/videos/how-to-tutorials" target="_blank"&gt;https://video.sas.com/category/videos/how-to-tutorials&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 29 Oct 2018 09:32:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-update-always-slow-Why/m-p/508281#M136494</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-10-29T09:32:59Z</dc:date>
    </item>
  </channel>
</rss>

