<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using views to improve performance in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Using-views-to-improve-performance/m-p/400014#M96932</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using views in some data steps, in order to reduce I/O operations and improve performance when using large datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have tested this in some programs. My program has a result dataset and others that are intermediate (stored in the WORK library).&lt;/P&gt;
&lt;P&gt;I am replacing the intermediate datasets with views. I have found that if I replace an intermediate dataset with a view in one or two steps, performance is better, but if I use views in more than two steps, performance is the same as with datasets or even worse. In both situations I create a result dataset (not a view) in the final step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like to know the best practices for using views: whether they are good only for one or two steps, or only for some types of operations. I am doing DATA/SET steps that add new fields, joins, and aggregations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
    <pubDate>Sat, 30 Sep 2017 17:42:54 GMT</pubDate>
    <dc:creator>juanvg1972</dc:creator>
    <dc:date>2017-09-30T17:42:54Z</dc:date>
    <item>
      <title>Using views to improve performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-views-to-improve-performance/m-p/400014#M96932</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using views in some data steps, in order to reduce I/O operations and improve performance when using large datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have tested this in some programs. My program has a result dataset and others that are intermediate (stored in the WORK library).&lt;/P&gt;
&lt;P&gt;I am replacing the intermediate datasets with views. I have found that if I replace an intermediate dataset with a view in one or two steps, performance is better, but if I use views in more than two steps, performance is the same as with datasets or even worse. In both situations I create a result dataset (not a view) in the final step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like to know the best practices for using views: whether they are good only for one or two steps, or only for some types of operations. I am doing DATA/SET steps that add new fields, joins, and aggregations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Sat, 30 Sep 2017 17:42:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-views-to-improve-performance/m-p/400014#M96932</guid>
      <dc:creator>juanvg1972</dc:creator>
      <dc:date>2017-09-30T17:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: Using views to improve performance</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-views-to-improve-performance/m-p/400019#M96935</link>
      <description>&lt;P&gt;If you re-use a data set view, then it is reconstructed each time from the original source, which is probably why you don't see expected benefits sometimes when using the view multiple times.&amp;nbsp; This would be especially true when the view is a merge of multiple datasets, or the view is a small subset of the original.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One way to mitigate this problem is to &lt;EM&gt;&lt;STRONG&gt;simultaneously define a matching data set view (VNEED) and a data set file (NEED)&lt;/STRONG&gt;&lt;/EM&gt;, as below. The first use (proc freq) calls VNEED, which in the background reads BIGDATA, streams the selected observations through VNEED to the procedure, and writes NEED. Note that proc freq doesn't have to wait for dataset NEED to be completely generated. The subsequent proc univariate then reads the data set file NEED.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; data need vneed / view=vneed;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; set bigdata;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; where ....;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ....&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; proc freq data=vneed;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; tables .... ;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; proc univariate data=need;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ....&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
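&lt;P&gt;For a concrete version of the pattern above, here is a minimal sketch. It assumes SASHELP.CARS stands in for BIGDATA; the WHERE condition and the analysis variables (MSRP, TYPE) are illustrative choices only and are not part of the original program.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; /* Assumption: SASHELP.CARS plays the role of BIGDATA */&lt;/P&gt;
&lt;P&gt;&amp;nbsp; data need vneed / view=vneed;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; set sashelp.cars;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; where msrp &amp;gt; 30000; /* illustrative subset condition */&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; /* First access: executing VNEED reads the source and also writes WORK.NEED */&lt;/P&gt;
&lt;P&gt;&amp;nbsp; proc freq data=vneed;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; tables type;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; /* Later accesses: read the stored data set file instead of the view */&lt;/P&gt;
&lt;P&gt;&amp;nbsp; proc univariate data=need;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; var msrp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Keep in mind that the DATA step only stores the view definition. NEED does not exist until VNEED is actually executed (here by proc freq), so any step that reads NEED has to come after the first use of the view.&lt;/P&gt;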
&lt;OL&gt;
&lt;LI&gt;The input/output cost of the above is:
&lt;OL&gt;
&lt;LI&gt;BIGDATA is read once,&lt;/LI&gt;
&lt;LI&gt;NEED is written to disk once, and&lt;/LI&gt;
&lt;LI&gt;NEED is read from disk once.&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;But if you used the view both times, then:
&lt;OL&gt;
&lt;LI&gt;BIGDATA is read twice.&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;Or if you used the data set file NEED both times, then:
&lt;OL&gt;
&lt;LI&gt;BIGDATA is read once,&lt;/LI&gt;
&lt;LI&gt;NEED is written to disk once, and&lt;/LI&gt;
&lt;LI&gt;NEED is read from disk twice.&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;So, depending on the relative sizes of BIGDATA and NEED, you might save considerable time by using the view only for the first access and the data set file for the second and any later accesses.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Sep 2017 18:43:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-views-to-improve-performance/m-p/400019#M96935</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-09-30T18:43:58Z</dc:date>
    </item>
  </channel>
</rss>

