<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Reset observation pointer in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719252#M222688</link>
    <description>&lt;P&gt;What is it that you are actually trying to do? How many DATEs are there?&amp;nbsp; Once you've split VALUE by all of those "factors", what are you going to do with it? Do you really need to write that large Cartesian product back out to a disk file?&amp;nbsp; Why not finish your calculations and save only the statistic that summarizes the result?&lt;/P&gt;
</description>
    <pubDate>Mon, 15 Feb 2021 00:37:13 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2021-02-15T00:37:13Z</dc:date>
    <item>
      <title>Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718889#M222516</link>
      <description>&lt;P&gt;What I need to do boils down (more or less) to cross joining two huge datasets &lt;STRONG&gt;A&lt;/STRONG&gt; and &lt;STRONG&gt;B&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;The first approach is, of course:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc sql;
  create table AB as select * from A join B on 1;
quit;&lt;/PRE&gt;&lt;P&gt;But this is infeasible due to memory restrictions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The obvious alternative is to loop, for each observation of &lt;STRONG&gt;A&lt;/STRONG&gt;, through each observation of &lt;STRONG&gt;B&lt;/STRONG&gt;. My current naive approach (ab)uses the &lt;STRONG&gt;POINT&lt;/STRONG&gt; option of the &lt;STRONG&gt;SET&lt;/STRONG&gt; statement:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data AB;
  set A;
  do point_B = 1 to nobs_B;
    set B point = point_B nobs = nobs_B;
    output;
  end;
run;&lt;/PRE&gt;&lt;P&gt;This works - but is incredibly slow. Certainly, this can be remedied by replacing the random access of &lt;STRONG&gt;POINT&lt;/STRONG&gt; by sequential access. Unfortunately, I was not able to figure out how this can be achieved. So any suggestions are greatly appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For instance, what would be nice to have is some option &lt;STRONG&gt;RESET&lt;/STRONG&gt; (similar to &lt;STRONG&gt;KEYRESET&lt;/STRONG&gt;) making the following code work after uncommenting &lt;CODE class=" language-sas"&gt;reset = end_A&lt;/CODE&gt;:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data AB;
  do while(not end_A);
    set A end = end_A;
    do while(not end_B);
      set B end = end_B /* reset = end_A */;
      output;
    end;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;(If there is no other solution, I might have to split &lt;STRONG&gt;A&lt;/STRONG&gt; into multiple smaller datasets first...)&lt;/P&gt;</description>
      <pubDate>Fri, 12 Feb 2021 13:54:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718889#M222516</guid>
      <dc:creator>Hugochum</dc:creator>
      <dc:date>2021-02-12T13:54:53Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718894#M222520</link>
      <description>&lt;P&gt;My first two questions when I see a problem like this are:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1) Why do you want to create a Cartesian product? Usually this is part of a bigger problem.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2) What does your data look like?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Feb 2021 14:24:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718894#M222520</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2021-02-12T14:24:23Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718906#M222527</link>
      <description>&lt;P&gt;Roughly, the data has the following form:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data A;
  /* ... */
  client_id = 1234567890; value = 42; output;
  /* ... */
run;

data B;
  /* ... */
  date = '01JAN1997'd; factor = 0.2; output;
  /* ... */
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The dataset to store should list a &lt;CODE class=" language-sas"&gt;score&lt;/CODE&gt; for each &lt;CODE class=" language-sas"&gt;client_id&lt;/CODE&gt; and &lt;CODE class=" language-sas"&gt;date&lt;/CODE&gt;, like so:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data AB;
  set A;
  do point_B = 1 to nobs_B;
    set B point = point_B nobs = nobs_B;
    score = factor * value;
    output;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There is probably an ad hoc solution for this concrete problem (like splitting &lt;STRONG&gt;A&lt;/STRONG&gt; into chunks). But it would also be instructive for future applications to know how to loop through multiple datasets efficiently.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Feb 2021 15:04:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/718906#M222527</guid>
      <dc:creator>Hugochum</dc:creator>
      <dc:date>2021-02-12T15:04:04Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719245#M222681</link>
      <description>&lt;P&gt;1. What sort of volumes are we talking about here? Which is the smaller table (in bytes: number of rows * row length)?&lt;/P&gt;
&lt;P&gt;2. What will the resulting Cartesian product data set be used for?&lt;/P&gt;
&lt;P&gt;3. When dealing with very large tables, it is usually a good idea to store the data set in SPDE format with binary compression, in order to save space and reduce I/O.&lt;/P&gt;
&lt;P&gt;4. To achieve your goal you can replace the POINT= iteration with a hash table iteration. This is much faster.&lt;/P&gt;
&lt;P&gt;5. If you don't have enough memory to load either of the data sets into a hash table (hash tables have a sizeable overhead), try using POINT= after loading a table into memory using SASFILE.&lt;/P&gt;
&lt;P&gt;6. If you can't load a full table in memory, you'll have to split the join into smaller chunks.&lt;/P&gt;
&lt;P&gt;7. You could also not join anything:&lt;/P&gt;
&lt;P&gt;- create a format that maps every date to its value of FACTOR (build it from the data with the CNTLIN= option of PROC FORMAT).&lt;BR /&gt;- add a DO loop in the data step that loops through all the known dates and retrieves the value of FACTOR for each iteration.&lt;/P&gt;
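&lt;P&gt;As an illustration of point 4, here is a minimal sketch of the hash-iterator variant, not a tuned implementation. It assumes that B fits in memory, that B holds only the variables DATE and FACTOR shown earlier in the thread, and that DATE is unique in B (otherwise the hash would also need MULTIDATA: 'Y'). The names hB, iB and rc are made up for this example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data AB;
  /* Load B into memory once, on the first iteration of the data step. */
  if _n_ = 1 then do;
    if 0 then set B;  /* defines B's variables at compile time; never executes */
    declare hash hB(dataset: 'B', ordered: 'a');  /* assumption: B fits in memory */
    hB.defineKey('date');                         /* assumption: DATE is unique in B */
    hB.defineData('date', 'factor');
    hB.defineDone();
    declare hiter iB('hB');
  end;

  set A;  /* A is still read sequentially from disk */

  /* Walk the in-memory copy of B once for each observation of A. */
  rc = iB.first();
  do while (rc = 0);
    score = factor * value;
    output;
    rc = iB.next();
  end;

  drop rc;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With this layout B is read from disk only once and every later pass over it happens in memory, which is why it can be much faster than POINT= against the disk file. The output data set is still the full Cartesian product, so the disk space and write time for AB do not change.&lt;/P&gt;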
</description>
      <pubDate>Sun, 14 Feb 2021 23:07:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719245#M222681</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-02-14T23:07:14Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719252#M222688</link>
      <description>&lt;P&gt;What is it that you are actually trying to do? How many DATEs are there?&amp;nbsp; Once you've split VALUE by all of those "factors", what are you going to do with it? Do you really need to write that large Cartesian product back out to a disk file?&amp;nbsp; Why not finish your calculations and save only the statistic that summarizes the result?&lt;/P&gt;
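&lt;P&gt;To make the last question concrete, here is a small sketch under a purely hypothetical assumption: that the statistic needed is the total score per client over all dates (the thread does not say what is actually needed). Since the sum of factor * value over all dates equals value times the sum of factor, no Cartesian product has to be stored. The names score_summary, total_factor, total_score and T are invented for this example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  /* T has exactly one row, so this join returns one row per row of A. */
  create table score_summary as
  select A.client_id,
         A.value * T.total_factor as total_score
  from A,
       (select sum(factor) as total_factor from B) as T;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Other statistics would need a different reduction, but the idea is the same: aggregate first, then combine.&lt;/P&gt;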
</description>
      <pubDate>Mon, 15 Feb 2021 00:37:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719252#M222688</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-02-15T00:37:13Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719680#M222870</link>
      <description>&lt;P&gt;I have never encountered SASFILE! I am glad I subscribed to this post.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2021 16:59:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719680#M222870</guid>
      <dc:creator>PhilC</dc:creator>
      <dc:date>2021-02-16T16:59:05Z</dc:date>
    </item>
    <item>
      <title>Re: Reset observation pointer</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719718#M222876</link>
      <description>&lt;P&gt;Yes, one hidden gem that is underused for sure. Another is SPDE. More good stuff in my book! &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2021 20:42:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reset-observation-pointer/m-p/719718#M222876</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2021-02-16T20:42:23Z</dc:date>
    </item>
  </channel>
</rss>

