<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Run code sequential in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536878#M6682</link>
    <description>&lt;P&gt;You are doing a Cartesian product of a data set with itself within each pty_id/sub_channel combination.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So let's say, out of your 8 billion records, you have 10 records of a specific subchannel within a specific pty_id, for which the Cartesian product would be 100 records.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So is it really your intention to produce and count 100 records, grouped&amp;nbsp;by columns 1,2,3 ...?&amp;nbsp;&amp;nbsp; I do not understand the purpose of your code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also you have "a.subchannel=&amp;amp;subchannel and b.subchannel=&amp;amp;subchannel".&amp;nbsp; Why not apply the subchannel filter as&amp;nbsp;a "where=" parameter to the data set, as in&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;   from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) a ,
            gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) b 
    where a.pty_id=b.pty_id
    group by 1,2,3 ...
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This forces the SAS data engine to do the subchannel filtering, rather than reading in every record only to have the filter applied later by SQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or better yet, do&amp;nbsp;all your subchannels in one pass by&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;making the "where=" parameters include all the subchannels of interest&lt;/LI&gt;
&lt;LI&gt;modifying the where clause to force equality of a.subchannel with b.subchannel&lt;/LI&gt;
&lt;LI&gt;adding subchannel as a group dimension:&lt;/LI&gt;
&lt;/OL&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;    from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn41))) a ,
         gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn41))) b 
    where a.pty_id=b.pty_id and a.subchannel=b.subchannel
    group by a.subchannel,1,2,3,..
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Perhaps you are running into problems with the resources needed for the Cartesian crossing covering all 41 subchannels.&amp;nbsp; If so, then change the "where subchannel in (....)" parameters to include a shorter list of subchannels.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But there may be more efficient ways to do this.&amp;nbsp; Here are some questions whose answers might enable a more efficient program:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Are the data sorted by pty_id?&amp;nbsp; by subchannel?&amp;nbsp; or both?&lt;/LI&gt;
&lt;LI&gt;Does every&amp;nbsp;pty_id have only one subchannel?&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;I ask this because, if you really are trying to do a Cartesian product of a dataset with itself, it might be easier to pass through the data once to get frequencies of each combination of subchannel, col1, col2, ...&amp;nbsp; for every pty_id, generating a possibly much smaller aggregate dataset.&amp;nbsp; Then you could do a Cartesian self-crossing of that aggregate data, applying a weight based on the original frequencies.&amp;nbsp; That would possibly be much faster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But this conjecture just makes me wonder what measure you are trying to develop by this program.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In other words, what do you mean by "overlap"?&lt;/P&gt;</description>
    <pubDate>Tue, 19 Feb 2019 19:42:46 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2019-02-19T19:42:46Z</dc:date>
    <item>
      <title>Run code in parallel</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536846#M6675</link>
      <description>&lt;P&gt;I have 8 billion records and I have to do a self-join on the same data to get an overlap. This code needs to run approximately 41 times, and each iteration takes approximately 2 hours. Is there a way to run it in parallel, say 5 channels at a time, rather than waiting for each one to complete?&lt;/P&gt;&lt;P&gt;Any suggestions are appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro mcchnl(channel);
proc sql;
create table &amp;amp;channel._overlap as
select
var1,
var2,
var3,
var4,
..
..
..

count(1) as overlap_cnt
from gmog.loaded_detail_&amp;amp;livedate. a , gmog.loaded_detail_&amp;amp;livedate. b
where a.pty_id=b.pty_id
     and a.sub_channel=&amp;amp;channel. and b.sub_channel=&amp;amp;channel.
group by 1,2,3,…………;
quit;
%mend;
%mcchnl(chn1);
%mcchnl(chn2);
..
..
%mcchnl(chnl41);
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 19 Feb 2019 19:36:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536846#M6675</guid>
      <dc:creator>Tanvi99</dc:creator>
      <dc:date>2019-02-19T19:36:52Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536864#M6678</link>
      <description>&lt;P&gt;SAS has many ways to count things.&amp;nbsp; Perhaps SQL is the wrong tool for the job this time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Describe what you mean by "overlap" and what you are trying to count, and there will probably be a much faster way to get the job done.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 18:54:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536864#M6678</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-02-19T18:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536867#M6679</link>
      <description>&lt;P&gt;I suspect you mean run code in parallel (several streams at the same time) not sequential (one after the other).&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 19:02:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536867#M6679</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2019-02-19T19:02:54Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536874#M6680</link>
      <description>&lt;P&gt;agreed.. I meant parallel.. haha&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 19:34:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536874#M6680</guid>
      <dc:creator>Tanvi99</dc:creator>
      <dc:date>2019-02-19T19:34:12Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536878#M6682</link>
      <description>&lt;P&gt;You are doing a Cartesian product of a data set with itself within each pty_id/sub_channel combination.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So let's say, out of your 8 billion records, you have 10 records of a specific subchannel within a specific pty_id, for which the Cartesian product would be 100 records.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So is it really your intention to produce and count 100 records, grouped&amp;nbsp;by columns 1,2,3 ...?&amp;nbsp;&amp;nbsp; I do not understand the purpose of your code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also you have "a.subchannel=&amp;amp;subchannel and b.subchannel=&amp;amp;subchannel".&amp;nbsp; Why not apply the subchannel filter as&amp;nbsp;a "where=" parameter to the data set, as in&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;   from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) a ,
            gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) b 
    where a.pty_id=b.pty_id
    group by 1,2,3 ...
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This forces the SAS data engine to do the subchannel filtering, rather than reading in every record only to have the filter applied later by SQL.&lt;/P&gt;
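&lt;P&gt;For what it's worth, here is a minimal sketch of how the whole macro might look with the filter moved into "where=" options. It keeps the sub_channel column and the &amp;amp;channel and &amp;amp;livedate macro variables from the original post. The columns var1 and var2 are placeholders for the real column list, and quoting "&amp;amp;channel." assumes sub_channel is a character column (drop the quotes if it is numeric).&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: var1 and var2 stand in for the real column list, */
/* and sub_channel is assumed to be a character column.          */
%macro mcchnl(channel);
  proc sql;
    create table &amp;amp;channel._overlap as
    select a.var1,
           b.var2,
           count(*) as overlap_cnt
    /* The where= options filter each copy as it is read, before the join. */
    from gmog.loaded_detail_&amp;amp;livedate. (where=(sub_channel="&amp;amp;channel.")) a,
         gmog.loaded_detail_&amp;amp;livedate. (where=(sub_channel="&amp;amp;channel.")) b
    where a.pty_id = b.pty_id
    group by a.var1, b.var2;
  quit;
%mend mcchnl;

%mcchnl(chn1);
&lt;/CODE&gt;&lt;/PRE&gt;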
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or better yet, do&amp;nbsp;all your subchannels in one pass by&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;making the "where=" parameters include all the subchannels of interest&lt;/LI&gt;
&lt;LI&gt;modifying the where clause to force equality of a.subchannel with b.subchannel&lt;/LI&gt;
&lt;LI&gt;adding subchannel as a group dimension:&lt;/LI&gt;
&lt;/OL&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;    from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn41))) a ,
         gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn41))) b 
    where a.pty_id=b.pty_id and a.subchannel=b.subchannel
    group by a.subchannel,1,2,3,..
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Perhaps you are running into problems with the resources needed for the Cartesian crossing covering all 41 subchannels.&amp;nbsp; If so, then change the "where subchannel in (....)" parameters to include a shorter list of subchannels.&lt;/P&gt;
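&lt;P&gt;As a rough sketch of the shorter-list idea, a macro could take one batch of subchannels at a time. Everything named here is hypothetical: the macro overlap_batch, its batch_id and chn_list parameters, the output table names, and the placeholder columns var1 and var2. It also assumes sub_channel (the column name from the original post) is character, so the list is passed as quoted values separated by blanks, which the "where=" option accepts in an "in" list.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: hypothetical macro, parameters, and placeholder columns. */
%macro overlap_batch(batch_id, chn_list);
  proc sql;
    create table overlap_batch&amp;amp;batch_id. as
    select a.sub_channel,
           a.var1,
           b.var2,
           count(*) as overlap_cnt
    /* Only the subchannels in this batch are read from each copy. */
    from gmog.loaded_detail_&amp;amp;livedate. (where=(sub_channel in (&amp;amp;chn_list.))) a,
         gmog.loaded_detail_&amp;amp;livedate. (where=(sub_channel in (&amp;amp;chn_list.))) b
    where a.pty_id = b.pty_id
      and a.sub_channel = b.sub_channel
    group by a.sub_channel, a.var1, b.var2;
  quit;
%mend overlap_batch;

/* One call per batch; blank-separated values avoid commas in the macro argument. */
%overlap_batch(1, "chn1" "chn2" "chn3" "chn4" "chn5");
&lt;/CODE&gt;&lt;/PRE&gt;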
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But there may be more efficient ways to do this.&amp;nbsp; Here are some questions whose answers might enable a more efficient program:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Are the data sorted by pty_id?&amp;nbsp; by subchannel?&amp;nbsp; or both?&lt;/LI&gt;
&lt;LI&gt;Does every&amp;nbsp;pty_id have only one subchannel?&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;I ask this because, if you really are trying to do a Cartesian product of a dataset with itself, it might be easier to pass through the data once to get frequencies of each combination of subchannel, col1, col2, ...&amp;nbsp; for every pty_id, generating a possibly much smaller aggregate dataset.&amp;nbsp; Then you could do a Cartesian self-crossing of that aggregate data, applying a weight based on the original frequencies.&amp;nbsp; That would possibly be much faster.&lt;/P&gt;
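&lt;P&gt;To make the aggregate-then-cross idea concrete, here is a minimal sketch under several assumptions: var1 and var2 are placeholders for the real grouping columns, work.agg and work.overlap are hypothetical table names, and the final report is assumed to group by one column from each side of the join. Each pair of aggregate rows stands for a.n times b.n pairs of original records, so summing that product should give the same total as counting the rows of the full Cartesian crossing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: placeholder columns and hypothetical table names. */
proc sql;
  /* Step 1: one pass over the detail data to get a frequency per combination. */
  create table work.agg as
  select pty_id,
         sub_channel,
         var1,
         var2,
         count(*) as n
  from gmog.loaded_detail_&amp;amp;livedate.
  group by pty_id, sub_channel, var1, var2;

  /* Step 2: self-cross the smaller aggregate, weighting each pair of */
  /* aggregate rows by the product of their original frequencies.     */
  create table work.overlap as
  select a.sub_channel,
         a.var1,
         b.var2,
         sum(a.n * b.n) as overlap_cnt
  from work.agg a, work.agg b
  where a.pty_id = b.pty_id
    and a.sub_channel = b.sub_channel
  group by a.sub_channel, a.var1, b.var2;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;How much this helps depends on how many distinct combinations each pty_id has. If nearly every record is its own combination, the aggregate is no smaller than the detail data.&lt;/P&gt;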
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But this conjecture just makes me wonder what measure you are trying to develop by this program.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In other words, what do you mean by "overlap"?&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 19:42:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536878#M6682</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-02-19T19:42:46Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536900#M6685</link>
      <description>&lt;P&gt;Thanks, mkeintz, for your reply.&lt;/P&gt;&lt;P&gt;The purpose is to do a Cartesian join. We have offers for customers, and we would like to see what other offers the same customer has, which we call overlap offers.&lt;/P&gt;&lt;P&gt;Your first suggestion, putting the where clause alongside the table name, is definitely more efficient. I will implement that:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;   from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) a ,
            gmog.loaded_detail_&amp;amp;livedate (where=(subchannel=&amp;amp;subchannel)) b
    where a.pty_id=b.pty_id
    group by 1,2,3 ...
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Coming to&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;from gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn141))) a ,
         gmog.loaded_detail_&amp;amp;livedate (where=(subchannel in (chn1,chn2,...chn141))) b
    where a.pty_id=b.pty_id and a.subchannel=b.subchannel group by subchannel,1,2,3,..
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The dataset has all 41 channels, and doing a join by pty_id and subchannel wasn't working because the data is huge. I tried that and had no luck, so I had to do it by each individual channel.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the inputs.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 20:18:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536900#M6685</guid>
      <dc:creator>Tanvi99</dc:creator>
      <dc:date>2019-02-19T20:18:41Z</dc:date>
    </item>
    <item>
      <title>Re: Run code sequential</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536932#M6694</link>
      <description>&lt;P&gt;If 41 subchannels are too many, you might be able to do, say,&amp;nbsp;5 subchannels at a time.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2019 22:06:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/536932#M6694</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-02-19T22:06:40Z</dc:date>
    </item>
    <item>
      <title>Re: Run code in parallel</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/537610#M6795</link>
      <description>1. Are all these runs independent of each other?&lt;BR /&gt;2. Can you check your I/O, CPU, and memory consumption during one of the runs?&lt;BR /&gt;Running in parallel is a tricky business: if your I/O capacity is overloaded, then you will get no benefit from creating parallel sessions.&lt;BR /&gt;However, it is very easy to create a child session and manage it in Base SAS.</description>
      <pubDate>Fri, 22 Feb 2019 07:25:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Run-code-in-parallel/m-p/537610#M6795</guid>
      <dc:creator>Satish_Parida</dc:creator>
      <dc:date>2019-02-22T07:25:02Z</dc:date>
    </item>
  </channel>
</rss>

