<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplication in parallel in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345823#M79625</link>
    <description></description>
    <pubDate>Thu, 30 Mar 2017 16:52:30 GMT</pubDate>
    <dc:creator>art297</dc:creator>
    <dc:date>2017-03-30T16:52:30Z</dc:date>
    <item>
      <title>Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345775#M79604</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a table A with a lot of columns (around 70), including these three:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;* ID&lt;/P&gt;&lt;P&gt;* DURATION&lt;/P&gt;&lt;P&gt;* DATE&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to duplicate every row DURATION times, advancing DATE by one month on each copy, and then recalculate every other column from the new DATE value.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data B;
  set A;

  do i = 1 to DURATION;
    DATE = INTNX('month', DATE, 1, 'e');
    [...];
    output;
  end;
run;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem is that A has 60K rows and DURATION is often very high, so B ends up with something like 35M rows after the duplication.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This takes a very long time and produces a huge volume of data. I would like to run it in parallel: split A into 6 parts (or more if necessary) of 10K rows each, process them in parallel, and then do a simple union of the results.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do you know how I can do this in SAS?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 14:51:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345775#M79604</guid>
      <dc:creator>Planck</dc:creator>
      <dc:date>2017-03-30T14:51:31Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345781#M79605</link>
      <description>&lt;P&gt;You haven't said what you are going to do with the file after expanding it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For most SAS procedures you can probably avoid expanding the file and simply use DURATION as a WEIGHT variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 14:56:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345781#M79605</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-03-30T14:56:01Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345794#M79610</link>
      <description>&lt;P&gt;Thanks for the answer.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But I don't understand what you mean by WEIGHT variables. Can you be more explicit?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Afterwards, I need to extract specific rows from table B to create a table C, and then export table C to Excel...&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 15:30:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345794#M79610</guid>
      <dc:creator>Planck</dc:creator>
      <dc:date>2017-03-30T15:30:52Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345809#M79617</link>
      <description>&lt;P&gt;May or may not be relevant to your needs. Here is a link that explains&amp;nbsp;WEIGHT and FREQ statements in SAS procedures:&amp;nbsp;&lt;A href="http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473731.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473731.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thus, for a proc freq, it would count each record as representing n (i.e., weight or freq) records without forcing you to build the large file you were describing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Are you really planning on exporting a 60 million row file to Excel?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 16:03:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345809#M79617</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-03-30T16:03:21Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345815#M79620</link>
      <description>&lt;P&gt;You must be kidding, not 60 million ^^.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My first table A contains 60,000 rows. My table B contains 35 million rows, and my table C is back to 60,000 rows: table C extracts from B only the rows for a specific calculated DATE.&lt;/P&gt;&lt;P&gt;So I will export only 60,000 rows (table C) to Excel.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But I can't see the link between WEIGHT/FREQ and my needs. Those are meant for statistical work, whereas I just need to transform data...&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 16:34:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345815#M79620</guid>
      <dc:creator>Planck</dc:creator>
      <dc:date>2017-03-30T16:34:47Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345823#M79625</link>
      <description>&lt;P&gt;Agreed, WEIGHT or FREQ probably wouldn't be relevant. However, couldn't something like the following serve as an alternative to creating the large file?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
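&lt;PRE&gt;/* Sample data: one row per ID with a start date and a duration in months */
data have;
  informat date date9.;
  format date date9.;
  input ID DURATION DATE;
  cards;
1 10 1jan2016
2 3 5mar2016
3 15 8sep2016
;

/* Keep only the rows whose month range covers the target date, without expanding them */
data want;
  set have (where=('01jan2017'd between date and intnx('month', date, duration-1 )));
run;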
&lt;/PRE&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 16:52:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345823#M79625</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2017-03-30T16:52:30Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345830#M79627</link>
      <description>&lt;P&gt;Unfortunately not! &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I didn't go into the details of all 70 columns in my explanation, but some of them are calculated in a complex way, and I truly need to create the large table B before reducing it to table C.&lt;/P&gt;&lt;P&gt;The question is not about finding an alternative, but about accepting the approach and finding a way to parallelise it, because... I have no choice!&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 17:02:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345830#M79627</guid>
      <dc:creator>Planck</dc:creator>
      <dc:date>2017-03-30T17:02:05Z</dc:date>
    </item>
    <item>
      <title>Re: Duplication in parallel</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345837#M79632</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/121074"&gt;@Planck&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Unfortunately not! &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I didn't go into the details of all 70 columns in my explanation, but some of them are calculated in a complex way, and I truly need to create the large table B before reducing it to table C.&lt;/P&gt;
&lt;P&gt;The question is not about finding an alternative, but about accepting the approach and finding a way to parallelise it, because... I have no choice!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Are you sure you can't post a little example data and show how it flows from there to the final table C?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have seen lots of processes that were either written before newer procedures were available in SAS, or were direct translations from another data system, and did not use SAS tools well. A frequent example is a data step that goes through lots of loops and retained variables to calculate common statistics such as the mean and standard deviation, which could be done in four or five lines of code using PROC MEANS or PROC SUMMARY. Another is forcing data into a specific layout because "that's the way the report looks" instead of using the SAS reporting procedures such as PROC REPORT or PROC TABULATE.&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 17:13:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Duplication-in-parallel/m-p/345837#M79632</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-03-30T17:13:14Z</dc:date>
    </item>
  </channel>
</rss>

