<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Apply count variable to all observations in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113375#M292697</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;There are merits in this approach. Although there appear to be two passes through the data because there are two SET statements, the data flow switches from one stream to the other at each change of BY group. If the buffering and caching of the data hold more than one BY group, there will be only one I/O stream into the buffers for this data step.&lt;/P&gt;&lt;P&gt;It has been presented before in forums and on SAS-L, using two references to the same dataset in a single SET statement, like:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Data final ;&lt;/P&gt;&lt;P&gt;SET supplied1( in= A ) supplied1 ;&lt;/P&gt;&lt;P&gt;By uid ;&lt;/P&gt;&lt;P&gt;If first.uid then supplied=0;&lt;/P&gt;&lt;P&gt;If a then&amp;nbsp; supplied +1 ;&lt;/P&gt;&lt;P&gt;else output;&lt;/P&gt;&lt;P&gt;Run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 15 Oct 2013 19:42:42 GMT</pubDate>
    <dc:creator>Peter_C</dc:creator>
    <dc:date>2013-10-15T19:42:42Z</dc:date>
    <item>
      <title>Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113373#M292695</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am trying to calculate how many times different suppliers have provided a name and address.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What I need is for the value of the supplied variable for last.UID to be applied to all observations of that UID so that when I later class by each supplier I get an accurate count of the total time that record has been supplied.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Current code:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;data new;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set supplied1;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; by uid;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if first.uid then do;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; supplied=0;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; supplied=supplied+1;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;retain supplied;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;if last.uid;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;run;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I can do this in a number of steps but I have a feeling there is a much simpler way of doing it!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 16:39:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113373#M292695</guid>
      <dc:creator>LukeD</dc:creator>
      <dc:date>2013-10-15T16:39:38Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113374#M292696</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I think you will have to make two passes through the data, even if only one DATA step is used.&amp;nbsp; One solution:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data new;&lt;/P&gt;&lt;P&gt;&amp;nbsp; /* First loop: count the observations in the current BY group */&lt;/P&gt;&lt;P&gt;&amp;nbsp; do until (last.uid);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; set supplied1;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; by uid;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if first.uid then supplied=1;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; else supplied=supplied+1;&lt;/P&gt;&lt;P&gt;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&amp;nbsp; /* Second loop: re-read the same BY group and write each observation with the final count */&lt;/P&gt;&lt;P&gt;&amp;nbsp; do until (last.uid);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; set supplied1;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; by uid;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; output;&lt;/P&gt;&lt;P&gt;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 17:33:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113374#M292696</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T17:33:13Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113375#M292697</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;There are merits in this approach. Although there appear to be two passes through the data because there are two SET statements, the data flow switches from one stream to the other at each change of BY group. If the buffering and caching of the data hold more than one BY group, there will be only one I/O stream into the buffers for this data step.&lt;/P&gt;&lt;P&gt;It has been presented before in forums and on SAS-L, using two references to the same dataset in a single SET statement, like:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Data final ;&lt;/P&gt;&lt;P&gt;SET supplied1( in= A ) supplied1 ;&lt;/P&gt;&lt;P&gt;By uid ;&lt;/P&gt;&lt;P&gt;If first.uid then supplied=0;&lt;/P&gt;&lt;P&gt;If a then&amp;nbsp; supplied +1 ;&lt;/P&gt;&lt;P&gt;else output;&lt;/P&gt;&lt;P&gt;Run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 19:42:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113375#M292697</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2013-10-15T19:42:42Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113376#M292698</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="392339" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt;: Don't know how I missed that but, in the tests I just ran, both approaches took approximately the same CPU and real time, so I'm not convinced that they don't both take two passes through the data. That said, I like the approach you suggested, as it does the same thing with less code.&amp;nbsp; The only thing I'd suggest changing is the if-then-else combination, i.e.:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data new;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set supplied1( in= a ) supplied1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by uid ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if first.uid then supplied=1;&lt;/P&gt;&lt;P&gt;&amp;nbsp; else if a then supplied +1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; else output;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 20:11:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113376#M292698</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T20:11:47Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113377#M292699</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;shorter:&lt;/P&gt;&lt;P&gt;data have;&lt;/P&gt;&lt;P&gt;input id@@;&lt;/P&gt;&lt;P&gt;cards;&lt;/P&gt;&lt;P&gt;1 2 2 3 3 3 9 9 9 9 18 18 18 18 18&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt; set have(in=a)have;&lt;/P&gt;&lt;P&gt; by id ;&lt;/P&gt;&lt;P&gt; if a then t+1-first.id*t;&lt;/P&gt;&lt;P&gt; else output;&lt;/P&gt;&lt;P&gt; run;&lt;/P&gt;&lt;P&gt; proc print;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 20:29:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113377#M292699</guid>
      <dc:creator>Linlin</dc:creator>
      <dc:date>2013-10-15T20:29:11Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113378#M292700</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="3068" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt;: Actually, on average, your suggestion runs slightly slower, but the two approaches do run in very similar times.&amp;nbsp; I prefer the if-then-else combination, as it makes the logic easier (I think) for most to follow.&amp;nbsp; The fact that it runs slightly faster (and I mean really, really only slightly) is, I think, just an added benefit.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 20:43:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113378#M292700</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T20:43:23Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113379#M292701</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Art,&lt;/P&gt;&lt;P&gt;You are right. Is there any leftover turkey to share?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 20:48:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113379#M292701</guid>
      <dc:creator>Linlin</dc:creator>
      <dc:date>2013-10-15T20:48:17Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113380#M292702</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@Linlin: Only if you get here quickly, and it is now being served in the form of shepherd's pie.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 21:06:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113380#M292702</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T21:06:22Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113381#M292703</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You won't measure the difference in a one-variable data set, but it would be good practice to modify the SET statement:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;set supplied1 (in=a keep=uid) supplied1;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt; &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 21:22:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113381#M292703</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2013-10-15T21:22:22Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113382#M292704</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="5253" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt;: Interestingly, the efficiency doesn't appear to apply in this particular situation.&amp;nbsp; The test code I ran was:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data supplied1;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set sashelp.class (rename=(age=uid));&lt;/P&gt;&lt;P&gt;&amp;nbsp; array junk(999) (999*1);&lt;/P&gt;&lt;P&gt;&amp;nbsp; do _n_=1 to 10000;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; output;&lt;/P&gt;&lt;P&gt;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sort data=supplied1;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by uid;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data new;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set supplied1( in= a ) supplied1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by uid ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if first.uid then supplied=0;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if a then&amp;nbsp; supplied +1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; else output;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data new;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set supplied1( in= a keep=uid) supplied1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by uid ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if first.uid then supplied=0;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if a then&amp;nbsp; supplied +1 ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; else output;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 21:45:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113382#M292704</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T21:45:35Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113383#M292705</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Art,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hmmmm ... there are mysteries that can be difficult to explain.&amp;nbsp; By any chance did you measure the difference with your original DOW approach?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 22:41:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113383#M292705</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2013-10-15T22:41:37Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113384#M292706</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="5253" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt;: Of course I looked at that as well.&amp;nbsp; Even with the double DOW, the PDV is populated with the variables from both SET statements, so using a KEEP= option doesn't improve the processing times.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If any of you are wondering why &lt;A __default_attr="5253" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt; is surprised, take a look at: &lt;A href="http://www.sascommunity.org/wiki/Increase_Your_Productivity_by_Doing_Less" title="http://www.sascommunity.org/wiki/Increase_Your_Productivity_by_Doing_Less"&gt;Increase Your Productivity by Doing Less - sasCommunity&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 15 Oct 2013 22:53:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113384#M292706</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-15T22:53:35Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113385#M292707</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Art's paper reminds me that one of the co-authors, Ksharp, has been off the grid for a long time. Let's wish him the best!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Oct 2013 02:38:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113385#M292707</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2013-10-16T02:38:41Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113386#M292708</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;A __default_attr="5068" __jive_macro_name="user" class="jive_macro jive_macro_user" data-objecttype="3" href="https://communities.sas.com/"&gt;&lt;/A&gt;: KSharp is still alive and well and, every now and then, sends me an idea for a SAS-related paper that he'd like me to investigate.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Oct 2013 03:20:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113386#M292708</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2013-10-16T03:20:20Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113387#M292709</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Luke's original post said "...there must be a simpler way of doing this", but many of the replies are focused on performance.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you discard simplicity and go for performance, and if your data will fit in memory, then IIRC you can use a hash object for this, and do this in one pass of the data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't have time to look it up now, but hit the doc on the hash object and search on "suminc" (I think).&amp;nbsp; Anyway, there's an example in there of deriving a sum based on the keys (your by variable), and you then use the output statement to write out the results after you've hit EOF.&amp;nbsp; (I think it's kinda cool you can write a permanent dataset from a data _null_ step!)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;HTH,&lt;/P&gt;&lt;P&gt;Scott&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Oct 2013 07:31:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113387#M292709</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2013-10-16T07:31:09Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113388#M292710</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I had a further chance to "play" with this, and thought I'd share my code.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Luke's original post only produced a frequency count per UID, due to his subsetting if statement on last.uid.&amp;nbsp; However, his text said:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote"&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;What I need is for the value of the supplied variable for last.UID &lt;SPAN style="color: #ff0000;"&gt;&lt;STRONG&gt;to be applied to all observations of that UID&lt;/STRONG&gt;&lt;/SPAN&gt; so that when I later class by each supplier I get an accurate count of the total time that record has been supplied.&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;This is a bit ambiguous; does Luke just want a frequency count per UID, as his code generates, or does he want the frequency count to be "applied to all observations"?&amp;nbsp; Most of the suggested approaches assumed the latter.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;The double set statement "set supplied1 (in=a) supplied1; by uid" is a pretty cool construct I'd not seen before.&amp;nbsp; I walked it through the data step debugger to see that it reads all 
observations in the first by group from the first dataset, then all the observations in the first by group from the second dataset, etc.&amp;nbsp; If I recall my terminology correctly, it is interleaving the data from the two datasets by UID "chunks".&amp;nbsp; This then allows you to derive the frequency counts from the first dataset (in=a), then "merge" them back into the observations from the second dataset.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;This is analogous to the message you often see in SQL "The query requires remerging summary statistics back with the original data."&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;I "played" with the different approaches, trying to see how I could optimize the performance.&amp;nbsp; I've attached my .sas file and .log file when I ran it on my server.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Comments:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;1) I've been using a "SPDE Work" library a lot lately to get better I/O performance in my large jobs.&amp;nbsp; It's a real easy change 
to make, with some caveats (potential file locking, no support for views, doesn't support sasfile, and so on).&amp;nbsp; See the attached file for more details.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;2) Even though we're doing a double read of the data on disk, I didn't get an appreciable performance gain by using sasfile, and then doing the double read from memory.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;3) I didn't get the expected performance gains using a hash object.&amp;nbsp; And given that, &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;no one&lt;/STRONG&gt;&lt;/SPAN&gt; would approach the problem this way.&amp;nbsp; It was just a chance for me to "play" and learn more about hash object key summaries.&amp;nbsp; See &lt;A href="http://support.sas.com/documentation/cdl/en/lrcon/65287/HTML/default/viewer.htm#n1b4cbtmb049xtn1vh9x4waiioz4.htm" title="http://support.sas.com/documentation/cdl/en/lrcon/65287/HTML/default/viewer.htm#n1b4cbtmb049xtn1vh9x4waiioz4.htm"&gt;SAS(R) 9.3 Language Reference: Concepts, Second Edition&lt;/A&gt;, scroll to "Maintaining Key Summaries", for more details.&amp;nbsp; And I'm not sure my code is optimized; while I do first read the data into memory and then iterate on the copy in memory, I suspect having to iterate over the hash introduces overhead.&amp;nbsp; Using a DOW loop might be more efficient.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Anyway, I hope some of this is useful &lt;img id="smileyhappy" 
class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Scott&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 18 Oct 2013 07:21:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113388#M292710</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2013-10-18T07:21:29Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113389#M292711</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;For the sake of code clarity,&lt;/P&gt;&lt;P&gt;set SUPPLIED1(in=a keep=uid) SUPPLIED1;&lt;/P&gt;&lt;P&gt;is indeed better imho, regardless of performance.&lt;/P&gt;&lt;P&gt;This way we have more information about what each table is used for.&lt;/P&gt;&lt;P&gt;We could even write:&lt;/P&gt;&lt;P&gt;set SUPPLIED1(in=groupingstream keep=uid) SUPPLIED1;&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;set SUPPLIED1(in=groupingstream keep=uid) SUPPLIED1(in=outputstream);&lt;/P&gt;&lt;P&gt;if we were really petty about descriptive names, which we are not.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 23 Oct 2013 23:20:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113389#M292711</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2013-10-23T23:20:35Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113390#M292712</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Scott&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In case no one else has said it, thank you for the depth of your investigation, the results and the detail of your review.&lt;/P&gt;&lt;P&gt;I am also responding to discuss your suggestion:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;"&lt;SPAN style="font-style: inherit; background-color: #ffffff; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; font-size: 10pt; line-height: 1.5em;"&gt;This is analogous to the message you often see in SQL "The query requires remerging summary statistics back with the original data."&lt;/SPAN&gt;"&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;IMHO the data step is not "remerging" as SQL does, because this "remerge" occurs only for the rows of each BY group, whereas SQL will store the whole table in spool, cache or memory before performing a whole-table merge (at least in my experience of Teradata SQL).&lt;/P&gt;&lt;P&gt;OK, when there is no BY statement, or there is only one value for the BY variable, a data step of this style would indeed be equivalent to an "SQL remerge".&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;regards&lt;/P&gt;&lt;P&gt;peterC&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 03 Nov 2013 17:45:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113390#M292712</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2013-11-03T17:45:42Z</dc:date>
    </item>
    <item>
      <title>Re: Apply count variable to all observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113391#M292713</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Peter,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your comments.&amp;nbsp; I just meant that SQL is first deriving the summary statistics (under the covers), then remerging with the original data.&amp;nbsp; I didn't mean to imply that the two processes (data step vs. sql) were identical.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Scott&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 04 Nov 2013 00:36:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Apply-count-variable-to-all-observations/m-p/113391#M292713</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2013-11-04T00:36:56Z</dc:date>
    </item>
  </channel>
</rss>

