<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: + and by in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846288#M334575</link>
    <description>One example would be using KEEP= or DROP= dataset options on the datasets in the SET statement, which makes the PDV smaller and hence makes the program use fewer resources.</description>
    <pubDate>Fri, 25 Nov 2022 09:56:47 GMT</pubDate>
    <dc:creator>LinusH</dc:creator>
    <dc:date>2022-11-25T09:56:47Z</dc:date>
    <item>
      <title>+ and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845594#M334305</link>
      <description>&lt;PRE&gt;data &amp;amp;Portofo._VAL2(keep=MOB_VAL_P MOB);
set &amp;amp;Portofo._VAL;
by MOB;
if first.MOB then CNT=0;
CNT+1;
MOB_VAL_P=CNT/&amp;amp;total_val;
if LAST.MOB;
run;&lt;/PRE&gt;
&lt;P&gt;Does CNT+1 mean CNT = CNT + 1?&lt;/P&gt;
&lt;P&gt;And what is the line 'by MOB' doing here?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Nov 2022 03:04:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845594#M334305</guid>
      <dc:creator>HeatherNewton</dc:creator>
      <dc:date>2022-11-22T03:04:22Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845602#M334309</link>
      <description>&lt;P&gt;This statement&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;cnt+1;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;is &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1dfiqj146yi2cn1maeju9wo7ijs.htm" target="_self"&gt;a SUM statement&lt;/A&gt;.&amp;nbsp; It adds 1 to the value of CNT.&amp;nbsp; It also makes sure that the value of CNT is retained across iterations of the data step.&amp;nbsp; That is, it is NOT reset to missing at the start of the next iteration, unlike other variables, such as MOB_VAL_P, that are not sourced from input datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, when it adds the 1 it ignores a missing value of CNT.&amp;nbsp; (See &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n0zxive1z1ctqin12w06c85jfigd.htm" target="_self"&gt;the SUM() function&lt;/A&gt; for more information on this.)&amp;nbsp; Normally, when you do arithmetic with any missing values the result is also missing.&amp;nbsp; But the sum statement behaves like the SUM() function and just ignores the missing values instead.&lt;/P&gt;
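&lt;P&gt;A minimal sketch of that behaviour, assuming a stand-alone DATA _NULL_ step (the variable names X, Y and Z are made up for illustration): the sum statement acts roughly like a RETAIN with an initial value of 0 combined with the SUM() function.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  retain cnt 0;        /* what CNT+1 implies: retained, starts at 0 */
  cnt = sum(cnt, 1);   /* roughly what CNT+1 does on each iteration */
  x = .;
  y = x + 1;           /* plain arithmetic: missing + 1 is missing */
  z = sum(x, 1);       /* SUM() ignores the missing argument, so 1 */
  put cnt= y= z=;      /* write the three values to the log */
run;&lt;/CODE&gt;&lt;/PRE&gt;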
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A BY statement tells SAS to process the dataset by the value of the listed variables.&amp;nbsp; If the data is not actually sorted by those variables then the data step will throw an error and stop.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The use of the BY statement is what creates and populates the FIRST. and LAST. flag variables that are also referenced in the code.&amp;nbsp; FIRST. is true when it is the first observation for the current value of the BY variable whose name follows the FIRST. (within the current values of all of the variables listed before it on the BY statement).&amp;nbsp; Similarly the LAST. flag is true only on the last observation of the group.&lt;/P&gt;
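&lt;P&gt;To see the flags, here is a small sketch, assuming a made-up dataset HAVE with a single variable MOB.&amp;nbsp; FIRST. and LAST. variables are not written to the output dataset, so they are copied into ordinary variables here (FIRST_FLAG and LAST_FLAG are hypothetical names).&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input MOB @@;
datalines;
3 1 1 2 1 2
;

proc sort data=have;        /* BY processing needs the data sorted by MOB */
  by MOB;
run;

data flags;
  set have;
  by MOB;
  first_flag = first.MOB;   /* 1 on the first observation of each MOB value */
  last_flag  = last.MOB;    /* 1 on the last observation of each MOB value */
run;

proc print data=flags;
run;&lt;/CODE&gt;&lt;/PRE&gt;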
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So basically this step is counting how many observations are present for each value of the BY variable and only writing out one observation per value of the BY variable.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is also calculating MOB_VAL_P by dividing the count by the value that the referenced macro variable resolves to.&amp;nbsp; Note that it makes this calculation on every observation, but since only the last observation of each BY group is written, the intermediate values of CNT and MOB_VAL_P are never output.&amp;nbsp; You could avoid the extra division operations by moving the assignment statement after the &lt;A href="https://documentation.sas.com/doc/en/vdmmlcdc/8.1/lestmtsref/p1cxl8ifdt8u0gn12wqbji8o5fq1.htm#:~:text=The%20subsetting%20IF%20statement%20causes,specified%20in%20the%20IF%20statement." target="_self"&gt;subsetting IF statement&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, instead of using the KEEP=&lt;A href="https://documentation.sas.com/doc/en/vdmmlcdc/8.11/ledsoptsref/p0l3b7h13rpck6n17in4etoaoedm.htm" target="_self"&gt; dataset option&lt;/A&gt; to set the list of variables in the target dataset, you could just use a plain old&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1nnrzzsw6rzrjn1p2jfky6pdv23.htm" target="_self"&gt;KEEP statement&lt;/A&gt;.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data &amp;amp;Portofo._VAL2;
  set &amp;amp;Portofo._VAL;
  by MOB;
  if first.MOB then CNT=0;
  CNT+1;
  if LAST.MOB;
  MOB_VAL_P=CNT/&amp;amp;total_val;
  keep MOB MOB_VAL_P ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Nov 2022 04:34:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845602#M334309</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-11-22T04:34:29Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845693#M334337</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorry to bother you with a question quite unrelated to the problem presented in this post. But I would like to know why you suggest using the plain old KEEP statement instead of the data set option. I wonder if there are any good reasons I have missed, as I always promote the Data Set options, because&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. The Keep or Drop variable lists specified as options are visually connected to the data set they relate to.&lt;/P&gt;
&lt;P&gt;2. The same syntax can be used in DATA and SET statements, and there are no corresponding INKEEP or INDROP statements available.&lt;/P&gt;
&lt;P&gt;3. Keep or Drop statements affect all output data sets, so Data Set options are the only way to specify individual actions on different outputs.&lt;/P&gt;
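&lt;P&gt;Points 2 and 3 can be sketched as follows, assuming a made-up input dataset HAVE and hypothetical output names IDS and DETAIL: KEEP= on the SET statement limits what is read in, and each output dataset gets its own variable list.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input MOB VALUE EXTRA @@;
datalines;
1 10 0 1 20 0 2 30 0
;

/* each output dataset carries its own KEEP= list */
data ids(keep=MOB) detail(keep=MOB VALUE);
  set have(keep=MOB VALUE);   /* EXTRA is never read into the PDV */
run;&lt;/CODE&gt;&lt;/PRE&gt;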
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I recently spent time trying to find a missing output variable caused by a drop statement on line 600-something of a data step. The step had many drop statements placed all over the code. It makes sense during development to place the drop statement in the section of the code where the variable is created, so the drop is not forgotten. But it is easier to maintain programs with a consistent use of syntax, which calls for the use of Data Set options, because they are unavoidable in some cases.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Nov 2022 15:36:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845693#M334337</guid>
      <dc:creator>ErikLund_Jensen</dc:creator>
      <dc:date>2022-11-22T15:36:06Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845697#M334338</link>
      <description>&lt;P&gt;I follow the KISS principle.&amp;nbsp; Keep It Simple, Stupid.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dataset options are a complication. Sometimes the complication is useful and so worth it.&amp;nbsp; The example program is not one of those cases.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1) Most data steps only create one output dataset.&lt;/P&gt;
&lt;P&gt;2) Not sure how that point relates, especially to this program, or really any normal simple data step. There are LOTS of dataset options.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3) See (1).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Nov 2022 16:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/845697#M334338</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-11-22T16:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846266#M334568</link>
      <description>&lt;P&gt;So where I put the statement 'if last.mob' is crucial.&lt;/P&gt;
&lt;P&gt;If I put it above the statement 'if first.mob then cnt=0;', the result would be different, counting only the last entry of each mob value. Am I correct?&lt;/P&gt;
&lt;P&gt;I always thought it did not matter where I put the subsetting condition 'if...' within the data step, but it does matter in this case.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Nov 2022 08:56:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846266#M334568</guid>
      <dc:creator>HeatherNewton</dc:creator>
      <dc:date>2022-11-25T08:56:33Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846287#M334574</link>
      <description>Yes, it does matter: when the subsetting IF condition is false, the statements after it are not executed for that observation.</description>
      <pubDate>Fri, 25 Nov 2022 09:52:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846287#M334574</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2022-11-25T09:52:28Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846288#M334575</link>
      <description>One example would be using KEEP= or DROP= dataset options on the datasets in the SET statement, which makes the PDV smaller and hence makes the program use fewer resources.</description>
      <pubDate>Fri, 25 Nov 2022 09:56:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846288#M334575</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2022-11-25T09:56:47Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846353#M334596</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13674"&gt;@LinusH&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;One example would be using KEEP= or DROP= dataset options on the datasets in the SET statement, which makes the PDV smaller and hence makes the program use fewer resources.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Yes. And that is a totally different use case than the data step in this question.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Nov 2022 16:58:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846353#M334596</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-11-25T16:58:58Z</dc:date>
    </item>
    <item>
      <title>Re: + and by</title>
      <link>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846354#M334597</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/416388"&gt;@HeatherNewton&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;So where I put the statement 'if last.mob' is crucial.&lt;/P&gt;
&lt;P&gt;If I put it above the statement 'if first.mob then cnt=0;', the result would be different, counting only the last entry of each mob value. Am I correct?&lt;/P&gt;
&lt;P&gt;I always thought it did not matter where I put the subsetting condition 'if...' within the data step, but it does matter in this case.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I suspect you were confusing the subsetting IF statement with the WHERE statement.&amp;nbsp; The WHERE statement (or WHERE= dataset option) limits the observations that are read in from the input dataset, so it is not an executable statement and its position normally does not matter.&amp;nbsp; But the positioning of a WHERE statement can matter in a complex data step that has more than one SET (or MERGE or UPDATE) statement.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Nov 2022 17:01:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/and-by/m-p/846354#M334597</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-11-25T17:01:15Z</dc:date>
    </item>
  </channel>
</rss>

