<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530828#M145185</link>
    <description>&lt;P&gt;OK, so now it seems that:&lt;/P&gt;
&lt;OL style="list-style-position: inside;"&gt;
&lt;LI&gt;the data is sorted by descending month and, within each month, by ascending age&lt;/LI&gt;
&lt;LI&gt;you want to do this process for each month&lt;/LI&gt;
&lt;LI&gt;ages need not be integer values&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Is the above correct?&amp;nbsp; If so, then you need to read all ages for a group once to get the initial mean.&amp;nbsp; Then reread them to trim (and count) observations, adjusting the mean as necessary.&amp;nbsp; At the end of the group's second pass you can calculate the ratio:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
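&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (keep=month total_count removed_count ratio);
  /* Pass 1: count the observations and sum the ages for one month */
  do total_count=1 by 1 until (last.month);
    set have;
    by descending month;
    age_sum=sum(age_sum,age);
  end;

  /* Pass 2: reread the same month, removing the smallest ages */
  /* while the mean of the remaining ages is still &amp;lt;= -0.2      */
  removed_count=0;
  removed_sum=0;
  do until (last.month);
    set have;
    by descending month;
    if  (age_sum-removed_sum)/(total_count-removed_count) &amp;lt;= -0.2 then do;
      removed_sum=removed_sum+age;
      removed_count=removed_count+1;
    end;
  end;
  /* One output row per month (implicit output at the end of the step) */
  ratio=removed_count/total_count;
run;&lt;/CODE&gt;&lt;/PRE&gt;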
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 29 Jan 2019 00:37:08 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2019-01-29T00:37:08Z</dc:date>
    <item>
      <title>How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530548#M145104</link>
      <description>&lt;P&gt;Here's what I'm trying to do:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Age&lt;/P&gt;&lt;P&gt;1&lt;/P&gt;&lt;P&gt;2&lt;/P&gt;&lt;P&gt;3&lt;/P&gt;&lt;P&gt;4&lt;/P&gt;&lt;P&gt;5&lt;/P&gt;&lt;P&gt;6&lt;/P&gt;&lt;P&gt;7&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Get average of age. If average is less than 5, remove topmost row in column which is "1" and recalculate average again. If still less than 5, then remove "2". Repeat until average is more than 5 and then stop and output how many numbers were removed before achieving goal. Also output the ratio of #removed to total count.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help will be greatly appreciated. Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 07:03:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530548#M145104</guid>
      <dc:creator>Luisyu</dc:creator>
      <dc:date>2019-01-28T07:03:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530555#M145109</link>
      <description>&lt;P&gt;Hi and welcome to the SAS communities &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Where do you want the information to go? In a data set or to the log?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 08:06:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530555#M145109</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-01-28T08:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530556#M145110</link>
      <description>&lt;P&gt;Here is an IML approach&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input age @@;
datalines;
1 2 3 4 5 6 7
;

proc iml;
use have;
   read all var {age};
close have;

do i=1 to nrow(age) until (avg&amp;gt;=5);
   avg=mean(age[i:nrow(age)]);
end;

numsremoved=i-1;
ratio=numsremoved/nrow(age);

create want var {avg numsremoved ratio};
   append;
close want;

quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 28 Jan 2019 08:17:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530556#M145110</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-01-28T08:17:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530558#M145111</link>
      <description>&lt;P&gt;Here is a data step solution which writes the information in the log:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  array temp(100) 8 _temporary_;
  do _N_=1 to nobs;
    set have nobs=nobs;
    temp(_N_)=age;
    end;
  do _N_=1 to nobs while(mean(of temp(*))&amp;lt;5);
    call missing(temp(_N_));
    end;
  nums_removed=_N_-1;
  ratio=nums_removed/nobs;
  if ratio=1 then
    put 'Average 5 was never reached!';
  else
    put nobs= nums_removed= ratio=;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I made the array size 100 to accommodate larger datasets. If you have even more data than that, you will get an error message about an invalid array index; in that case you can increase the number 100.&lt;/P&gt;
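&lt;P&gt;If you would rather not guess the array size, one possible variation (a sketch, not part of the solution above) is to store the observation count in a macro variable first and use it as the array dimension. The macro variable name NOBS is an assumption for illustration, and the sketch assumes HAVE is an ordinary SAS data set (not a view) with at least one observation:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: copy the number of observations in HAVE into macro variable NOBS (hypothetical name) */
data _null_;
  if 0 then set have nobs=nobs;  /* NOBS= is filled at compile time; no rows are read */
  call symputx('nobs', nobs);
  stop;
run;

data _null_;
  array temp(&amp;amp;nobs) 8 _temporary_;  /* sized to the data instead of a fixed 100 */
  do _N_=1 to nobs;
    set have nobs=nobs;
    temp(_N_)=age;
    end;
  do _N_=1 to nobs while(mean(of temp(*))&amp;lt;5);
    call missing(temp(_N_));
    end;
  nums_removed=_N_-1;
  ratio=nums_removed/nobs;
  if ratio=1 then
    put 'Average 5 was never reached!';
  else
    put nobs= nums_removed= ratio=;
run;&lt;/CODE&gt;&lt;/PRE&gt;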
&lt;P&gt;I used the _N_ variable for array indexing, as it is an automatic variable which is always present (so you do not have to drop it explicitly if you create a dataset).&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 08:48:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530558#M145111</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2019-01-28T08:48:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530572#M145119</link>
      <description>&lt;P&gt;Another approach, using a hash object:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
declare hash inp (ordered:'yes');
declare hiter inpi('inp');
rc = inp.definedata('num','age');
rc = inp.definekey('num');
rc = inp.definedone();
num = 0;
do until (eof);
  num + 1;
  set have end = eof;
  inp.add();
end;
removed = 0;
size = num;
newsize = size;
total = 0;
rc = inpi.first();
do while (rc = 0);
  total + age;
  rc = inpi.next();
end;
average = total / size;
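do while (average &amp;lt; 5 and removed &amp;lt; size);  /* stop once every row has been removed */
  removed + 1;
  num = removed;
  rc = inp.remove();
  total = 0;
  newsize = 0;
  rc = inpi.first();
  do while (rc = 0);
    newsize + 1;
    total + age;
    rc = inpi.next();
  end;
  if newsize then average = total / newsize;  /* skip the division when the table is empty */
end;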
rc = inp.output(dataset:"want");
put removed=;
percentage = removed / size;
put percentage=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The advantage of the hash object is that you do not need to make assumptions about the dataset size and it does everything with one sequential read and one sequential write. It is limited by the amount of memory available.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 09:54:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530572#M145119</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2019-01-28T09:54:58Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530584#M145125</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/258175"&gt;@Luisyu&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your dataset is much larger, e.g., like this&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
do _n_=1 to 100000;
  age+ranuni(2718)/7000;
  output;
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;you may want to make the iterative mean calculation more efficient:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have;
var age;
output out=stats n=n sum=s;
run;

data want(keep=n removed ratio);
set stats;
do k=n to 1 by -1;
  set have;
  avg=s/k;
  if avg&amp;gt;=5 then leave;
  s+(-age);
end;
removed=n-k;
ratio=removed/n;
if ratio&amp;lt;1 then output;
else put 'Average 5 was never reached!';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;(adopting&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/76464"&gt;@s_lassen&lt;/a&gt;'s good idea to allow for the case that the target average cannot be reached).&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 11:03:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530584#M145125</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2019-01-28T11:03:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530698#M145143</link>
      <description>&lt;P&gt;If your dataset is sorted, then the task is really about determining the value of firstobs.&amp;nbsp; For instance if the ages are 4,5,5,5 then you would want&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have (firstobs=2);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the maximum age is &amp;lt;5, then you would want a message to that effect in the SAS log.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And if the minimum age is 5 or more, then you don't need to examine the rest of HAVE to know that "firstobs=1" satisfies your objective (but I didn't include this operation in the code below):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input age @@;
datalines;
1 2 3 4 5 5 5 5
;
run;

data _null_;

  array n {0:4} _temporary_ (5*0);

  set have end=end_of_have;

  if age&amp;lt;5 then n{age}+1;
  sum_age+age;

  if end_of_have;
  total_n=_n_;

  if age&amp;lt;5 then do;
    putlog 'Mean age &amp;gt;= 5 not possible. Highest ' age=;
    stop;
  end;

  age=lbound(n);
  if sum_age/total_n&amp;lt;5 then do until(sum_age/total_n&amp;gt;=5);
    if n{age}=0 then age=age+1;
    else do;
      sum_age=sum_age-age;
      total_n=total_n-1;
      n{age}=n{age}-1;
    end;
  end;
  fobs=1 + _n_-total_n;

  call execute (cats('data want; set have (firstobs=',fobs,');run;'));
run;

&lt;/CODE&gt;&lt;/PRE&gt;
&lt;OL style="list-style-position: inside;"&gt;
&lt;LI&gt;A temporary array N holds the frequencies&amp;nbsp;of each age &amp;lt;5, initialized to all zeroes.&amp;nbsp; This provides a way to determine how many of the lowest AGE values can be kept while still generating mean&amp;gt;=5.&amp;nbsp; You can set the lower and upper bounds of N as needed.&lt;/LI&gt;
&lt;LI&gt;The subsetting if statement "IF end_of_have;" allows the subsequent statements to be executed only once, after all of HAVE has been read.&lt;/LI&gt;
&lt;LI&gt;The following "if age&amp;lt;5" provides a check on whether there are ANY ages &amp;gt;=5.&lt;/LI&gt;
&lt;LI&gt;The "if ... then do until ..." loop provides a way to count how many age=1, then age=2, then age=3, etc. need to be skipped to produce mean age&amp;gt;=5.&lt;/LI&gt;
&lt;LI&gt;The call execute statement constructs the next data step, with a properly calculated FIRSTOBS=.&lt;/LI&gt;
&lt;LI&gt;Note that if there is no age&amp;gt;=5, then the next data step is not constructed.&lt;/LI&gt;
&lt;/OL&gt;
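&lt;P&gt;The shortcut mentioned before the code (the minimum age already being 5 or more) could be sketched roughly as follows. It assumes HAVE is sorted by ascending age, so that the first observation holds the minimum; it is an illustration only and is not integrated into the step above:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: if the smallest age is already &amp;gt;= 5, firstobs=1 and nothing needs trimming */
data _null_;
  set have (obs=1);   /* only the first (smallest) age is read */
  if age&amp;gt;=5 then do;
    putlog 'Minimum ' age= 'is already &amp;gt;= 5, keeping all observations.';
    call execute('data want; set have; run;');
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;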
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 18:12:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530698#M145143</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-01-28T18:12:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530817#M145183</link>
      <description>&lt;P&gt;Really appreciate all the feedback. Let me further clarify what I'm looking for.&lt;/P&gt;&lt;P&gt;I have two columns, month and age, with month column already pre-sorted from 201812~201801.&lt;/P&gt;&lt;P&gt;Age column is also already pre-sorted from smallest number to biggest number.&lt;/P&gt;&lt;P&gt;Below is a short clip of my data set:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;MONTH&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Age (normalized)&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-7.60&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-7.17&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-7.14&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-6.05&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-4.85&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-4.67&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-4.67&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-4.23&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-4.15&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-3.99&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-3.50&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-3.07&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-3.01&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD
&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-2.48&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;-2.19&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;I need a program which will read in the data set above and calculate average age by month.&lt;/P&gt;&lt;P&gt;However, if monthly average age is smaller or equal to -0.2, it will remove the topmost data (-7.60) and recalculate the monthly average age. If it is still &amp;lt; or = to&amp;nbsp; -0.2, it will continue removing the next topmost data (-7.17) and recalculate until finally the monthly average age &amp;gt;-0.2. It will then output results like below to new data set. So for example, in Dec'2018 if there are a total of 100 counts of age and 10 of the smallest numbers (10 topmost rows in age column) in that month need to be removed to achieve monthly average age &amp;gt;-0.2, it will output below data with ratio = 0.1 (10 out of 100). So basically, the program will fill in all the data below.&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;MONTH&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Total Count&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Removed 
Count&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Ratio&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201812&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;100&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;10&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;0.1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201811&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201810&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201809&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201808&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201807&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201806&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201805&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201804&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201803&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201802&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;P&gt;201801&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this is clear enough. Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jan 2019 23:15:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530817#M145183</guid>
      <dc:creator>Luisyu</dc:creator>
      <dc:date>2019-01-28T23:15:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530828#M145185</link>
      <description>&lt;P&gt;OK, so now it seems that:&lt;/P&gt;
&lt;OL style="list-style-position: inside;"&gt;
&lt;LI&gt;the data is sorted by descending month and, within each month, by ascending age&lt;/LI&gt;
&lt;LI&gt;you want to do this process for each month&lt;/LI&gt;
&lt;LI&gt;ages need not be integer values&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Is the above correct?&amp;nbsp; If so, then you need to read all ages for a group once to get the initial mean.&amp;nbsp; Then reread them to trim (and count) observations, adjusting the mean as necessary.&amp;nbsp; At the end of the group's second pass you can calculate the ratio:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
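&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (keep=month total_count removed_count ratio);
  /* Pass 1: count the observations and sum the ages for one month */
  do total_count=1 by 1 until (last.month);
    set have;
    by descending month;
    age_sum=sum(age_sum,age);
  end;

  /* Pass 2: reread the same month, removing the smallest ages */
  /* while the mean of the remaining ages is still &amp;lt;= -0.2      */
  removed_count=0;
  removed_sum=0;
  do until (last.month);
    set have;
    by descending month;
    if  (age_sum-removed_sum)/(total_count-removed_count) &amp;lt;= -0.2 then do;
      removed_sum=removed_sum+age;
      removed_count=removed_count+1;
    end;
  end;
  /* One output row per month (implicit output at the end of the step) */
  ratio=removed_count/total_count;
run;&lt;/CODE&gt;&lt;/PRE&gt;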
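&lt;P&gt;To try the step above, a small made-up HAVE data set can be used. The values below are invented for illustration and are already sorted by descending month and ascending age:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical test data, not the poster's actual data */
data have;
  input month age;
datalines;
201812 -1.0
201812 -0.5
201812 0.1
201812 0.4
201811 0.0
201811 0.2
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For 201812 the initial mean is -0.25, so the first age (-1.0) is removed; the mean of the remaining three ages is 0, which is above -0.2, giving removed_count=1 and ratio=0.25. For 201811 the mean is 0.1, so nothing is removed and ratio=0.&lt;/P&gt;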
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jan 2019 00:37:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530828#M145185</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-01-29T00:37:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530850#M145190</link>
      <description>It works perfectly. Thank you!</description>
      <pubDate>Tue, 29 Jan 2019 07:08:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/530850#M145190</guid>
      <dc:creator>Luisyu</dc:creator>
      <dc:date>2019-01-29T07:08:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531555#M145501</link>
      <description>&lt;P&gt;One question. If instead of calculating mean, I want to calculate the standard deviation, how do I&amp;nbsp; modify the program? Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 05:39:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531555#M145501</guid>
      <dc:creator>Luisyu</dc:creator>
      <dc:date>2019-01-31T05:39:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531784#M145630</link>
      <description>&lt;P&gt;The sample mean is&amp;nbsp;&amp;nbsp; sum(of sample values) divided by sample count.&amp;nbsp; The only issue in this problem is to properly remove elements from the sum of sample values, and reduce sample count correspondingly.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The std can be handled the same way, but you also have to track the sum of the squared sample values.&amp;nbsp; Since the sample std is the square root of the sample variance, start with the sample variance formula:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sample_variance.PNG" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/26737i6214C02F960C9382/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sample_variance.PNG" alt="sample_variance.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So, in addition to accumulating sum of the values (used for means and also for the right hand term above), you need to also accumulate sum of the squared values (for the left hand term), and perform the same REMOVE tracking as for the mean.&amp;nbsp; Once the removal totals are complete, just generate the sample variance above and get its square root.&lt;/P&gt;
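&lt;P&gt;As a rough sketch of the bookkeeping described above, the earlier two-pass step could be extended along these lines. It assumes the usual computational form of the sample variance, (sum of squares - (sum)**2/n)/(n-1), and it assumes that rows are still trimmed by the mean rule (mean &amp;lt;= -0.2), with the std reported for the rows that remain. The names AGESQUARE_SUM, REMOVED_SQUARESUM, N, S, SS and STD are illustrative:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (keep=month total_count removed_count ratio std);
  /* Pass 1: per-month count, sum, and sum of squares */
  do total_count=1 by 1 until (last.month);
    set have;
    by descending month;
    age_sum=sum(age_sum,age);
    agesquare_sum=sum(agesquare_sum,age**2);
  end;

  /* Pass 2: same removal rule as before (an assumption), tracking squares in parallel */
  removed_count=0;
  removed_sum=0;
  removed_squaresum=0;
  do until (last.month);
    set have;
    by descending month;
    if  (age_sum-removed_sum)/(total_count-removed_count) &amp;lt;= -0.2 then do;
      removed_sum=removed_sum+age;
      removed_squaresum=removed_squaresum+age**2;
      removed_count=removed_count+1;
    end;
  end;
  ratio=removed_count/total_count;

  /* Sample std of the remaining rows; left missing when fewer than 2 rows remain */
  n=total_count-removed_count;
  s=age_sum-removed_sum;
  ss=agesquare_sum-removed_squaresum;
  if n&amp;gt;1 then std=sqrt(max(0,(ss-s**2/n)/(n-1)));  /* max() guards against tiny negative round-off */
run;&lt;/CODE&gt;&lt;/PRE&gt;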
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 20:00:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531784#M145630</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-01-31T20:00:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531843#M145660</link>
      <description>I know the formula for stdev but I'm having a hard time coding the remove tracking for the stdev into the program in addition to the mean. Can you please modify your program above to include the stdev? Appreciate it!</description>
      <pubDate>Fri, 01 Feb 2019 00:02:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531843#M145660</guid>
      <dc:creator>Luisyu</dc:creator>
      <dc:date>2019-02-01T00:02:41Z</dc:date>
    </item>
    <item>
      <title>Re: How to loop calculating avg of column, then remove topmost data, then recalculate avg again?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531852#M145664</link>
      <description>&lt;P&gt;The reason I put the formula for STD - especially the 2nd variation of the formula - was to formulate it in a way that fits perfectly into the program structure I presented for calculating the mean.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I will not be writing the program, which is just a way of showing you do not yet understand the logic of the program I already provided.&amp;nbsp; Here's a clue:&amp;nbsp;&amp;nbsp; Every place you see AGE_SUM add a parallel statement for AGESQUARE_SUM.&amp;nbsp; Same thing with REMOVED_SUM.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then, where the ratio is calculated, you will have all the variables needed for generating STD.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Feb 2019 00:53:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-loop-calculating-avg-of-column-then-remove-topmost-data/m-p/531852#M145664</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-02-01T00:53:48Z</dc:date>
    </item>
  </channel>
</rss>

