<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: calculate multiple activities for some variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932329#M366751</link>
    <description>&lt;P&gt;Some of the below code is so we don't have to fool around with "100's of variable" names.&lt;/P&gt;
&lt;P&gt;I changed some values in the data step so there were some examples of the different ranges of CV to test.&lt;/P&gt;
&lt;P&gt;Note that reshaping the data allows sorting into the order and selecting the value you want after adding an ordering variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;Data have;
Input custID YYYYMM X Y Z R T;
datalines;
1 202206 100 20 900 40 50
1 202207 15 20 25 30 35
1 202208 30 20 15 10 12
1 202209 15 18 21 27 31 
1 202210 19 22 40 15 12
1 202211 40 20 30 10 25
2 202210 25 45 35 15 10
2 202211 20 16 18 24 31
;
Run;

proc transpose data=have out=trans (rename=(_name_=Var))
   ;
   by custid;
   var x y z;
run;

proc sort data=trans;
   by custid Var;
run;

proc transpose data=trans out=trans2;
  by custid Var;
  var col: ;
run;

proc sort data=trans2;
  by custid Var Col1;
run;
  
proc summary data=trans2 nway;
   class custid var;
   var col1;
   output out=summary (drop=_type_ _freq_) cv=CV mean= mean n=n std=std/autoname;
run;
/* the above cv is the percent, ie std/mean multiplied by 100  */
data need;
   set summary;
   cv=cv/100;
   If n ge 4 then do;
      if cv le 0.8 then calc=mean;
      else IF 0.8&amp;lt; CV le 1.5 then use=2;
      else IF CV&amp;gt;1.5 then use=1;
   end;
   else calc =.;
run;

data order;
   set trans2 (drop=_name_);
   by custid var;
   retain use;
   if first.var then use=1;
   else use+1;
run;

data need2;
   merge order 
         need (in=inneed)
   ;
   by custid var use;
   if inneed;
   if use then calc=col1;
run;

proc transpose data=need2 out=want (drop=_name_)
     prefix=calc_
;
   by custid;
   id var;
   var calc;
run;
&lt;/PRE&gt;
&lt;P&gt;If you have hundreds of thousands of observations the transpose steps may take a long time and hopefully won't run out of memory.&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jun 2024 21:42:43 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2024-06-13T21:42:43Z</dc:date>
    <item>
      <title>calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932261#M366731</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;
&lt;P&gt;The raw data set have till 6 rows for each customer (Each row represent information of a month).&lt;/P&gt;
&lt;P&gt;It might happen that customer have less than 6 rows.&lt;/P&gt;
&lt;P&gt;From list of variables (let's say X,Y,Z) I need to calculate for each variable a new calculated variable (called calc_X,Calc_Y,Calc_Z)&amp;nbsp; .&lt;/P&gt;
&lt;P&gt;The logic how to calculate these 3 variables is as following:&lt;/P&gt;
&lt;P&gt;1-If customer have less then 4 rows then each of the variables be null (calc_X,Calc_Y,Calc_Z)&lt;/P&gt;
&lt;P&gt;2-If customer have at least 4 rows then need to calculate do this calculation for each of the variables (X,Y,Z). calculate CV(STD/AVG)&lt;/P&gt;
&lt;P&gt;IF CV&amp;lt;=0.8 then the calculated value equal to average&lt;/P&gt;
&lt;P&gt;else&amp;nbsp;IF 0.8&amp;lt;CV&amp;lt;=1.5 then calculated value equal to the value of second observation after sort it from low to high&lt;/P&gt;
&lt;P&gt;else&amp;nbsp;IF CV&amp;gt;1.5 then calculated value equal to the value of First observation after sort it from low to high (note- sort by the variable that we want to calculate. For example: if we calculate calc_X then sort by X)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is example to raw data set.&lt;/P&gt;
&lt;P&gt;Please not the request to calculated : Calc_X,Calc_Y,Calc_Z (and not calculate for R and T)&lt;/P&gt;
&lt;P&gt;In real life I have 100 variables to calculate&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data have;
Input custID YYYYMM X Y Z R T;&lt;BR /&gt;cards;
1 202206 10 20 30 40 50
1 202207 15 20 25 30 35
1 202208 30 20 15 10 12
1 202209 15 18 21 27 31 
1 202210 19 22 40 15 12
1 202211 40 20 30 10 25
2 202210 25 45 35 15 10
2 202211 20 16 18 24 31
;
Run;


&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;What id the recommended way to do it please?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 20:50:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932261#M366731</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-13T20:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932293#M366740</link>
      <description>&lt;PRE&gt;else IF 0.8&amp;lt;CV&amp;lt;=1.5 then calculated value equal to the value of second observation after sort it from low to high&lt;BR /&gt;else IF CV&amp;gt;1.5 then calculated value equal to the value of First observation after sort it from low to high&lt;/PRE&gt;
&lt;P&gt;Sort by which variable? The value of what?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note that your data step does not have a Datalines/Cards statement and hence does not run.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 20:40:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932293#M366740</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-06-13T20:40:32Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932297#M366743</link>
      <description>&lt;P&gt;&amp;nbsp; sort by the variable that we want to calculate. For example: if we calculate calc_X then sort by X&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 20:50:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932297#M366743</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-13T20:50:37Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932329#M366751</link>
      <description>&lt;P&gt;Some of the below code is so we don't have to fool around with "100's of variable" names.&lt;/P&gt;
&lt;P&gt;I changed some values in the data step so there were some examples of the different ranges of CV to test.&lt;/P&gt;
&lt;P&gt;Note that reshaping the data allows sorting into the order and selecting the value you want after adding an ordering variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;Data have;
Input custID YYYYMM X Y Z R T;
datalines;
1 202206 100 20 900 40 50
1 202207 15 20 25 30 35
1 202208 30 20 15 10 12
1 202209 15 18 21 27 31 
1 202210 19 22 40 15 12
1 202211 40 20 30 10 25
2 202210 25 45 35 15 10
2 202211 20 16 18 24 31
;
Run;

proc transpose data=have out=trans (rename=(_name_=Var))
   ;
   by custid;
   var x y z;
run;

proc sort data=trans;
   by custid Var;
run;

proc transpose data=trans out=trans2;
  by custid Var;
  var col: ;
run;

proc sort data=trans2;
  by custid Var Col1;
run;
  
proc summary data=trans2 nway;
   class custid var;
   var col1;
   output out=summary (drop=_type_ _freq_) cv=CV mean= mean n=n std=std/autoname;
run;
/* the above cv is the percent, ie std/mean multiplied by 100  */
data need;
   set summary;
   cv=cv/100;
   If n ge 4 then do;
      if cv le 0.8 then calc=mean;
      else IF 0.8&amp;lt; CV le 1.5 then use=2;
      else IF CV&amp;gt;1.5 then use=1;
   end;
   else calc =.;
run;

data order;
   set trans2 (drop=_name_);
   by custid var;
   retain use;
   if first.var then use=1;
   else use+1;
run;

data need2;
   merge order 
         need (in=inneed)
   ;
   by custid var use;
   if inneed;
   if use then calc=col1;
run;

proc transpose data=need2 out=want (drop=_name_)
     prefix=calc_
;
   by custid;
   id var;
   var calc;
run;
&lt;/PRE&gt;
&lt;P&gt;If you have hundreds of thousands of observations the transpose steps may take a long time and hopefully won't run out of memory.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2024 21:42:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932329#M366751</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-06-13T21:42:43Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932333#M366753</link>
      <description>In real there are 12 million rows with 50 columns to calculate</description>
      <pubDate>Thu, 13 Jun 2024 21:53:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932333#M366753</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-13T21:53:14Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932345#M366764</link>
      <description>&lt;P&gt;You can effectively reshape the data by using a hash object, reducing code to a single DATA step.&amp;nbsp; The hash object H below has only 3 dataitems (with variables _VALUES1 through _VALUES6&lt;STRIKE&gt;3&lt;/STRIKE&gt;).&amp;nbsp; One dataitem (think one row) for X.&amp;nbsp; One for Y and one for Z.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using dataset HAVE as provided by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;, you can do the below.\:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
Input custID YYYYMM X Y Z R T;
datalines;
1 202206 100 20 900 40 50
1 202207 15 20 25 30 35
1 202208 30 20 15 10 12
1 202209 15 18 21 27 31 
1 202210 19 22 40 15 12
1 202211 40 20 30 10 25
2 202210 25 45 35 15 10
2 202211 20 16 18 24 31
run;
data want (drop=_: x y z);
  set have (keep=custid x y z);
  by custid;

  array _values{6} ;
  length _key 8;
  if _n_=1 then do;
    declare hash h ();
      h.definekey('_key');
      h.definedata('_values1','_values2','_values3','_values4','_values5','_values6');
      h.definedone();
  end;

  if first.custid then call missing(_n,of _:);
  _n+1;

  array xyz  {3} x y z;
  do _key=1 to 3;
    _rc=h.find();
    _values{_n}=xyz{_key};
    h.replace();
  end;

  if last.custid;

  array  calc{*} x_calc y_calc z_calc ;

  if _n&amp;lt;4 then call missing(of calc{*});
  else do _key=1 to 3;
    h.find();
    _cv=std(of _values{*})/mean(of _values{*});
    if _cv&amp;lt;=0.8 then calc{_key}=mean(of _values{*}); else
    if _cv&amp;lt;=1.5 then calc{_key}=smallest(2,of _values{*});  else
    calc{_key}=min(of _values{*});
  end;
  h.clear();
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 12:50:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932345#M366764</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2024-06-14T12:50:32Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932347#M366766</link>
      <description>Please note that if the 6 values are for example : 20,10,20,20,20,10&lt;BR /&gt;Then after order it from low to high&lt;BR /&gt;10,10,20,20,20,20&lt;BR /&gt;Value of 1st is 10&lt;BR /&gt;Value of 2nd is 10&lt;BR /&gt;Mean is 100/6&lt;BR /&gt;Will smallest 2 function that used give 10 or 20?</description>
      <pubDate>Fri, 14 Jun 2024 01:03:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932347#M366766</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-14T01:03:59Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932362#M366776</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Please note that if the 6 values are for example : 20,10,20,20,20,10&lt;BR /&gt;Then after order it from low to high&lt;BR /&gt;10,10,20,20,20,20&lt;BR /&gt;Value of 1st is 10&lt;BR /&gt;Value of 2nd is 10&lt;BR /&gt;Mean is 100/6&lt;BR /&gt;Will smallest 2 function that used give 10 or 20?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Why not write some code and test it yourself if you have questions:&lt;/P&gt;
&lt;PRE&gt;data example;
   result = smallest(2,10,10,20,20,20,20);
   /* and without ordering the values*/
   result2 = smallest(2,20,10,20,20,10,20);
run;&lt;/PRE&gt;</description>
      <pubDate>Fri, 14 Jun 2024 03:57:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932362#M366776</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-06-14T03:57:49Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932366#M366778</link>
      <description>&lt;P&gt;Based on anwser of&amp;nbsp;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;, here is an DOW-Loop + Array version:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  do i=1 by 1 until(last.custid);
    set have;
    by custid;
    array _raw_[*] x y z;
    array _val_[3,99]_temporary_;
    array _cal_[*] calc_x calc_y calc_z;
    do j=1 to dim(_raw_);
      _val_[j,i]=_raw_[j];
    end;
  end;

  array _tmp_[99]_temporary_;
  if i&amp;gt;4 then do j=1 to dim(_raw_);
    do k=1 to i;
      _tmp_[k]=_val_[j,k];
    end;
    cv=cv(of _tmp_[*])/100;     *The function CV() multipy the result 100 times by default;

    if cv&amp;gt;1.5 then _cal_[j]=min(of _tmp_[*]);
    else if cv&amp;gt;0.8 then _cal_[j]=smallest(2,of _tmp_[*]);
    else if cv&amp;gt;. then _cal_[j]=mean(of _tmp_[*]);
  end;
  drop i j k cv;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Array&amp;nbsp;&lt;CODE class=" language-sas"&gt;_val_&lt;/CODE&gt; has two dimension: the first for number of&amp;nbsp;&lt;SPAN&gt;variables, the second for oberservations of different&amp;nbsp;&lt;CODE class=" language-sas"&gt;custid&lt;/CODE&gt;&amp;nbsp;, you can use bigger numbers to process more variables and observations, with not signicicant increasment on time consuming.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 06:14:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932366#M366778</guid>
      <dc:creator>whymath</dc:creator>
      <dc:date>2024-06-14T06:14:32Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932375#M366785</link>
      <description>If I have 20 variables and 2 million customers ( let's say that each customer has max 6 rows but can have less ). What will be written in _val_[3,99]?&lt;BR /&gt;I understand that instead of 3 need to write 20.but what about 99? What should write there ?</description>
      <pubDate>Fri, 14 Jun 2024 08:09:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932375#M366785</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-14T08:09:12Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932383#M366788</link>
      <description>&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;What happens if mean=0 ? then CV calculation will get error&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 09:39:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932383#M366788</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-14T09:39:05Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932384#M366789</link>
      <description>&lt;P&gt;Where did you apply the logic of second smallest value or smallest value ?&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 09:46:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932384#M366789</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2024-06-14T09:46:44Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932387#M366790</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What happens if mean=0 ? then CV calculation will get error&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;It does not generate an error, it generates a NOTE and the rest of the data set continues to run. You should try it and find out what happens when mean=0.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If what does happen isn't what you want, what do you want to happen in that case? Tell us what you want.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 10:17:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932387#M366790</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2024-06-14T10:17:39Z</dc:date>
    </item>
    <item>
      <title>Re: calculate multiple activities for some variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932589#M366871</link>
      <description>It can be _val_[20,6]. This array is designed to collect all values of specified variables, so the second dimension  could have a big number like 99. But if you can sure that each customer has max 6 rows, 6 is enough to be the second dimension.</description>
      <pubDate>Mon, 17 Jun 2024 02:15:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-multiple-activities-for-some-variables/m-p/932589#M366871</guid>
      <dc:creator>whymath</dc:creator>
      <dc:date>2024-06-17T02:15:31Z</dc:date>
    </item>
  </channel>
</rss>

