<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Summarize a large table by calculating counts, means and standard deviations in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875687#M42962</link>
    <description>Thank you!&lt;BR /&gt;</description>
    <pubDate>Sun, 14 May 2023 16:54:43 GMT</pubDate>
    <dc:creator>St_Rokas</dc:creator>
    <dc:date>2023-05-14T16:54:43Z</dc:date>
    <item>
      <title>Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875634#M42952</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to summarize and average data in a large table with up to 2.5 million rows and 120 columns. The structure is like this: a single CASE is divided into thousands of REGIONS, each of which can have a RANK (usually ranging from -30 to 30 including zero) and more than a hundred measured different FEATURES in that region:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-05-14 062716.png" style="width: 464px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/83953iBAF174BAE0E48604/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-05-14 062716.png" alt="Screenshot 2023-05-14 062716.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to convert this data into two tables. One of them lists down unique CASES with all possible RANK values as colmns and has the count of REGIONS with such a rank in a cell, like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-05-14 064325.png" style="width: 593px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/83954i65E8359C14AFCD84/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-05-14 064325.png" alt="Screenshot 2023-05-14 064325.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The second table should be a summary list: mean and standard deviation of each FEATURE per RANK present in a CASE, like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-05-14 065211.png" style="width: 763px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/83955iFD08758EEC5A0FE8/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-05-14 065211.png" alt="Screenshot 2023-05-14 065211.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Of course the two tables can be combined by simply adding a column RANK_COUNT in the second table.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-05-14 071715.png" style="width: 861px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/83956i0A7B82591B3FD122/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot 2023-05-14 071715.png" alt="Screenshot 2023-05-14 071715.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would appreciate your help,&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Rokas&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 04:18:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875634#M42952</guid>
      <dc:creator>St_Rokas</dc:creator>
      <dc:date>2023-05-14T04:18:17Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875672#M42958</link>
      <description>&lt;P&gt;Hi and welcome to SAS-communities!&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to get help faster provide some example data in usable form, not as pictures... the best as a data step with cards e.g.,&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input x a b $ c d $;
cards;
1 11 A 100 ABC
2 22 B 200 DEF
3 33 C 300 GHI
4 44 D 400 JKL
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Read about it here:&amp;nbsp;&lt;A href="https://communities.sas.com/t5/help/faqpage/faq-category-id/posting/title/use_syntax_highlighter" target="_blank"&gt;https://communities.sas.com/t5/help/faqpage/faq-category-id/posting/title/use_syntax_highlighter&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 14:13:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875672#M42958</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2023-05-14T14:13:32Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875673#M42959</link>
      <description>&lt;P&gt;I think the most "standard way" would be to use the following approach:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  call streaminit(42);
  do Case="A","C","E","G","I","K","M","O","Q","S","U","W","Y";
    do Region = 1 to rand("integer",200, 300);
      Rank = rand("integer",-5, 5);
      array f feature_01-feature_20;
      do over f;
        f=rand("normal");
      end;
      output;
    end;
  end;
run;


proc sql;
  create table r as
    select C as Case, Rank, count(region) as rank_count
    from have(rename=(case=c))
    group by C, Rank
    order by C, Rank
  ;
quit;


proc summary data=have NWAY;
  class Case Rank;
  var feature_:;
  output  out=m_sd(drop=_:) MEAN(feature_:)= STD(feature_:)= / AUTONAME; 
run;

proc transpose data=m_sd(obs=0) out=rename(keep=_name_);
  var feature_:; 
run;

filename f TEMP;
data _null_;
file f;

put 'data Combined; merge r m_sd; by Case Rank; rename';

do until(EOF);
  set rename end=EOF;
  _name2_ = catx("_",scan(_NAME_,3,"_"),scan(_NAME_,1,"_"),scan(_NAME_,2,"_"));
  put _NAME_ "=" _NAME2_;
end;

put ';run;';
stop; 
run;
%include f /source2;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;[EDIT:] &lt;STRIKE&gt;the only thing is that variables with mean and std will be called: feature_XX_mean and&amp;nbsp;feature_XX_stdDev. If it is a big problem let me know.&lt;/STRIKE&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 14:53:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875673#M42959</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2023-05-14T14:53:21Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875681#M42960</link>
      <description>&lt;P&gt;Hello &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/442750"&gt;@St_Rokas&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If, in the last table, you want to "combine" the first and second table so that Rank_Count is vertical, don't create a table (as in your first table) such that the counts are horizontal. You are simply making your work more difficult. Create counts so that they are vertical.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PROC SUMMARY will create the final table, without first having to create the first two tables. As you have not provided data in a usable form (a "usable form" means working SAS data step code as shown by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/35763"&gt;@yabwon&lt;/a&gt;, and not any other method) the following code is UNTESTED.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class case region rank;
    var feature_01 feature_02 feature_03;
    output out=want n(feature_01)=rank_count mean=f1_mean f2_mean f3_mean
        stddev=f1_std f2_std f3_std;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 16:35:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875681#M42960</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-05-14T16:35:38Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875686#M42961</link>
      <description>Thank you for the tip, Bart - next time I will most certainly do that.</description>
      <pubDate>Sun, 14 May 2023 16:53:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875686#M42961</guid>
      <dc:creator>St_Rokas</dc:creator>
      <dc:date>2023-05-14T16:53:37Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875687#M42962</link>
      <description>Thank you!&lt;BR /&gt;</description>
      <pubDate>Sun, 14 May 2023 16:54:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875687#M42962</guid>
      <dc:creator>St_Rokas</dc:creator>
      <dc:date>2023-05-14T16:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: Summarize a large table by calculating counts, means and standard deviations</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875688#M42963</link>
      <description>Thank you for taking time to help!</description>
      <pubDate>Sun, 14 May 2023 16:56:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Summarize-a-large-table-by-calculating-counts-means-and-standard/m-p/875688#M42963</guid>
      <dc:creator>St_Rokas</dc:creator>
      <dc:date>2023-05-14T16:56:17Z</dc:date>
    </item>
  </channel>
</rss>

