<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Grouping stats in no grouped data set in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386441#M92553</link>
    <description>&lt;P&gt;Any time I see a request to merge summary values into a data set I want to know how the resulting data set is to be used.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next, with the same question, comes from "I have to create over 500 variables like mentioned one". Is this 500 in a single set or across the 20 (adding 25 variables to a single set makes much more sense than 500 to each of 20 sets).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How you use the result can make a difference on approachs. For some things I would tend to make a custom format and attach that to a variable for reporting purposes. For example, your "mean_age_by_sex" variable is going to take exactly 2 values in any given data set.&lt;/P&gt;
&lt;P&gt;So an additional variable is really not needed as if the subject is male or female you already know the value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It would not hurt to mention that you started with SASHELP.CLASS for example data so folks can work with the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;with that said:&lt;/P&gt;
&lt;PRE&gt;proc summary data=sashelp.class;
   class sex age;
   var height weight;
   output out=classsummary mean= max= min= std= median= range= /autoname;
run;&lt;/PRE&gt;
&lt;P&gt;Will in a single pass through the data provide the requested statistics for 1) All records 2) by sex 3) by age and 4) by age and sex.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The autoname appends the statistic to the name of the variable. There will be a variable in the output set named _type_. It will have values of 0 to 3 in this case and indicates which combinations of group variables are used to make the statistic.&lt;/P&gt;
&lt;P&gt;I would be strongly tempted to not merge the statistics onto the data until I am actually using them as then I can indicate which _type_ to use and match on the appropriate class variable(s) or none at all. But that's me.&lt;/P&gt;</description>
    <pubDate>Tue, 08 Aug 2017 22:33:11 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2017-08-08T22:33:11Z</dc:date>
    <item>
      <title>Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386439#M92551</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I&amp;nbsp; am struggling with the groups' statistics in not grouped data set, I would be grateful if anyone please tell me an efficient way to count stats like std, mean, max by a group and add it to the original dataset. I found this&amp;nbsp;&lt;A href="http://support.sas.com/resources/papers/proceedings11/019-2011.pdf" target="_blank" rel="nofollow noopener noreferrer"&gt;http://support.sas.com/resources/papers/proceedings11/019-2011.pdf,&lt;/A&gt; but they show how to count or sum not mentioned std, max, mean. The other problem is efficiency, I have to create over 500 variables like mentioned one, working with more than 20 large tables(over 10 mln obs in each) so sorting and merging is time-consuming(SQL package process everything in about 15h, nevertheless there is need to rewrite everything into 4GL, SQL Oracle can hadle this problem with overt (partition by)). &amp;nbsp;&lt;/P&gt;&lt;P&gt;In the table below, you will find desirable results. &amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Name&lt;/TD&gt;&lt;TD&gt;Sex&lt;/TD&gt;&lt;TD&gt;Age&lt;/TD&gt;&lt;TD&gt;Weight&lt;/TD&gt;&lt;TD&gt;Height&lt;/TD&gt;&lt;TD&gt;mean_age_by_sex&lt;/TD&gt;&lt;TD&gt;std_weight_by_age&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Philip&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;16&lt;/TD&gt;&lt;TD&gt;72&lt;/TD&gt;&lt;TD&gt;150&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Janet&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;62,5&lt;/TD&gt;&lt;TD&gt;112,5&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;1,81573&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Mary&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;66,5&lt;/TD&gt;&lt;TD&gt;112&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;1,81573&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Ronald&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;67&lt;/TD&gt;&lt;TD&gt;133&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;1,81573&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;William&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;66,5&lt;/TD&gt;&lt;TD&gt;112&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;1,81573&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Carol&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;62,8&lt;/TD&gt;&lt;TD&gt;102,5&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;2,425902&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Judy&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;64,3&lt;/TD&gt;&lt;TD&gt;90&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;2,425902&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Alfred&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;69&lt;/TD&gt;&lt;TD&gt;112,5&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;2,425902&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Henry&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;63,5&lt;/TD&gt;&lt;TD&gt;102,5&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;2,425902&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Alice&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;56,5&lt;/TD&gt;&lt;TD&gt;84&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;3,670907&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Barbara&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;65,3&lt;/TD&gt;&lt;TD&gt;98&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;3,670907&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Jeffrey&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;62,5&lt;/TD&gt;&lt;TD&gt;84&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;3,670907&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Jane&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;59,8&lt;/TD&gt;&lt;TD&gt;84,5&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;2,949305&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Louise&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;56,3&lt;/TD&gt;&lt;TD&gt;77&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;2,949305&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;James&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;57,3&lt;/TD&gt;&lt;TD&gt;83&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;2,949305&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;John&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;59&lt;/TD&gt;&lt;TD&gt;99,5&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;2,949305&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Robert&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;64,8&lt;/TD&gt;&lt;TD&gt;128&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;2,949305&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Joyce&lt;/TD&gt;&lt;TD&gt;F&lt;/TD&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;51,3&lt;/TD&gt;&lt;TD&gt;50,5&lt;/TD&gt;&lt;TD&gt;13,22&lt;/TD&gt;&lt;TD&gt;3,1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Thomas&lt;/TD&gt;&lt;TD&gt;M&lt;/TD&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;57,5&lt;/TD&gt;&lt;TD&gt;85&lt;/TD&gt;&lt;TD&gt;13,4&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;3,1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Tue, 08 Aug 2017 22:08:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386439#M92551</guid>
      <dc:creator>Matt3</dc:creator>
      <dc:date>2017-08-08T22:08:39Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386441#M92553</link>
      <description>&lt;P&gt;Any time I see a request to merge summary values into a data set I want to know how the resulting data set is to be used.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next, with the same question, comes from "I have to create over 500 variables like mentioned one". Is this 500 in a single set or across the 20 (adding 25 variables to a single set makes much more sense than 500 to each of 20 sets).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How you use the result can make a difference on approachs. For some things I would tend to make a custom format and attach that to a variable for reporting purposes. For example, your "mean_age_by_sex" variable is going to take exactly 2 values in any given data set.&lt;/P&gt;
&lt;P&gt;So an additional variable is really not needed as if the subject is male or female you already know the value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It would not hurt to mention that you started with SASHELP.CLASS for example data so folks can work with the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;with that said:&lt;/P&gt;
&lt;PRE&gt;proc summary data=sashelp.class;
   class sex age;
   var height weight;
   output out=classsummary mean= max= min= std= median= range= /autoname;
run;&lt;/PRE&gt;
&lt;P&gt;Will in a single pass through the data provide the requested statistics for 1) All records 2) by sex 3) by age and 4) by age and sex.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The autoname appends the statistic to the name of the variable. There will be a variable in the output set named _type_. It will have values of 0 to 3 in this case and indicates which combinations of group variables are used to make the statistic.&lt;/P&gt;
&lt;P&gt;I would be strongly tempted to not merge the statistics onto the data until I am actually using them as then I can indicate which _type_ to use and match on the appropriate class variable(s) or none at all. But that's me.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Aug 2017 22:33:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386441#M92553</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-08-08T22:33:11Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386484#M92574</link>
      <description>&lt;P&gt;1. Resulting data set is going to be use as analytical base table customer 360 customer view*.&lt;BR /&gt;2. Current SQL package build about 20 ABT tables from diffrent point of view and merge them into one ABT 360 customer view*.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Aug 2017 19:07:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386484#M92574</guid>
      <dc:creator>Matt3</dc:creator>
      <dc:date>2017-08-09T19:07:36Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386609#M92629</link>
      <description>&lt;P&gt;write multiple SQL statement, everyone correspond to a statisical .&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table a as
 select *,mean(age) as mean_age_by_sex
  from sashelp.class
   group by sex;
create table b as
 select *,std(weight) as std_weight_by_age
  from a 
   group by age;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 09 Aug 2017 13:28:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386609#M92629</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-08-09T13:28:21Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386678#M92654</link>
      <description>&lt;P&gt;I have no idea what "&amp;nbsp;base table customer 360 degree" means.&lt;/P&gt;
&lt;P&gt;What specific types of analysis are contemplated? I am not seeing an advantage to all of the redundant information in the proposed structure especially with such large numbers of records.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And is "ABT table" supposed to be something special or is that local to your organization jargon?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Would it be possible to post a few rows of one of these final tables?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 09 Aug 2017 15:32:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386678#M92654</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-08-09T15:32:13Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386773#M92682</link>
      <description>&lt;P&gt;No that`s no possible(personal details), unfortunately.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ABT is commonly used for predictive models, especially in the banking sector or telecommunication industry.&lt;/P&gt;&lt;P&gt;&lt;A href="https://en.wikipedia.org/wiki/Analytical_base_table" target="_blank"&gt;https://en.wikipedia.org/wiki/Analytical_base_table&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Nevertheless, thank you for your help. I am going to use proc summary.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Aug 2017 19:24:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386773#M92682</guid>
      <dc:creator>Matt3</dc:creator>
      <dc:date>2017-08-09T19:24:33Z</dc:date>
    </item>
    <item>
      <title>Re: Grouping stats in no grouped data set</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386786#M92688</link>
      <description>&lt;P&gt;I am not a banking expert and only a basic modeller but this sounds like one approach, depending on how many individuals you have, would be&lt;/P&gt;
&lt;P&gt;1) summarize from the entire data such as the Proc Summary shown.&lt;/P&gt;
&lt;P&gt;2) select a sample from the main data, maybe a 20%, or enough to give a reasonable shot at developing a model&lt;/P&gt;
&lt;P&gt;3) test the model with the remaining data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the time crunch is the mergin this may help.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Aug 2017 19:37:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Grouping-stats-in-no-grouped-data-set/m-p/386786#M92688</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-08-09T19:37:52Z</dc:date>
    </item>
  </channel>
</rss>

