<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Function over a List - Create dataset with various groupping statistics in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491203#M128759</link>
    <description>&lt;P&gt;The general question is whether it is possible to create a function that takes a list, applies a function to each item in that list, and then returns the results in a usable type (list, dataset, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my specific case, I have a dataset with several categorical variables, and I want to return a dataset that contains various summary statistics (mean, count, etc.) as columns, with rows determined by the levels of each categorical variable.&amp;nbsp; I can do this with SQL by creating a summary dataset for each categorical variable and then iteratively combining these, but it is rather ugly and awkward.&amp;nbsp; Is there a better way to do this?&lt;/P&gt;</description>
    <pubDate>Thu, 30 Aug 2018 12:41:55 GMT</pubDate>
    <dc:creator>mark4</dc:creator>
    <dc:date>2018-08-30T12:41:55Z</dc:date>
    <item>
      <title>Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491203#M128759</link>
      <description>&lt;P&gt;The general question is whether it is possible to create a function that takes a list, applies a function to each item in that list, and then returns the results in a usable type (list, dataset, etc.).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my specific case, I have a dataset with several categorical variables, and I want to return a dataset that contains various summary statistics (mean, count, etc.) as columns, with rows determined by the levels of each categorical variable.&amp;nbsp; I can do this with SQL by creating a summary dataset for each categorical variable and then iteratively combining these, but it is rather ugly and awkward.&amp;nbsp; Is there a better way to do this?&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 12:41:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491203#M128759</guid>
      <dc:creator>mark4</dc:creator>
      <dc:date>2018-08-30T12:41:55Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491206#M128762</link>
      <description>&lt;P&gt;Please provide an example, with test data in the form of a datastep, and show what the output should look like.&lt;/P&gt;
&lt;P&gt;This: "&lt;SPAN&gt;The general question is if it is possible to create a function that takes a list, applies a function to each item in that list, and then returns the results in a usable type (list, dataset, etc)." - is what a datastep is.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Dataset in with list&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Datastep runs function against that&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Data out&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 12:47:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491206#M128762</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-08-30T12:47:46Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491229#M128769</link>
      <description>&lt;P&gt;I don't see how to reconcile your description of a data step with my problem.&amp;nbsp; The data step, as far as I know, takes a list of datasets as input and produces a dataset, but that is not what I'm after here: my list is a set of categorical variables from a dataset, and the function takes a variable and produces a dataset.&amp;nbsp; So,&lt;/P&gt;
&lt;P&gt;"Data Step": (dataset_1, ..., dataset_n) -&amp;gt; dataset,&lt;/P&gt;
&lt;P&gt;but I want&lt;/P&gt;
&lt;P&gt;"Wanted_Function": (variable, dataset) -&amp;gt; dataset&lt;/P&gt;
&lt;P&gt;where variable is a column of the input dataset.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since I have a working solution, I'm really interested in knowing whether there are built-in functions or other solutions I might be missing that could clean up my code and reduce the number of datasets produced.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example, if my dataset my_data looked like:&lt;/P&gt;
&lt;PRE&gt;Sex  Region  Salary
M    A       10000
M    B       15000
F    A       11000
F    A       14000&lt;/PRE&gt;
&lt;P&gt;I would want a dataset want_data that looked like:&lt;/P&gt;
&lt;PRE&gt;Category  Count  Avg_Salary
M         2      12500
F         2      12500
A         3      11667
B         1      15000&lt;/PRE&gt;
&lt;P&gt;I can create this in SQL, but as I said, it's ugly: it builds a lot of intermediate datasets (here, just two: group_by_Sex and group_by_Region) that I don't want to keep, and the union statement gets messy as you incorporate more and more categorical variables.&lt;/P&gt;
&lt;PRE&gt;/* Builds one summary table per categorical variable; call inside PROC SQL */
%macro summary_table(categorical_variable);
  create table group_by_&amp;amp;categorical_variable as select
    &amp;amp;categorical_variable as category, count(*) as count, avg(Salary) as Avg_Salary
  from my_data group by &amp;amp;categorical_variable;
%mend;
proc sql noprint;
%summary_table(Sex);
%summary_table(Region);
/* Stack the per-variable summaries into the final table */
create table want_data as (select * from group_by_Sex) union (select * from group_by_Region);
quit;&lt;/PRE&gt;</description>
      <pubDate>Thu, 30 Aug 2018 13:32:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491229#M128769</guid>
      <dc:creator>mark4</dc:creator>
      <dc:date>2018-08-30T13:32:17Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491250#M128778</link>
      <description>&lt;PRE&gt;data have;
 input Sex $ Region $ Salary;
datalines;
M       A             10000
M       B             15000
F        A             11000
F        A             14000
;
run;

proc summary data=have;
  class sex region;
  var salary;
  output out=want sum=sum;
run;&lt;/PRE&gt;
&lt;P&gt;That gives the sums for sex, region, and their combinations.&amp;nbsp; I can't find the post from data_null, but he presented two options for reducing the amount of output. Let me see if I can find it.&lt;/P&gt;
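&lt;P&gt;As a minimal sketch of one way to trim that output (not necessarily either of data_null's options): assuming the have dataset created above, a WAYS 1 statement keeps only the one-way groupings, and N= and MEAN= request the count and average instead of the sum. The names want_1way, want_data, count, avg_salary, and category are made up for illustration, and the COALESCEC step assumes neither class variable has missing values.&lt;/P&gt;
&lt;PRE&gt;/* One-way summaries only: one row per level of sex, one per level of region */
proc summary data=have;
  class sex region;
  ways 1;
  var salary;
  output out=want_1way n=count mean=avg_salary;
run;

/* In a one-way row only one class variable is filled in; fold them into one column */
data want_data;
  length category $ 8;
  set want_1way;
  category = coalescec(sex, region);
  keep category count avg_salary;
run;&lt;/PRE&gt;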
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Ah, here we go:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Summary-table-with-categories-from-multiple-vars/m-p/484836#M125896" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/Summary-table-with-categories-from-multiple-vars/m-p/484836#M125896&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 14:15:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491250#M128778</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-08-30T14:15:08Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491269#M128789</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/149303"&gt;@mark4&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I don't see how to reconcile your description of a data step with my problem.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The request for a data step that demonstrates the structure of the data set and the names and types of its variables is so that we can provide examples that match your situation more closely. A data step lets us create a data set that code can be tested against.&lt;/P&gt;
&lt;P&gt;It often saves a great many posts. Otherwise a suggested solution is provided, the original poster comes back and says "that didn't work", and a round of questions follows to determine why not, only to find out that one of the variables was character, so concepts such as "mean" and "standard deviation" don't actually apply. Also, a fair number of questions here involve data that is poorly structured for a specific task. With an example we can show how to restructure the data as well as give the solution steps.&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 14:52:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491269#M128789</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-08-30T14:52:00Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491292#M128808</link>
      <description>&lt;P&gt;Maybe something like this.&amp;nbsp; I like the STACKODS output from PROC MEANS for summary stats.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data class;
   set sashelp.class;
   trt = rantbl(456789,.4);
   run;
proc print;
   run;
ods select none;
ods output summary=summary;
proc means data=class chartype descendtypes stackods missing n mean stddev median min max;
   class age sex trt / mlf;
   types trt*(age sex);
   run;
ods select all;
proc print;
   run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 534px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/22895i400083883D2910FC/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 15:19:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491292#M128808</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2018-08-30T15:19:04Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491419#M128862</link>
      <description>&lt;P&gt;For categorical variables you would use PROC FREQ to get counts.&lt;/P&gt;
&lt;P&gt;For numerical/continuous variables you would use PROC MEANS. You've received several answers using PROC MEANS, so I'll show you PROC FREQ for categorical data.&lt;/P&gt;
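&lt;P&gt;A minimal sketch of that PROC FREQ step, using sashelp.class as stand-in data (the dataset name freq_counts is made up for illustration). The ODS table OneWayFreqs collects the one-way tables for every variable on the TABLES statement into a single dataset:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Capture the one-way frequency tables as a dataset as well as printed output */
ods output OneWayFreqs=freq_counts;
proc freq data=sashelp.class;
  tables sex age;
run;

/* One row per level of each variable, with Frequency and Percent columns */
proc print data=freq_counts;
run;&lt;/CODE&gt;&lt;/PRE&gt;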
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And here's a way to list all the possible levels and counts:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/e0903d269d4a71316a4e" target="_blank"&gt;https://gist.github.com/statgeek/e0903d269d4a71316a4e&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here's a quick way to do a table summary, with a single cross variable:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/0c4aeec9053cf8050be18a03b842c1b9" target="_blank"&gt;https://gist.github.com/statgeek/0c4aeec9053cf8050be18a03b842c1b9&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another good PROC to keep in your back pocket for summary tables is PROC TABULATE. It's pretty powerful for creating various summaries and is probably underutilized.&lt;/P&gt;
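&lt;P&gt;As a rough sketch of what that can look like, again with sashelp.class as stand-in data and a made-up output name tab_summary: listing several class variables side by side in the row dimension stacks their levels as rows, with the requested statistics as columns.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc tabulate data=sashelp.class out=tab_summary;
  class sex age;
  var height;
  /* rows: each level of sex, then each level of age; columns: N and mean of height */
  table sex age, height*(n mean);
run;&lt;/CODE&gt;&lt;/PRE&gt;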
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 20:42:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/491419#M128862</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-08-30T20:42:12Z</dc:date>
    </item>
    <item>
      <title>Re: Function over a List - Create dataset with various groupping statistics</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/492603#M129478</link>
      <description>No, I get that. My intent wasn't so much to get a specific solution as to learn the general principles for getting to one, so I was trying to be as general as possible. Having the various options discussed below is very helpful for me, since a lot of SAS requires you to know which procedures are even out there for you to use.</description>
      <pubDate>Wed, 05 Sep 2018 11:26:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Function-over-a-List-Create-dataset-with-various-groupping/m-p/492603#M129478</guid>
      <dc:creator>mark4</dc:creator>
      <dc:date>2018-09-05T11:26:00Z</dc:date>
    </item>
  </channel>
</rss>

