<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How can I determine the sample size for categorical data in SAS? in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644136#M30885</link>
    <description>&lt;P&gt;I have a data set with 54 million observations, 10 categorical variables (coded 0 or 1, or S or N), and 2 numeric variables; some variables have missing values.&lt;/P&gt;&lt;P&gt;I don't know how to determine the sample size or which method to use in SAS.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(The goal is to do exploratory analysis and clustering.)&lt;/P&gt;</description>
    <pubDate>Thu, 30 Apr 2020 03:26:49 GMT</pubDate>
    <dc:creator>al165275</dc:creator>
    <dc:date>2020-04-30T03:26:49Z</dc:date>
    <item>
      <title>How can I determine the sample size for categorical data in SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644136#M30885</link>
      <description>&lt;P&gt;I have a data set with 54 million observations, 10 categorical variables (coded 0 or 1, or S or N), and 2 numeric variables; some variables have missing values.&lt;/P&gt;&lt;P&gt;I don't know how to determine the sample size or which method to use in SAS.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(The goal is to do exploratory analysis and clustering.)&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 03:26:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644136#M30885</guid>
      <dc:creator>al165275</dc:creator>
      <dc:date>2020-04-30T03:26:49Z</dc:date>
    </item>
    <item>
      <title>Re: How can I determine the sample size for categorical data in SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644216#M30887</link>
      <description>&lt;P&gt;What do you mean by determining a sample size?&amp;nbsp; With 54 million observations you have all the sample size you need for any inferential statistics, and sample size is not a consideration for exploratory statistics.&amp;nbsp; If you mean what proportion to sample so that you can do the exploratory stats, I would say 1% would be adequate.&amp;nbsp; If you feel you need a better estimate, then try bootstrapping with replacement.&amp;nbsp; Generate a hundred or so 1% samples and then look at the distribution of the sampled parameters.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
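&lt;P&gt;A minimal sketch of the repeated-sampling idea above, not a definitive implementation. The data set name WORK.POPULATION, the variable FLAG1 (assumed here to be a numeric 0/1 variable), and the seed are hypothetical. METHOD=SRS draws each 1% sample without replacement; METHOD=URS with the OUTHITS option could be swapped in to sample with replacement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Draw 100 independent 1% samples; each is labelled by the Replicate variable. */
/* WORK.POPULATION and FLAG1 are hypothetical names.                           */
proc surveyselect data=work.population out=work.samples
                  method=srs      /* simple random sampling, no replacement */
                  samprate=0.01   /* 1% of the rows per sample              */
                  reps=100        /* number of samples to draw              */
                  seed=20200430;  /* fixed seed so the draw is repeatable   */
run;

/* Proportion of FLAG1 = 1 within each sample (mean of a 0/1 variable). */
proc means data=work.samples noprint;
  by Replicate;
  var flag1;
  output out=work.rep_means mean=p_flag1;
run;

/* Distribution of the 100 sample proportions. */
proc univariate data=work.rep_means;
  var p_flag1;
  histogram p_flag1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the sample proportions cluster tightly, a single 1% sample is probably enough for exploratory work. Note that the combined output of 100 samples at 1% has about as many rows as the original table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;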
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 11:45:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644216#M30887</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2020-04-30T11:45:14Z</dc:date>
    </item>
    <item>
      <title>Re: How can I determine the sample size for categorical data in SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644238#M30888</link>
      <description>&lt;P&gt;Thanks for answering.&lt;/P&gt;&lt;P&gt;I'm sorry, I should have specified that the 54 million observations are my entire population.&lt;/P&gt;&lt;P&gt;To select 1%, which procedure should I use? SURVEYSELECT?&lt;/P&gt;&lt;P&gt;If I select 1%, how can I look at the distribution of the sampled parameters in SAS? Can you give me an example?&lt;/P&gt;&lt;P&gt;(I'm new to SAS.)&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 13:19:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644238#M30888</guid>
      <dc:creator>al165275</dc:creator>
      <dc:date>2020-04-30T13:19:18Z</dc:date>
    </item>
    <item>
      <title>Re: How can I determine the sample size for categorical data in SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644243#M30889</link>
      <description>&lt;P&gt;Pardon my confusion, but if you have the whole population, why not summarize it directly rather than sampling to estimate population parameters?&amp;nbsp; Think about that a bit. These days, 54M records is not an extremely large dataset.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anyway, if you do want to work with samples:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As far as tools go, the SURVEY procedures are likely the best: SURVEYSELECT to sample the data, SURVEYFREQ for the categorical variables, and SURVEYMEANS for the continuous variables.&amp;nbsp; My recommendation is to work through all of the examples in the documentation, so that you get a feel for what the statements in each PROC enable you to do.&amp;nbsp; Then try to use the code there to address your questions.&amp;nbsp; If you run into trouble, come on back, but please don't just say "It didn't work."&amp;nbsp; Provide the code, some sample data, and the log to show where and what isn't working.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
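&lt;P&gt;As a rough illustration of how the three procedures fit together, here is a sketch under stated assumptions. The names WORK.POPULATION, FLAG1, FLAG2, AMOUNT1, and AMOUNT2 are hypothetical stand-ins for the real table and variables, and the seed is arbitrary. The STATS option asks SURVEYSELECT to write the SamplingWeight variable for this unstratified design, and RATE=0.01 supplies the sampling rate for the finite population correction.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* One 1% simple random sample; STATS adds SelectionProb and SamplingWeight. */
proc surveyselect data=work.population out=work.sample1
                  method=srs samprate=0.01 stats seed=644243;
run;

/* Weighted frequency tables for two hypothetical categorical variables. */
proc surveyfreq data=work.sample1 rate=0.01;
  tables flag1 flag2;
  weight SamplingWeight;
run;

/* Means, standard errors, and confidence limits for two hypothetical numeric variables. */
proc surveymeans data=work.sample1 rate=0.01 mean stderr clm;
  var amount1 amount2;
  weight SamplingWeight;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;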
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 13:42:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644243#M30889</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2020-04-30T13:42:15Z</dc:date>
    </item>
    <item>
      <title>Re: How can I determine the sample size for categorical data in SAS?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644253#M30896</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/317512"&gt;@al165275&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;If I select 1%, how can I look at the distribution of the sampled parameters in SAS? Can you give me an example?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Parameters? I don't see parameters, only data. If you want the distribution of values for your variables, then PROC FREQ is likely a good place to start.&lt;/P&gt;</description>
      <pubDate>Thu, 30 Apr 2020 14:21:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-can-I-determine-the-sample-size-for-categorical-data-in-SAS/m-p/644253#M30896</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-04-30T14:21:58Z</dc:date>
    </item>
  </channel>
</rss>

