<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using RANUNI in PROC SQL query for random sample in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207392#M51520</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks ballardw for your reply.&amp;nbsp; I realize my question was somewhat broad and I appreciate your guidance. (actually, I really asked two questions)&amp;nbsp; I'll look into proc surveyselect.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Stuart&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 01 May 2015 15:55:12 GMT</pubDate>
    <dc:creator>stuart753</dc:creator>
    <dc:date>2015-05-01T15:55:12Z</dc:date>
    <item>
      <title>Using RANUNI in PROC SQL query for random sample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207390#M51518</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have a dataset with client information and I want to draw a sample for each client and load that sample to a separate dataset.&amp;nbsp; The sample size varies by client.&amp;nbsp; Below is the code I'm using to do this. (I'm still new to SAS and I got help on the macro portion of this code.&amp;nbsp; It was derived from: &lt;A href="http://www2.sas.com/proceedings/sugi26/p093-26.pdf" title="http://www2.sas.com/proceedings/sugi26/p093-26.pdf"&gt;http://www2.sas.com/proceedings/sugi26/p093-26.pdf&lt;/A&gt; )&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My question relates to the WHERE filter in the query that INSERTS INTO the sample dataset.&amp;nbsp; The WHERE clause is: 'WHERE zClientName = "&amp;amp;VAR1" AND ranuni(101) &amp;lt;= &amp;amp;VAR2/&amp;amp;VAR3;'.&amp;nbsp; This method of capturing a sample usually does not produce a sample size that exactly equals the desired sample size.&amp;nbsp; (VAR2 in this case)&amp;nbsp; It gives a sample size that's close, but not exactly equal to VAR2.&amp;nbsp; I have a few questions related to this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Can someone provide guidance on the best approach to create a sample that equals the target sample size (in this case VAR2)?&amp;nbsp; (As a side note, I'm not a statistician so I really don't know the impact of generating a sample of 189 observations when 200 was requested.&amp;nbsp; Maybe it's not a significant issue??)&lt;/LI&gt;&lt;LI&gt;I noticed that if I rerun the code below (which should create a new dataset) - my sample size is not changing?&amp;nbsp; I would have thought that if I applied a truly random filter to the total population, that the sample size would change from one run to the next.&amp;nbsp; Does anyone know what would cause this?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;BR /&gt;For question #1 above, I attempted the following SQL Statement in the macro loop below after adding a SAS_Rand field 
to wpss.ttWPSS:&lt;/P&gt;&lt;P&gt;proc sql INOBS = &amp;amp;VAR2;&lt;/P&gt;&lt;P&gt;INSERT INTO wpss.ttWPSS&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; SELECT *, ranuni(101) AS SAS_Rand from UCMdb.ttWPSS&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE zClientName = "&amp;amp;VAR1" &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; ORDER BY SAS_Rand;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;The problem with this is SAS throws a syntax error (22-322) with the ORDER BY clause.&amp;nbsp; I'm thinking possibly ANSI standard doesn't allow an ordered set to be inserted into a table?&amp;nbsp; - but I'm not sure.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I appreciate any insight on this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Stuart&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/*Get unique list of companies and observations (observation = claim)*/&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt; create table work.WPSS_ClientList AS&lt;BR /&gt; SELECT zClientName, ROUND(CASE WHEN Count(*) * .02 &amp;lt; 200 THEN 200 ELSE Count(*) * .02 END, 1) AS SampleSize,&lt;BR /&gt;&amp;nbsp;&amp;nbsp; Count(*) AS TotalPopulation &lt;BR /&gt;&amp;nbsp; FROM UCMdb.ttWPSS&lt;BR /&gt; GROUP BY zClientName;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;/*Delete the table with samples to ensure table is cleared prior to running*/&lt;BR /&gt;proc datasets library=wpss;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; DELETE ttwpss;&lt;BR /&gt;quit;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/*Create the table as a new table*/&lt;BR /&gt;proc sql;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; CREATE TABLE wpss.ttWPSS&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; LIKE UCMdb.ttWPSS;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;/* Macro to loop through all client observations */&lt;BR /&gt;%MACRO ObsIterate(DriverDataset,FIELD1,FIELD2,FIELD3);&lt;/P&gt;&lt;P&gt;/* First obtain the number of records in Driver Dataset */&lt;/P&gt;&lt;P&gt;DATA _NULL_;&lt;BR /&gt;IF 0 THEN SET &amp;amp;DriverDataset NOBS=X;&lt;BR /&gt;CALL 
SYMPUT("RECCOUNT",X);&lt;BR /&gt;STOP;&lt;BR /&gt;RUN;&lt;BR /&gt;/* loop from one to number of records */&lt;BR /&gt;%DO I=1 %TO &amp;amp;RECCOUNT;&lt;BR /&gt;/* Advance to the Ith record */&lt;BR /&gt;DATA _NULL_;&lt;BR /&gt;SET &amp;amp;DriverDataset (FIRSTOBS=&amp;amp;I);&lt;BR /&gt;/* store the variables of interest in macro variables */&lt;BR /&gt;CALL SYMPUT("VAR1",&amp;amp;FIELD1);&lt;BR /&gt;CALL SYMPUT("VAR2",&amp;amp;FIELD2);&lt;BR /&gt;CALL SYMPUT("VAR3",&amp;amp;FIELD3);&lt;BR /&gt;STOP;&lt;BR /&gt;RUN;&lt;BR /&gt;/* perform tasks on each observation */&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;INSERT INTO wpss.ttWPSS&lt;BR /&gt;&amp;nbsp;&amp;nbsp; SELECT * from UCMdb.ttWPSS&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE zClientName = "&amp;amp;VAR1" AND ranuni(101) &amp;lt;= &amp;amp;VAR2/&amp;amp;VAR3;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;%END;&lt;/P&gt;&lt;P&gt;%MEND ObsIterate;&lt;/P&gt;&lt;P&gt;/* Call ObsIterate */&lt;BR /&gt;%ObsIterate(WPSS_ClientList,zClientName,SampleSize,TotalPopulation);&lt;BR /&gt;RUN;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 30 Apr 2015 15:55:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207390#M51518</guid>
      <dc:creator>stuart753</dc:creator>
      <dc:date>2015-04-30T15:55:31Z</dc:date>
    </item>
    <item>
      <title>Re: Using RANUNI in PROC SQL query for random sample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207391#M51519</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote"&gt;
&lt;P&gt;Can someone provide guidance on the best approach to create a sample that equals the target sample size (in this case VAR2)?&amp;nbsp; (As a side note, I'm not a statistician so I really don't know the impact of generating a sample of 189 observations when 200 was requested.&amp;nbsp; Maybe it's not a significant issue??)&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;A smaller sample will usually result in wider-than-designed confidence limits and/or less power in a test. How much, and what the practical impact is, depends on far more information than is given here.&lt;/P&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote"&gt;
&lt;P&gt;I noticed that if I rerun the code below (which should create a new dataset) - my sample size is not changing?&amp;nbsp; I would have thought that if I applied a truly random filter to the total population&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;When you call ranuni with the same fixed seed, you get the same sequence of random numbers on every run, so the same rows pass the filter and the sample size does not change.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I needed a fixed-size random sample I would look into PROC SURVEYSELECT. Your list of VAR2 values could well serve as the list of SAMPSIZE values, with clientname as the strata variable.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 30 Apr 2015 16:57:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207391#M51519</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2015-04-30T16:57:36Z</dc:date>
    </item>
    <item>
      <title>Re: Using RANUNI in PROC SQL query for random sample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207392#M51520</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks ballardw for your reply.&amp;nbsp; I realize my question was somewhat broad and I appreciate your guidance. (actually, I really asked two questions)&amp;nbsp; I'll look into proc surveyselect.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Stuart&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 01 May 2015 15:55:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207392#M51520</guid>
      <dc:creator>stuart753</dc:creator>
      <dc:date>2015-05-01T15:55:12Z</dc:date>
    </item>
    <item>
      <title>Re: Using RANUNI in PROC SQL query for random sample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207393#M51521</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You can use proc sql to do the sampling by making the following changes:&lt;/P&gt;&lt;P&gt;- use CREATE TABLE [temporary work dataset] AS and do the INSERT INTO as a separate statement&lt;/P&gt;&lt;P&gt;- change INOBS to OUTOBS to limit the number of observations written to the temporary work dataset to the exact number you want for your sample&lt;/P&gt;&lt;P&gt;- use ranuni(0) so the seed is taken from the clock time and each run draws a different pseudo-random sequence&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;OUTOBS&lt;/STRONG&gt;&lt;/SPAN&gt; = &amp;amp;VAR2;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;CREATE TABLE temp AS&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;SELECT *, ranuni(&lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;0&lt;/STRONG&gt;&lt;/SPAN&gt;) AS SAS_Rand from UCMdb.ttWPSS&lt;/P&gt;&lt;P&gt;WHERE zClientName = "&amp;amp;VAR1" &lt;/P&gt;&lt;P&gt;ORDER BY SAS_Rand;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;INSERT INTO wpss.ttWPSS&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;SELECT * FROM temp&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This link may be helpful as well: &lt;A href="http://www2.sas.com/proceedings/sugi31/168-31.pdf" title="http://www2.sas.com/proceedings/sugi31/168-31.pdf"&gt;http://www2.sas.com/proceedings/sugi31/168-31.pdf&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 01 May 2015 17:08:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Using-RANUNI-in-PROC-SQL-query-for-random-sample/m-p/207393#M51521</guid>
      <dc:creator>AlexCurrie</dc:creator>
      <dc:date>2015-05-01T17:08:12Z</dc:date>
    </item>
  </channel>
</rss>

