<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: random  number in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6209#M1961</link>
    <description>Try PROC SURVEYSELECT (with method=SRS) in order to select a simple random sample of size N.</description>
    <pubDate>Tue, 08 Jan 2008 14:47:19 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-01-08T14:47:19Z</dc:date>
    <item>
      <title>random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6208#M1960</link>
      <description>Hi &lt;BR /&gt;
I have 10 data sets, each with 60000 obs, and I need to get 100 random obs from each set. Can anyone please tell me how?&lt;BR /&gt;
&lt;BR /&gt;
I had the idea of using RANUNI but don't know how.&lt;BR /&gt;
&lt;BR /&gt;
thx</description>
      <pubDate>Tue, 08 Jan 2008 14:39:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6208#M1960</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-01-08T14:39:39Z</dc:date>
    </item>
    <item>
      <title>Re: random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6209#M1961</link>
      <description>Try PROC SURVEYSELECT (with method=SRS) in order to select a simple random sample of size N.</description>
      <pubDate>Tue, 08 Jan 2008 14:47:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6209#M1961</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-01-08T14:47:19Z</dc:date>
    </item>
    <item>
      <title>Re: random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6210#M1962</link>
      <description>Proc SurveySelect is not part of Base SAS, so you may not have it available to you.&lt;BR /&gt;
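&lt;BR /&gt;
If SAS/STAT is licensed at your site, the SURVEYSELECT suggestion could look roughly like the sketch below. The names &amp;amp;inset, &amp;amp;outset and &amp;amp;size are placeholder macro variables assumed for this sketch (input data set, output data set, sample size).&lt;BR /&gt;
&lt;BR /&gt;
```sas
/* simple random sample without replacement; requires SAS/STAT */&lt;BR /&gt;
proc surveyselect data=&amp;amp;inset out=&amp;amp;outset&lt;BR /&gt;
                  method=srs sampsize=&amp;amp;size;&lt;BR /&gt;
run;&lt;BR /&gt;
```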
&lt;BR /&gt;
Within SAS EG, "Random Sample" is available under Data.&lt;BR /&gt;
&lt;BR /&gt;
If you are coding, here's an idea:&lt;BR /&gt;
&lt;BR /&gt;
%macro select(inset,outset,size);&lt;BR /&gt;
&lt;BR /&gt;
Data &amp;amp;outset;&lt;BR /&gt;
  set &amp;amp;inset nobs=N;&lt;BR /&gt;
  retain criteria count fudge 0;&lt;BR /&gt;
&lt;BR /&gt;
  /* base probability of selecting any one record */&lt;BR /&gt;
  if _n_ = 1 then criteria = &amp;amp;size/N;&lt;BR /&gt;
&lt;BR /&gt;
  if ranuni(-1) &amp;lt; criteria + fudge then do;&lt;BR /&gt;
    if count &amp;lt; &amp;amp;size then do;&lt;BR /&gt;
      output;&lt;BR /&gt;
      count+1;&lt;BR /&gt;
      /* each selection raises the probability for later records */&lt;BR /&gt;
      fudge+criteria;&lt;BR /&gt;
    end;&lt;BR /&gt;
  end;&lt;BR /&gt;
&lt;BR /&gt;
  drop criteria count fudge;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
%mend;&lt;BR /&gt;
&lt;BR /&gt;
By increasing fudge, the probability of selecting a record increases, so there is a greater chance of selecting any particular record.&lt;BR /&gt;
The downside to this method is that the actual probability distribution is not uniform.  If fudge were not used, and "uniformity" maintained, then a single pass through the dataset might not return all "size = 100" observations.&lt;BR /&gt;
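&lt;BR /&gt;
For comparison, here is a minimal sketch of a single-pass approach that keeps the selection uniform and still returns exactly &amp;amp;size records, assuming &amp;amp;size is no larger than the number of observations. Each record is kept with probability (records still needed) / (records still remaining). The variable names needed, remaining and total are made up for this sketch and must not already exist in &amp;amp;inset.&lt;BR /&gt;
&lt;BR /&gt;
```sas
data &amp;amp;outset;&lt;BR /&gt;
  set &amp;amp;inset nobs=total;&lt;BR /&gt;
  retain needed &amp;amp;size;             /* records still to be selected */&lt;BR /&gt;
  remaining = total - _n_ + 1;   /* records left, including this one */&lt;BR /&gt;
&lt;BR /&gt;
  /* keep this record with probability needed/remaining */&lt;BR /&gt;
  if ranuni(-1) * remaining lt needed then do;&lt;BR /&gt;
    output;&lt;BR /&gt;
    needed = needed - 1;&lt;BR /&gt;
  end;&lt;BR /&gt;
&lt;BR /&gt;
  if needed = 0 then stop;       /* sample is complete */&lt;BR /&gt;
  drop needed remaining;&lt;BR /&gt;
run;&lt;BR /&gt;
```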
&lt;BR /&gt;
An alternative would be to use the POINT= set option:&lt;BR /&gt;
&lt;BR /&gt;
data &amp;amp;outset;&lt;BR /&gt;
  retain count 0;&lt;BR /&gt;
  I = ranuni(-1) * N;&lt;BR /&gt;
  set &amp;amp;inset NOBS=N POINT=I;&lt;BR /&gt;
  output;  /* write the record explicitly; STOP skips the implicit output */&lt;BR /&gt;
  count+1;&lt;BR /&gt;
  if count = &amp;amp;size then stop;&lt;BR /&gt;
  drop count;&lt;BR /&gt;
run;&lt;BR /&gt;
    &lt;BR /&gt;
This is probably a better method, and it can also be wrapped in a macro like the one above.</description>
      <pubDate>Tue, 08 Jan 2008 15:40:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6210#M1962</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-01-08T15:40:13Z</dc:date>
    </item>
    <item>
      <title>Re: random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6211#M1963</link>
      <description>Another idea that may work better:&lt;BR /&gt;
&lt;BR /&gt;
%macro select(inset,outset,size);&lt;BR /&gt;
&lt;BR /&gt;
data &amp;amp;outset;&lt;BR /&gt;
&lt;BR /&gt;
  retain count 0;&lt;BR /&gt;
  drop count;&lt;BR /&gt;
&lt;BR /&gt;
  I = ranuni(-1) * N;&lt;BR /&gt;
  set &amp;amp;inset NOBS=N POINT=I;&lt;BR /&gt;
  output;  /* explicit output, because STOP skips the implicit one */&lt;BR /&gt;
&lt;BR /&gt;
  count+1;&lt;BR /&gt;
  if count = &amp;amp;size then stop;&lt;BR /&gt;
&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
%mend;</description>
      <pubDate>Tue, 08 Jan 2008 15:43:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6211#M1963</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-01-08T15:43:20Z</dc:date>
    </item>
    <item>
      <title>Re: random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6212#M1964</link>
      <description>Chuck, your approach doesn't guarantee that a row won't be selected multiple times, does it?</description>
      <pubDate>Tue, 08 Jan 2008 17:32:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6212#M1964</guid>
      <dc:creator>advoss</dc:creator>
      <dc:date>2008-01-08T17:32:16Z</dc:date>
    </item>
    <item>
      <title>Re: random  number</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6213#M1965</link>
      <description>Yes, you are correct for the POINT= method.  There would need to be some way to check that that observation hadn't been used already.&lt;BR /&gt;
&lt;BR /&gt;
Also, the calculation for I isn't quite right either since it doesn't guarantee an integer value.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
 I = ceil(ranuni(-1) * N) is an easy solution to the integer problem. Unlike round(), it never returns 0, which is not a valid observation number for POINT=.&lt;BR /&gt;
&lt;BR /&gt;
Solving the other problem takes a bit more work.&lt;BR /&gt;
One way would be to use an array to keep a list of consumed records, and then use a linear search through the array to determine if the observation has been read before or not.&lt;BR /&gt;
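&lt;BR /&gt;
A minimal sketch of that idea, under some assumptions: instead of a list with a linear search, it uses a temporary array of flags indexed by observation number, sized for the 60000 observations mentioned in the original question. &amp;amp;size must not exceed the number of observations, or the loop never ends. The names used, i and count are chosen for this sketch.&lt;BR /&gt;
&lt;BR /&gt;
```sas
data &amp;amp;outset;&lt;BR /&gt;
  /* one flag per observation; assumes at most 60000 observations */&lt;BR /&gt;
  array used{60000} _temporary_;&lt;BR /&gt;
&lt;BR /&gt;
  do until (count ge &amp;amp;size);&lt;BR /&gt;
    i = ceil(ranuni(-1) * n);      /* integer from 1 to n */&lt;BR /&gt;
    if used{i} ne 1 then do;       /* skip observations already taken */&lt;BR /&gt;
      used{i} = 1;&lt;BR /&gt;
      set &amp;amp;inset nobs=n point=i;&lt;BR /&gt;
      output;&lt;BR /&gt;
      count + 1;&lt;BR /&gt;
    end;&lt;BR /&gt;
  end;&lt;BR /&gt;
&lt;BR /&gt;
  stop;                            /* POINT= never hits end of file */&lt;BR /&gt;
  drop count;&lt;BR /&gt;
run;&lt;BR /&gt;
```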
&lt;BR /&gt;
Another way to get a random subset of observations would require multiple passes through the dataset.&lt;BR /&gt;
&lt;BR /&gt;
data dummy;&lt;BR /&gt;
  set &amp;amp;inset;&lt;BR /&gt;
  selection_key = ranuni(-1);&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=dummy;&lt;BR /&gt;
  by selection_key;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data &amp;amp;outset;&lt;BR /&gt;
  set dummy (obs=&amp;amp;size);&lt;BR /&gt;
  drop selection_key;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
But this is still not perfectly generic, and none of these ideas are, because each introduces at least one variable that may already be defined within the &amp;amp;inset dataset.  So, no matter what is done, care must be taken, and some creativity is required on the part of the programmer.</description>
      <pubDate>Tue, 08 Jan 2008 21:42:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/random-number/m-p/6213#M1965</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-01-08T21:42:03Z</dc:date>
    </item>
  </channel>
</rss>

