<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: selecting IDs randomly from a dataset in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758427#M239469</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/168930"&gt;@Anita_n&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/168930"&gt;@Anita_n&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I wish to select firstly, all IDs which occurs less than 5 times with corresponding variables in another dataset(newly created). After that I wish to select randomly from the list of the ids which occurs more than 5 times 5 ids with corresponding variables.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I guess you mean 5 &lt;EM&gt;observations&lt;/EM&gt;, not 5 IDs, from those IDs with more than 5 observations and also that you want all 5 observations from those IDs with exactly 5 observations (if there were any). In this case &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_toc.htm" target="_blank" rel="noopener"&gt;PROC SURVEYSELECT&lt;/A&gt; with the &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_syntax01.htm#statug.surveyselect.selectselectall" target="_blank" rel="noopener"&gt;SELECTALL&lt;/A&gt; option (and the selection method "simple random sampling," which is the default) meets the requirements:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=test;
by pat_id;
run;

proc surveyselect data=test
method=srs n=5 selectall /* outall */
seed=2718 out=want(drop=SelectionProb SamplingWeight);
strata pat_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Use the &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_syntax01.htm#statug.surveyselect.selectoutall" target="_blank" rel="noopener"&gt;OUTALL&lt;/A&gt; option (commented out above) if you want all observations from TEST in dataset WANT with a 0-1 flag variable (named &lt;FONT face="courier new,courier"&gt;Selected&lt;/FONT&gt;) indicating whether or not an observation belongs to the random sample. Otherwise, WANT contains only the random sample.&lt;/P&gt;</description>
    <pubDate>Fri, 30 Jul 2021 15:40:28 GMT</pubDate>
    <dc:creator>FreelanceReinh</dc:creator>
    <dc:date>2021-07-30T15:40:28Z</dc:date>
    <item>
      <title>selecting IDs randomly from a dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758420#M239466</link>
      <description>&lt;P&gt;Dear all,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have this sample dataset:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
input pat_id 2. sex $2. age 3. var1 $2. var2 $2. var3 $2. var4 $2. year 5.;
datalines;
1  F 25 A B C D 2001
1  F 25 E F G D 2002
2  F 35 C M N D 2010
2  F 35 E F V W 2020
15 M 55 A B C D 2011
15 M 55 E F G D 2010
15 M 55 U B C D 2011
15 M 55 j F K D 2010
15 M 55 X B C D 2009
15 M 55 E Y G D 2008
15 M 55 F Y T D 2008
11 F 60 A B C D 2001
11 F 60 E F G D 2002
11 F 60 U B C D 2015
11 F 60 j F K D 2004
11 F 60 X B C D 2010
11 F 60 E Y G D 2014
11 F 60 F Y T D 2008
11 F 60 S F G D 2003
11 F 60 V B G D 2012
11 F 60 K F K Q 2000
11 F 60 Z B M D 2011
11 F 60 U Y G S 2010
11 F 60 X Y T O 2009
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I wish to select firstly, all IDs which occurs less than 5 times with corresponding variables in another dataset(newly created). After that I wish to select randomly from the list of the ids which occurs more than 5 times 5 ids with corresponding variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A row should not be selected twice. If it has already been selected, it should be flagged so that it won't be selected again.&lt;/P&gt;
&lt;P&gt;Any help?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jul 2021 15:05:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758420#M239466</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-07-30T15:05:55Z</dc:date>
    </item>
    <item>
      <title>Re: selecting IDs randomly from a dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758425#M239468</link>
      <description>&lt;P&gt;So you want to stratify your sample based on the number of times each ID appears?&lt;/P&gt;
&lt;P&gt;You can use PROC FREQ (or PROC SQL) to count the number of observations per ID.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc freq data=test;
  tables pat_id / noprint out=count;
run;&lt;/CODE&gt;&lt;/PRE&gt;
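&lt;P&gt;As a rough sketch of the PROC SQL alternative, the following counts the observations per ID. It assumes the ID variable is &lt;FONT face="courier new,courier"&gt;pat_id&lt;/FONT&gt;, as in the posted TEST dataset; the column name &lt;FONT face="courier new,courier"&gt;n_obs&lt;/FONT&gt; is only an illustrative choice.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: one row per pat_id with its number of observations */
proc sql;
  create table count as
  select pat_id, count(*) as n_obs
  from test
  group by pat_id;
quit;&lt;/CODE&gt;&lt;/PRE&gt;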
&lt;P&gt;Do you just want to sample from the unique ID list?&lt;/P&gt;
&lt;P&gt;Or do you want to sample observations from the original file?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Show an example of the output you want for the given example input.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jul 2021 15:39:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758425#M239468</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-07-30T15:39:06Z</dc:date>
    </item>
    <item>
      <title>Re: selecting IDs randomly from a dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758427#M239469</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/168930"&gt;@Anita_n&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/168930"&gt;@Anita_n&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I wish to select firstly, all IDs which occurs less than 5 times with corresponding variables in another dataset(newly created). After that I wish to select randomly from the list of the ids which occurs more than 5 times 5 ids with corresponding variables.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I guess you mean 5 &lt;EM&gt;observations&lt;/EM&gt;, not 5 IDs, from those IDs with more than 5 observations and also that you want all 5 observations from those IDs with exactly 5 observations (if there were any). In this case &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_toc.htm" target="_blank" rel="noopener"&gt;PROC SURVEYSELECT&lt;/A&gt; with the &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_syntax01.htm#statug.surveyselect.selectselectall" target="_blank" rel="noopener"&gt;SELECTALL&lt;/A&gt; option (and the selection method "simple random sampling," which is the default) meets the requirements:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=test;
by pat_id;
run;

proc surveyselect data=test
method=srs n=5 selectall /* outall */
seed=2718 out=want(drop=SelectionProb SamplingWeight);
strata pat_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
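&lt;P&gt;One way to check the result, sketched under the assumption that the step above was run on the posted TEST data and wrote only the sample to WANT: count the sampled observations per ID. IDs 1 and 2, which have 2 observations each, should keep both of them, while IDs 11 and 15 should each contribute 5.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: number of sampled observations per pat_id in WANT */
proc freq data=want;
  tables pat_id / nocum nopercent;
run;&lt;/CODE&gt;&lt;/PRE&gt;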
&lt;P&gt;Use the &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_syntax01.htm#statug.surveyselect.selectoutall" target="_blank" rel="noopener"&gt;OUTALL&lt;/A&gt; option (commented out above) if you want all observations from TEST in dataset WANT with a 0-1 flag variable (named &lt;FONT face="courier new,courier"&gt;Selected&lt;/FONT&gt;) indicating whether or not an observation belongs to the random sample. Otherwise, WANT contains only the random sample.&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jul 2021 15:40:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758427#M239469</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2021-07-30T15:40:28Z</dc:date>
    </item>
    <item>
      <title>Re: selecting IDs randomly from a dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758429#M239471</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;: Thanks, that is exactly what I wanted&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jul 2021 15:55:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/selecting-IDs-randomly-from-a-dataset/m-p/758429#M239471</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-07-30T15:55:51Z</dc:date>
    </item>
  </channel>
</rss>

