<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with stratified random samples in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50401#M10520</link>
    <description>Hello Songqi,&lt;BR /&gt;
&lt;BR /&gt;
I'm not too sure whether I understood what had to be done. Nevertheless, this was great fun 😉&lt;BR /&gt;
&lt;BR /&gt;
Regards,&lt;BR /&gt;
&lt;BR /&gt;
Yoba&lt;BR /&gt;
&lt;BR /&gt;
* Second question;&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T02_groups as&lt;BR /&gt;
select group, count(*) as count&lt;BR /&gt;
from T01_data&lt;BR /&gt;
group by 1;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
proc surveyselect data=T02_groups&lt;BR /&gt;
out=T03_sample(rename=(count=_NSIZE_) drop=numberHits)&lt;BR /&gt;
sampsize=3 method=urs outhits;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data T04_newGroup;&lt;BR /&gt;
set T03_sample;&lt;BR /&gt;
newGroup+1;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T05_data as&lt;BR /&gt;
select A.*, B.newGroup&lt;BR /&gt;
from T01_data A, T04_newGroup B&lt;BR /&gt;
where A.group=B.group&lt;BR /&gt;
order by group, newGroup;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
proc surveyselect data=T05_data&lt;BR /&gt;
                  sampsize=T04_newGroup method=urs outhits&lt;BR /&gt;
                  out=T06_sample_sample;&lt;BR /&gt;
strata group newGroup;&lt;BR /&gt;
run;</description>
    <pubDate>Tue, 30 Jun 2009 21:07:31 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2009-06-30T21:07:31Z</dc:date>
    <item>
      <title>Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50398#M10517</link>
      <description>Hello,&lt;BR /&gt;
&lt;BR /&gt;
Here is my question. My data have a stratified structure, and I want to draw stratified random samples from them.&lt;BR /&gt;
&lt;BR /&gt;
Suppose my original data look like this: I have three groups of people. Groups 1 and 3 each have two people, and group 2 has three. For each person I have three variables: X, Y, and Z.&lt;BR /&gt;
&lt;BR /&gt;
Group      ID        X      Y      Z&lt;BR /&gt;
1             11       3       8      9&lt;BR /&gt;
1             12       4      10     16&lt;BR /&gt;
2             21       1       5      6&lt;BR /&gt;
2             22       2       7      7&lt;BR /&gt;
2             23       5       6      12&lt;BR /&gt;
3             31       8       6       7&lt;BR /&gt;
3             32       9       4       3&lt;BR /&gt;
&lt;BR /&gt;
I want to select a random sample of groups with replacement, say (1, 1, 2) or (1, 3, 3), and then keep all observations within each selected group. For (1, 1, 2), the sample I create should look like this:&lt;BR /&gt;
&lt;BR /&gt;
Group      ID        X      Y      Z        NewGroup&lt;BR /&gt;
1             11       3       8      9           1&lt;BR /&gt;
1             12       4      10     16          1&lt;BR /&gt;
1             11       3       8      9           2&lt;BR /&gt;
1             12       4      10     16          2&lt;BR /&gt;
2             21       1       5      6           3&lt;BR /&gt;
2             22       2       7      7           3&lt;BR /&gt;
2             23       5       6      12         3&lt;BR /&gt;
&lt;BR /&gt;
Note that I need a variable (i.e., NewGroup) indicating that the first two lines and the next two lines belong to different units.&lt;BR /&gt;
&lt;BR /&gt;
Or (1, 3, 3):&lt;BR /&gt;
Group      ID        X      Y      Z     NewGroup&lt;BR /&gt;
1             11       3       8      9           1&lt;BR /&gt;
1             12       4      10     16          1&lt;BR /&gt;
3             31       8       6       7          2&lt;BR /&gt;
3             32       9       4       3          2&lt;BR /&gt;
3             31       8       6       7          3&lt;BR /&gt;
3             32       9       4       3          3&lt;BR /&gt;
&lt;BR /&gt;
I am wondering how to get this type of random sample.&lt;BR /&gt;
&lt;BR /&gt;
My second question is similar but slightly more complicated. In the first step I draw a random sample of groups with replacement, say (1, 1, 2). In the second step, I randomly select people with replacement within every group I drew, say ((11, 12), (12, 12), (21, 22, 22)) or ((11, 11), (11, 12), (21, 21, 23)). For ((11, 12), (12, 12), (21, 22, 22)), my output data should look like this:&lt;BR /&gt;
&lt;BR /&gt;
Group      ID        X      Y      Z     NewGroup&lt;BR /&gt;
1             11       3       8      9          1&lt;BR /&gt;
1             12       4      10     16         1&lt;BR /&gt;
1             12       4      10     16         2&lt;BR /&gt;
1             12       4      10     16         2&lt;BR /&gt;
2             21       1       5      6          3&lt;BR /&gt;
2             22       2       7      7          3&lt;BR /&gt;
2             22       2       7      7          3&lt;BR /&gt;
&lt;BR /&gt;
Or ((11, 11), (11, 12), (21, 21, 23)):&lt;BR /&gt;
&lt;BR /&gt;
Group      ID        X      Y      Z     NewGroup&lt;BR /&gt;
1             11       3       8      9          1&lt;BR /&gt;
1             11       3       8      9          1&lt;BR /&gt;
1             11       3       8      9          2&lt;BR /&gt;
1             12       4      10     16         2&lt;BR /&gt;
2             21       1       5      6          3&lt;BR /&gt;
2             21       1       5      6          3&lt;BR /&gt;
2             23       5       6      12         3&lt;BR /&gt;
&lt;BR /&gt;
How can I achieve that?&lt;BR /&gt;
&lt;BR /&gt;
Thank you very much for your help!</description>
      <pubDate>Tue, 30 Jun 2009 19:39:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50398#M10517</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-06-30T19:39:36Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50399#M10518</link>
      <description>Hello Sliu,&lt;BR /&gt;
&lt;BR /&gt;
Could you take a look at this piece of code?&lt;BR /&gt;
&lt;BR /&gt;
data T01_data;&lt;BR /&gt;
infile cards;&lt;BR /&gt;
input group id x y z;&lt;BR /&gt;
cards;&lt;BR /&gt;
1 11 3 8 9&lt;BR /&gt;
1 12 4 10 16&lt;BR /&gt;
2 21 1 5 6&lt;BR /&gt;
2 22 2 7 7&lt;BR /&gt;
2 23 5 6 12&lt;BR /&gt;
3 31 8 6 7&lt;BR /&gt;
3 32 9 4 3&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
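* First question;&lt;BR /&gt;
&lt;BR /&gt;
/* List each group once. */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T02_groups as&lt;BR /&gt;
select distinct group&lt;BR /&gt;
from T01_data;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
/* Draw 3 groups with replacement (METHOD=URS); OUTHITS keeps one row per selection. */&lt;BR /&gt;
proc surveyselect data=T02_groups&lt;BR /&gt;
                  out=T03_sample(drop=numberHits)&lt;BR /&gt;
                  sampsize=3 method=urs outhits;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Number the selections 1, 2, 3 so that repeated draws of the same group stay distinct. */&lt;BR /&gt;
data T04_newGroup;&lt;BR /&gt;
set T03_sample;&lt;BR /&gt;
newGroup+1;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Attach every observation of each selected group, once per selection. */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T05_final as&lt;BR /&gt;
select A.*, B.newGroup&lt;BR /&gt;
from T01_data A, T04_newGroup B&lt;BR /&gt;
where A.group=B.group;&lt;BR /&gt;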
quit;&lt;BR /&gt;
&lt;BR /&gt;
For the second question: how do you end up with 2 observations, 2 observations and then &lt;B&gt;3&lt;/B&gt; observations in ((11, 12), (12, 12), (21, 22, 22))? I don't get it.&lt;BR /&gt;
&lt;BR /&gt;
Regards,&lt;BR /&gt;
&lt;BR /&gt;
Yoba</description>
      <pubDate>Tue, 30 Jun 2009 20:16:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50399#M10518</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-06-30T20:16:01Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50400#M10519</link>
      <description>Hi Yoba,&lt;BR /&gt;
&lt;BR /&gt;
Thanks for your code! I will take a look. &lt;BR /&gt;
&lt;BR /&gt;
To answer your question: in my original data, group 2 has three observations. So when I select (1, 1, 2), I want two obs in newgroup 1 (originally from group 1), two obs in newgroup 2 (originally from group 1), and three obs in newgroup 3 (originally from group 2). Does this help? Sorry, I did not make that clear.&lt;BR /&gt;
&lt;BR /&gt;
Songqi</description>
      <pubDate>Tue, 30 Jun 2009 20:25:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50400#M10519</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-06-30T20:25:28Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50401#M10520</link>
      <description>Hello Songqi,&lt;BR /&gt;
&lt;BR /&gt;
I'm not too sure whether I understood what had to be done. Nevertheless, this was great fun 😉&lt;BR /&gt;
&lt;BR /&gt;
Regards,&lt;BR /&gt;
&lt;BR /&gt;
Yoba&lt;BR /&gt;
&lt;BR /&gt;
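* Second question;&lt;BR /&gt;
&lt;BR /&gt;
/* Count the observations in each group. */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T02_groups as&lt;BR /&gt;
select group, count(*) as count&lt;BR /&gt;
from T01_data&lt;BR /&gt;
group by 1;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
/* Draw 3 groups with replacement; count is renamed to _NSIZE_ so that the result can later serve as a SAMPSIZE= data set. */&lt;BR /&gt;
proc surveyselect data=T02_groups&lt;BR /&gt;
out=T03_sample(rename=(count=_NSIZE_) drop=numberHits)&lt;BR /&gt;
sampsize=3 method=urs outhits;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Number the selections so that repeated draws of the same group stay distinct. */&lt;BR /&gt;
data T04_newGroup;&lt;BR /&gt;
set T03_sample;&lt;BR /&gt;
newGroup+1;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Replicate the observations of each selected group and sort by the strata variables. */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
create table T05_data as&lt;BR /&gt;
select A.*, B.newGroup&lt;BR /&gt;
from T01_data A, T04_newGroup B&lt;BR /&gt;
where A.group=B.group&lt;BR /&gt;
order by group, newGroup;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
/* Within each (group, newGroup) stratum, resample with replacement; the stratum sample sizes (_NSIZE_) are read from T04_newGroup. */&lt;BR /&gt;
proc surveyselect data=T05_data&lt;BR /&gt;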
                  sampsize=T04_newGroup method=urs outhits&lt;BR /&gt;
                  out=T06_sample_sample;&lt;BR /&gt;
strata group newGroup;&lt;BR /&gt;
run;</description>
      <pubDate>Tue, 30 Jun 2009 21:07:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50401#M10520</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-06-30T21:07:31Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50402#M10521</link>
      <description>Hello yoba,&lt;BR /&gt;
&lt;BR /&gt;
This works! Thanks a lot!&lt;BR /&gt;
&lt;BR /&gt;
The only change I want to make is to replace&lt;BR /&gt;
sampsize=3 with sampsize="number of groups in the original dataset"&lt;BR /&gt;
so that the code stays flexible when I have more groups.&lt;BR /&gt;
&lt;BR /&gt;
For example, when my data have five groups of people:&lt;BR /&gt;
1 11 3 8 9&lt;BR /&gt;
1 12 4 10 16&lt;BR /&gt;
2 21 1 5 6&lt;BR /&gt;
2 22 2 7 7&lt;BR /&gt;
2 23 5 6 12&lt;BR /&gt;
3 31 8 6 7&lt;BR /&gt;
3 32 9 4 3&lt;BR /&gt;
4 41 2 8 11&lt;BR /&gt;
4 42 3 4 5&lt;BR /&gt;
4 43 5 3 4&lt;BR /&gt;
4 44 6 7 10&lt;BR /&gt;
5 51 4 3 2&lt;BR /&gt;
5 52 3 4 9&lt;BR /&gt;
I should not have to change the statement to "sampsize=5" by hand.&lt;BR /&gt;
Can you do that? Thank you!</description>
      <pubDate>Tue, 30 Jun 2009 23:30:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50402#M10521</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-06-30T23:30:21Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50403#M10522</link>
      <description>Hello Sliu,&lt;BR /&gt;
&lt;BR /&gt;
It is easy with the SQL Procedure. You can put the count in a macro variable like this:&lt;BR /&gt;
&lt;BR /&gt;
proc sql noprint;&lt;BR /&gt;
select count(distinct group)&lt;BR /&gt;
into :count_of_groups&lt;BR /&gt;
from your_original_table;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
proc surveyselect .... sampsize=&amp;amp;count_of_Groups ...;&lt;BR /&gt;
run;&lt;BR /&gt;
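&lt;BR /&gt;
Put together for the first question, this could look like the sketch below. It is only a sketch: it assumes the tables T01_data and T02_groups built in my earlier post, and it reuses the macro variable name count_of_groups from above.&lt;BR /&gt;
&lt;BR /&gt;
```sas
/* Store the number of distinct groups in a macro variable. */&lt;BR /&gt;
proc sql noprint;&lt;BR /&gt;
select count(distinct group)&lt;BR /&gt;
into :count_of_groups&lt;BR /&gt;
from T01_data;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
/* Write the value to the log as a quick check. */&lt;BR /&gt;
%put Number of groups: &amp;amp;count_of_groups;&lt;BR /&gt;
&lt;BR /&gt;
/* Draw as many groups (with replacement) as there are groups in the data. */&lt;BR /&gt;
proc surveyselect data=T02_groups&lt;BR /&gt;
                  out=T03_sample(drop=numberHits)&lt;BR /&gt;
                  sampsize=&amp;amp;count_of_groups method=urs outhits;&lt;BR /&gt;
run;&lt;BR /&gt;
```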
&lt;BR /&gt;
Best regards,&lt;BR /&gt;
&lt;BR /&gt;
Yoba</description>
      <pubDate>Wed, 01 Jul 2009 15:43:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50403#M10522</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-07-01T15:43:58Z</dc:date>
    </item>
    <item>
      <title>Re: Help with stratified random samples</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50404#M10523</link>
      <description>Thank you, yoba! I really appreciate it!&lt;BR /&gt;
I just started learning SAS syntax, and it is fun!</description>
      <pubDate>Wed, 01 Jul 2009 16:29:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-stratified-random-samples/m-p/50404#M10523</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-07-01T16:29:13Z</dc:date>
    </item>
  </channel>
</rss>

