<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Select 50%, 25% of the rows randomly and mark them in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565034#M158552</link>
    <description>&lt;P&gt;Wouldn't this just be an array?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have;

call streaminit(300) ; *to allow for replication;
array pct(100) pct1-pct100;

do i=1 to 100;
pct(i) = rand('bernoulli', i/100);
end;

run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Dear SAS experts:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'd like to create 100 indicator variables that mark 0 through 100% of the rows at random. For example, MARK50 in the mock data would mark a randomly selected 50% of the rows, while MARK25 would mark a randomly selected 25%. The desired dataset has 100 indicator (dummy) variables marking 0-100% of the rows at random.&lt;/P&gt;
&lt;P&gt;This is my attempt to manually conduct the simulation analysis from my previous post. If I could just create marks for 0, 15, 50, 75 and 100, I would have a skeleton of the final plot I want. I need this preliminary information in the meantime.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982" target="_blank" rel="noopener"&gt;https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much for your precious time!&lt;/P&gt;
&lt;P&gt;PS: I'm familiar with proc surveyselect. But I want to create indicator variables within the dataset rather than creating subsets of random samples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
INPUT ID MARK50 MARK25;
CARDS;
1  1  1
2  0  0
3  1  0
4  0  1
5  1  0
6  0  0
7  1  0
8  0  1
9  1  0
10 0  0
11 1  0
12 0  1
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 10 Jun 2019 20:10:11 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2019-06-10T20:10:11Z</dc:date>
    <item>
      <title>Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565018#M158544</link>
      <description>&lt;P&gt;Dear SAS experts:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'd like to create 100 indicator variables that mark 0 through 100% of the rows at random. For example, MARK50 in the mock data would mark a randomly selected 50% of the rows, while MARK25 would mark a randomly selected 25%. The desired dataset has 100 indicator (dummy) variables marking 0-100% of the rows at random.&lt;/P&gt;
&lt;P&gt;This is my attempt to manually conduct the simulation analysis from my previous post. If I could just create marks for 0, 15, 50, 75 and 100, I would have a skeleton of the final plot I want. I need this preliminary information in the meantime.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982" target="_blank" rel="noopener"&gt;https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much for your precious time!&lt;/P&gt;
&lt;P&gt;PS: I'm familiar with proc surveyselect. But I want to create indicator variables within the dataset rather than creating subsets of random samples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
INPUT ID MARK50 MARK25;
CARDS;
1  1  1
2  0  0
3  1  0
4  0  1
5  1  0
6  0  0
7  1  0
8  0  1
9  1  0
10 0  0
11 1  0
12 0  1
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 19:39:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565018#M158544</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-10T19:39:05Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565019#M158545</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
    INPUT ID;
    rand=rand('uniform');
    array mark mark1-mark99;
    do i=1 to dim(mark);
        if rand&amp;lt;(i/100) then mark(i)=1;
	else mark(i)=0;
    end;
CARDS;
1  1  1
2  0  0
3  1  0
4  0  1
5  1  0
6  0  0
7  1  0
8  0  1
9  1  0
10 0  0
11 1  0
12 0  1
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 10 Jun 2019 19:42:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565019#M158545</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-10T19:42:08Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565029#M158549</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks a lot. I tried your approach on a dataset with rows 1 through 100. How can I tie the mark1 total to 1, and so on, up to the mark100 total of 100? Is that possible?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
    INPUT ID;
    rand=rand('uniform');
    array mark mark1-mark100;
    do i=1 to dim(mark);
        if rand&amp;lt;(i/100) then mark(i)=1;
	else mark(i)=0;
    end;
CARDS;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
;
RUN;

proc summary data=have;
var mark1-mark100;
output out=totals sum=;
run;
proc transpose data=totals;
run;
proc print; run; &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:03:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565029#M158549</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-10T20:03:45Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565030#M158550</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;How can I tie the mark1 total to 1, and so on, up to the mark100 total of 100? Is that possible?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I don't know what this means.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:04:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565030#M158550</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-10T20:04:41Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565032#M158551</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt; correct me if I'm wrong. I thought that if I have 100 observations and randomly select from them 100 times, then the selection aiming to cover 50% of the 100-row dataset would end up selecting 50 rows. Right? Therefore, mark50 would match the 50 rows selected in total. That is what I mean by matching markN to N, that is, to N% of the 100 rows selected at random. Does that make sense?&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:09:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565032#M158551</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-10T20:09:56Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565034#M158552</link>
      <description>&lt;P&gt;Wouldn't this just be an array?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have;

call streaminit(300) ; *to allow for replication;
array pct(100) pct1-pct100;

do i=1 to 100;
pct(i) = rand('bernoulli', i/100);
end;

run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Dear SAS experts:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'd like to create 100 indicator variables that mark 0 through 100% of the rows at random. For example, MARK50 in the mock data would mark a randomly selected 50% of the rows, while MARK25 would mark a randomly selected 25%. The desired dataset has 100 indicator (dummy) variables marking 0-100% of the rows at random.&lt;/P&gt;
&lt;P&gt;This is my attempt to manually conduct the simulation analysis from my previous post. If I could just create marks for 0, 15, 50, 75 and 100, I would have a skeleton of the final plot I want. I need this preliminary information in the meantime.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982" target="_blank" rel="noopener"&gt;https://communities.sas.com/t5/SAS-Programming/Simulate-parameter-estimates-of-the-model-for-missing-scenarios/m-p/564982&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much for your precious time!&lt;/P&gt;
&lt;P&gt;PS: I'm familiar with proc surveyselect. But I want to create indicator variables within the dataset rather than creating subsets of random samples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
INPUT ID MARK50 MARK25;
CARDS;
1  1  1
2  0  0
3  1  0
4  0  1
5  1  0
6  0  0
7  1  0
8  0  1
9  1  0
10 0  0
11 1  0
12 0  1
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:10:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565034#M158552</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-10T20:10:11Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565037#M158553</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
    INPUT ID;
    rand=rand('uniform');
CARDS;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
;
RUN;
proc rank data=have out=ranked;
    var rand;
	ranks ranks;
run;
data want;
	set ranked;
	array mark mark1-mark99;
	do i=1 to dim(mark);
	    if ranks&amp;lt;=i then mark(i)=1; 
		else mark(i)=0;
	end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:14:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565037#M158553</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-10T20:14:45Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565041#M158555</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Wouldn't this just be an array?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have;

call streaminit(300) ; *to allow for replication;
array pct(100) pct1-pct100;

do i=1 to 100;
pct(i) = rand('bernoulli', i/100);
end;

run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;It's still not 100% clear to me, but it seems that the question from&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp; requires only one value of 1 and 99 values of zero in column 1; and exactly 2 values of 1 and 98 values of zero in column 2; and so on. So, this use of the Bernoulli random variable does not produce that result.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:20:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565041#M158555</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-10T20:20:58Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565044#M158556</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp; given this last response:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV id="messagebodydisplay_0_3" class="lia-message-body lia-component-body"&gt;
&lt;DIV class="lia-message-body-content"&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;I thought that if I have 100 observations and randomly select from them 100 times, then the selection aiming to cover 50% of the 100-row dataset would end up selecting 50 rows. Right? Therefore, mark50 would match the 50 rows selected in total. That is what I mean by matching markN to N, that is, to N% of the 100 rows selected at random. Does that make sense?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This would create those variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PCT1 would have a value of 1 in about 1% of the rows.&lt;/P&gt;
&lt;P&gt;PCT50 would have a value of 1 in about 50% of the rows, i.e. about 50% selected.&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:25:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565044#M158556</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-10T20:25:08Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565047#M158559</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp; given this last response:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV id="messagebodydisplay_0_3" class="lia-message-body lia-component-body"&gt;
&lt;DIV class="lia-message-body-content"&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;I thought that if I have 100 observations and randomly select from them 100 times, then the selection aiming to cover 50% of the 100-row dataset would end up selecting 50 rows. Right? Therefore, mark50 would match the 50 rows selected in total. That is what I mean by matching markN to N, that is, to N% of the 100 rows selected at random. Does that make sense?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This would create those variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PCT1 would have a value of 1 in about 1% of the rows.&lt;/P&gt;
&lt;P&gt;PCT50 would have a value of 1 in about 50% of the rows, i.e. about 50% selected.&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;data want;
set have;

call streaminit(300) ; *to allow for replication;
array pct(100) pct1-pct100;

do i=1 to 100;
pct(i) = rand('bernoulli', i/100);
end;

run;
proc summary data=want;
    var pct1-pct100;
    output out=stats sum=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If this code produced the result as I now understand the request, PROC SUMMARY should show a sum of 1 for pct1, 2 for pct2, and so on, and this code does not produce that result.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:28:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565047#M158559</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-10T20:28:26Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565049#M158560</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your last approach is what I wanted: mark1 flags 1 row, mark2 flags 2 rows, etc.&lt;/P&gt;
&lt;P&gt;My actual data has 20K rows, and I will use these marks to indicate the extent of missing, and therefore replaced, data in my manual simulation analysis. For example, mark50 would indicate that 50% of the values of the variable of interest were missing and therefore replaced by a proxy value (July 1 for the month/day).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
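&lt;P&gt;As a rough sketch of how such a mark might then be applied: the WANT dataset and mark50 are assumed from the approach above, while event_date, event_date_sim and sim50 are hypothetical names standing in for the real variable of interest and the outputs. Using July 1 of the same year as the proxy is also an assumption. Rows flagged by mark50 get the proxy date; all other rows keep their original value.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data sim50;
    set want;
    /* event_date is a hypothetical SAS date variable (the variable of interest) */
    if mark50 = 1 then event_date_sim = mdy(7, 1, year(event_date)); /* proxy: July 1 of the same year */
    else event_date_sim = event_date;                                /* unflagged rows are left unchanged */
    format event_date_sim date9.;
run;&lt;/CODE&gt;&lt;/PRE&gt;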
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="RANDOM_MARK.png" style="width: 221px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30174iFA605E1FBC6738B9/image-size/large?v=v2&amp;amp;px=999" role="button" title="RANDOM_MARK.png" alt="RANDOM_MARK.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 20:39:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565049#M158560</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-10T20:39:48Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565073#M158574</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp; given this last response:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV id="messagebodydisplay_0_3" class="lia-message-body lia-component-body"&gt;
&lt;DIV class="lia-message-body-content"&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;I thought that if I have 100 observations and randomly select from them 100 times, then the selection aiming to cover 50% of the 100-row dataset would end up selecting 50 rows. Right? Therefore, mark50 would match the 50 rows selected in total. That is what I mean by matching markN to N, that is, to N% of the 100 rows selected at random. Does that make sense?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This would create those variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PCT1 would have a value of 1 in about 1% of the rows.&lt;/P&gt;
&lt;P&gt;PCT50 would have a value of 1 in about 50% of the rows, i.e. about 50% selected.&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;data want;
set have;

call streaminit(300) ; *to allow for replication;
array pct(100) pct1-pct100;

do i=1 to 100;
pct(i) = rand('bernoulli', i/100);
end;

run;
proc summary data=want;
    var pct1-pct100;
    output out=stats sum=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If this code produced the result as I now understand the request, PROC SUMMARY should show a sum of 1 for pct1, 2 for pct2, and so on, and this code does not produce that result.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You would need to look at the mean, not the sum, because the sum depends on the number of observations. Looked at that way, it most definitely does produce a series of approximately 1-100%.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
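&lt;P&gt;A minimal sketch of that check, assuming the WANT dataset and the pct1-pct100 variables from the code quoted above; the output dataset names means_out and means_long and the column name fraction_flagged are only illustrative. Each mean is the fraction of rows flagged, so pct50 should come out near 0.50 whatever the number of rows.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc means data=want noprint;
    var pct1-pct100;
    output out=means_out mean=;   /* one row; each pctN now holds the fraction of rows flagged */
run;

/* turn the single row into one row per indicator for easier reading */
proc transpose data=means_out out=means_long(rename=(col1=fraction_flagged));
    var pct1-pct100;
run;

proc print data=means_long;
run;&lt;/CODE&gt;&lt;/PRE&gt;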
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="delete_probabilities.png" style="width: 492px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30177iEC26B9BB6F1BF4DD/image-size/large?v=v2&amp;amp;px=999" role="button" title="delete_probabilities.png" alt="delete_probabilities.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2019 22:22:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565073#M158574</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-10T22:22:28Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565074#M158575</link>
      <description>I see what you're saying. I'll try both your approach and PaigeMiller's on my actual data for comparison, and I'll report back here.</description>
      <pubDate>Mon, 10 Jun 2019 22:26:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565074#M158575</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-10T22:26:50Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565078#M158577</link>
      <description>It doesn't matter which one you use; I just wanted to make sure that I wasn't missing something. Both work and allow you to choose samples of a specific size, though. You should review David L. Cassell's "Don't Be Loopy" paper; it covers simulations in SAS using both the approach you're trying and PROC SURVEYSELECT. &lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 10 Jun 2019 22:44:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565078#M158577</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-10T22:44:34Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565089#M158581</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I ended up using Reeza's approach, since it marks a percentage of the rows. Paige's approach marked a count of 1 to 100 rows in total out of my 27,440 rows.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="REEZA.png" style="width: 309px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30178iB0FD194D68074DE4/image-size/large?v=v2&amp;amp;px=999" role="button" title="REEZA.png" alt="REEZA.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 00:32:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565089#M158581</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-11T00:32:45Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565173#M158642</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I ended up using Reeza's approach, since it marks a percentage of the rows. Paige's approach marked a count of 1 to 100 rows in total out of my 27,440 rows.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="REEZA.png" style="width: 309px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30178iB0FD194D68074DE4/image-size/large?v=v2&amp;amp;px=999" role="button" title="REEZA.png" alt="REEZA.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp; you never mentioned that you had 27,440 observations. First you showed 12 observations, later you showed 100; we could have reached the correct answer much more quickly had we known from the start that you wanted to do this on an arbitrarily sized data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In particular, note that pct32 marks 31 percent of the observations, not 32 percent. You seemed to indicate earlier that you want exactly 32 percent of the observations in pct32, so again, it's not clear to me what exact result you want. But if you can accept pct32 marking 31 percent, then my very first solution provides that answer as well.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 11:01:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565173#M158642</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-11T11:01:53Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565174#M158643</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I truly appreciate your pointing out that pct31 and pct32 are both associated with 31% of the data. I apologize for not being clear. In practice my sample sizes differ depending on the anatomical site; N=27,440 is the pancreas. Your rank and rand approach yielded exactly matching results. Is it possible to convert your code to percentages, so that pct31 covers exactly 31% of the data and pct32 covers exactly 32%, for example? Again, sorry for not being clear. Big lesson learned here.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 11:10:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565174#M158643</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-11T11:10:07Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565193#M158651</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Following up on Paige's point, I checked for all discordant cases: the Bernoulli approach doesn't give the exact percentage of the data in the following 5 cases. My sample sizes vary across the anatomical sites of the disease.&lt;/P&gt;
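&lt;P&gt;A minimal sketch of why the Bernoulli counts drift, assuming the seed (300) and the N=27,440 quoted in this thread; the data set name bern is made up for the illustration. Each row is an independent draw, so the number of rows marked in pct32 follows a Binomial(N, 0.32) distribution and will usually land near, but not exactly on, 32% of N.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* illustrative only: bern is a hypothetical data set name */
data bern;
    call streaminit(300);   /* fixed seed so the run can be repeated */
    do id = 1 to 27440;
        pct31 = rand('bernoulli', 0.31);   /* independent draw per row */
        pct32 = rand('bernoulli', 0.32);
        output;
    end;
run;
/* SUM is the number of marked rows; MEAN is the achieved fraction */
proc means data=bern n sum mean;
    var pct31 pct32;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Changing the seed changes the counts, which is the behavior an exact-percentage method has to avoid.&lt;/P&gt;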
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="DISCORDANT.png" style="width: 438px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30183i83CBE53DB3F758D7/image-size/large?v=v2&amp;amp;px=999" role="button" title="DISCORDANT.png" alt="DISCORDANT.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 11:51:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565193#M158651</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-11T11:51:09Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565203#M158657</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I truly appreciate your pointing out that pct31 and pct32 are both associated with 31% of the data. I apologize for not being clear. In practice my sample sizes differ depending on the anatomical site; N=27,440 is the pancreas. Your rank and rand approach yielded exactly matching results. Is it possible to convert your code to percentages, so that pct31 covers exactly 31% of the data and pct32 covers exactly 32%, for example? Again, sorry for not being clear. Big lesson learned here.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;A slight modification to my earlier solution will work for the case of N=27,440 (or any other value of N), and gives every percent variable exactly the number of 1s that you want, to within round-off error. (Note that for an arbitrary N, 32% of N may not be an integer; these methods can't overcome that.)&lt;/P&gt;
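&lt;P&gt;To make the round-off caveat concrete, here is a rough sketch, assuming the ranks are the distinct integers 1 to N (no ties). Under that assumption the condition ranks&amp;lt;=i*n/100 used in the solution in this post marks floor(i*N/100) rows. For N=27,440 and i=32 the target is 8,780.8, so 8,780 rows are marked, which is just under 32%. The data set and variable names (expected, target, marked, share) are made up for the illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* illustrative only: n is hard-coded to the N quoted in this thread */
data expected;
    n = 27440;
    do i = 1 to 99;
        target = i*n/100;        /* may be fractional */
        marked = floor(target);  /* rows with rank at or below target, assuming no tied ranks */
        share  = marked/n;       /* achieved fraction of rows */
        output;
    end;
    format share percent9.4;
run;
/* list only the percentages where the target is not a whole number of rows */
proc print data=expected(where=(target ne marked));
run;&lt;/CODE&gt;&lt;/PRE&gt;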
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* simulated input: one uniform random number per row */
data have;
    do id=1 to 27440;
        rand=rand('uniform');
        output;
    end;
run;
/* rank the random numbers; barring ties, the ranks are a random ordering of 1 to n */
proc rank data=have out=ranked;
    var rand;
    ranks ranks;
run;
/* this next PROC SUMMARY is used if you want SAS to determine n for any data set, even though */
/* for this example we know that n = 27440 */
proc summary data=ranked;
    var id;
    output out=n n=n;
run;
data want;
    if _n_=1 then set n; /* read n once; it is retained on every row */
    set ranked;
    array mark mark1-mark99;
    do i=1 to dim(mark);
        if ranks&amp;lt;=i*n/100 then mark(i)=1; /* here we use the computed value of n */
        else mark(i)=0;
    end;
run;
/* the mean of each 0/1 indicator is the fraction of rows marked */
proc summary data=want;
    var mark1-mark99;
    output out=stats mean=;
    format mark1-mark99 percent7.1;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 11 Jun 2019 12:12:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565203#M158657</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-11T12:12:04Z</dc:date>
    </item>
    <item>
      <title>Re: Select 50%, 25% of the rows randomly and mark them</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565213#M158659</link>
      <description>I tried this method on three different datasets. In every output, each percent variable marked exactly the intended share of rows. Thank you!</description>
      <pubDate>Tue, 11 Jun 2019 12:47:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-50-25-of-the-rows-randomly-and-mark-them/m-p/565213#M158659</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-11T12:47:27Z</dc:date>
    </item>
  </channel>
</rss>

