<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Generating an array of random bernoulli variables with a minimum and maximum 1's in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591220#M28911</link>
    <description>Make fake data and post that please.</description>
    <pubDate>Tue, 24 Sep 2019 16:34:33 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2019-09-24T16:34:33Z</dc:date>
    <item>
      <title>Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591212#M28907</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to generate missingness for a summative scale. This involves: (1) randomly selecting individuals to be missing a summative score and (2) deleting individual items within the scale for those identified as being missing. I am struggling with #2 as all individuals need at least 1 item to be deleted and a pre-specified number to have all items deleted (9.17% missing all items) and individuals can have 1 to 5 missing items (within a 5 item scale).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Probability of missing:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;item1=0.2782&lt;BR /&gt;item2=0.3497&lt;BR /&gt;item3=0.3035&lt;BR /&gt;item4=0.3207&lt;BR /&gt;item5=0.3289&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;All items=0.0917&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;*remaining probability of each item after accounting for all being missing*&lt;/P&gt;
&lt;P&gt;item1=0.1865&lt;BR /&gt;item2=0.258&lt;BR /&gt;item3=0.2118&lt;BR /&gt;item4=0.229&lt;BR /&gt;item5=0.2372&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Essentially, I want to delete all items for 9.17% of the identified sample for missingness - likely based on a Bernoulli distribution as follows...&lt;/P&gt;
&lt;P&gt;if js_Sel=. then sel_items=rand('BERNOULLI', 0.0917); else sel_items=0;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;...and then, conditional on the full scale being missing (i.e. js_Sel=.) and not having all items missing (i.e. sel_items=0), using the remaining probabilities to delete the remaining individual items. However, if I do this using separate random Bernoulli variables, I end up getting about 25% with no missing at all (when all identified observations need to have at least one item missing) and 10% extra with all items missing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a way to create an array of Bernoulli random variables, based on the remaining probabilities, where at least 1 column needs to be =1 and it is not possible for all 5 columns to =1?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance!&lt;/P&gt;
&lt;P&gt;Jillian&lt;/P&gt;</description>
      <pubDate>Tue, 24 Sep 2019 16:16:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591212#M28907</guid>
      <dc:creator>halladje</dc:creator>
      <dc:date>2019-09-24T16:16:30Z</dc:date>
    </item>
    <item>
      <title>Re: Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591215#M28909</link>
      <description>Yes, it's possible, but it would be much easier if you showed some sample data. &lt;BR /&gt;Because you have defined probabilities, use RAND() with the TABLE option first and then use the Bernoulli option to create the 1s.</description>
      <pubDate>Tue, 24 Sep 2019 16:21:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591215#M28909</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-09-24T16:21:54Z</dc:date>
    </item>
    <item>
      <title>Re: Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591216#M28910</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am unable to post sample data - my apologies for the inconvenience.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The table option would only allow one variable to be selected per draw, though, correct? Several observations have multiple items deleted, so the probabilities across items sum to &amp;gt;1. When using the table function, don't the probabilities need to sum to 1 since only one variable is selected?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your thoughts&lt;/P&gt;</description>
      <pubDate>Tue, 24 Sep 2019 16:27:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591216#M28910</guid>
      <dc:creator>halladje</dc:creator>
      <dc:date>2019-09-24T16:27:02Z</dc:date>
    </item>
    <item>
      <title>Re: Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591220#M28911</link>
      <description>Make fake data and post that please.</description>
      <pubDate>Tue, 24 Sep 2019 16:34:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591220#M28911</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-09-24T16:34:33Z</dc:date>
    </item>
    <item>
      <title>Re: Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591221#M28912</link>
      <description>&lt;P&gt;I am not quite sure how to do that. Do you have any general thoughts on how I could test the table random function on my data (or on other alternatives)?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks and again my apologies, Jillian&lt;/P&gt;</description>
      <pubDate>Tue, 24 Sep 2019 16:39:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591221#M28912</guid>
      <dc:creator>halladje</dc:creator>
      <dc:date>2019-09-24T16:39:11Z</dc:date>
    </item>
    <item>
      <title>Re: Generating an array of random bernoulli variables with a minimum and maximum 1's</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591315#M28920</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159403"&gt;@halladje&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I understand your requirements correctly, you want to modify &lt;EM&gt;one&lt;/EM&gt; existing dataset (by setting a number of variables to missing). So, your probabilities (0.2782, 0.3497, etc.) are actually&amp;nbsp;&lt;EM&gt;expected&lt;/EM&gt;&amp;nbsp;&lt;EM&gt;relative frequencies&lt;/EM&gt; in that dataset (after the modification).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The main issue is: Most of the probabilities you've specified are &lt;EM&gt;marginal&lt;/EM&gt; probabilities, but constraints such as "a pre-specified number to have all items deleted (9.17% missing all items)" or "it is not possible for all 5 columns to =1" imply that the Bernoulli random variables you're trying to simulate are statistically &lt;EM&gt;dependent&lt;/EM&gt;. This means you can't simply use &lt;FONT face="courier new,courier"&gt;RAND('bern',0.2782)&lt;/FONT&gt;,&amp;nbsp;&lt;FONT face="courier new,courier"&gt;RAND('bern',0.3497)&lt;/FONT&gt;, etc. (or&amp;nbsp;&lt;FONT face="courier new,courier"&gt;RAND('bern',0.1865)&lt;/FONT&gt;,&amp;nbsp;&lt;FONT face="courier new,courier"&gt;RAND('bern',0.258)&lt;/FONT&gt;, etc. for that matter).&lt;/P&gt;
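&lt;P&gt;To see why independent draws cannot satisfy the constraints, here is a minimal sketch in Python (not SAS, and not part of the original reply) that enumerates all 2**5 patterns under the assumption of independence, using the "remaining" probabilities from the question. The names p, pattern_probability and by_count are illustrative only.&lt;/P&gt;

```python
from itertools import product

# Remaining marginal probabilities quoted in the thread (items 1..5).
p = [0.1865, 0.258, 0.2118, 0.229, 0.2372]

def pattern_probability(pattern, probs):
    """Probability of one 0/1 pattern when the items are drawn independently."""
    result = 1.0
    for bit, prob in zip(pattern, probs):
        result *= prob if bit else 1.0 - prob
    return result

# Enumerate all 2**5 patterns and group them by the number of missing items.
by_count = [0.0] * (len(p) + 1)
for pattern in product((0, 1), repeat=len(p)):
    by_count[sum(pattern)] += pattern_probability(pattern, p)

p_none = by_count[0]   # no item missing: forbidden by the requirements
p_all = by_count[-1]   # all items missing: handled separately, should not recur
print(f"P(no item missing)   = {p_none:.4f}")
print(f"P(all items missing) = {p_all:.6f}")
```

&lt;P&gt;Under independence both forbidden patterns have positive probability, and "no item missing" in particular takes up a substantial share, so the constraints have to be built into the joint distribution instead.&lt;/P&gt;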
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Maybe there is an additional issue: The relative frequencies would most likely differ from the specified probabilities due to random fluctuations. For example, on average, more than one out of ten selections from 1000 individuals using independent&amp;nbsp;&lt;FONT face="courier new,courier"&gt;RAND('bern',0.3497)&lt;/FONT&gt;&amp;nbsp;values will contain &amp;gt;368 individuals. Given the precision of the specified probabilities, you might not be happy with the results.&lt;/SPAN&gt;&lt;/P&gt;
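&lt;P&gt;The "more than one out of ten" figure can be checked with an exact binomial tail. The following is a small Python sketch (Python rather than SAS, added for illustration); binomial_upper_tail is a hypothetical helper name, and the terms are summed in log space to avoid overflow.&lt;/P&gt;

```python
from math import exp, lgamma, log

def binomial_upper_tail(n, p, k):
    """Return P(X > k) for X ~ Binomial(n, p), summing exact terms in log space."""
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for j in range(k + 1, n + 1):
        # log of C(n, j) * p**j * (1 - p)**(n - j)
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += exp(log_term)
    return total

# 1000 individuals, each selected independently with probability 0.3497.
tail = binomial_upper_tail(1000, 0.3497, 368)
print(f"P(more than 368 of 1000 selected) = {tail:.4f}")
```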
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Here's an outline of how you could avoid both of these issues:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;SPAN&gt;There are 2**5 - 2 = 30 different combinations of five zeros and ones after excluding "00000" (="no item missing") and "11111" (="all items missing"). Denote the relative frequencies to be determined for the 30 combinations "00001" (="only item 1 missing"), ..., "11110" (="only item 1 nonmissing") with x&lt;SUB&gt;1&lt;/SUB&gt;, ..., x&lt;SUB&gt;30&lt;/SUB&gt;.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Write down the constraints for the x&lt;SUB&gt;i&lt;/SUB&gt; (besides x&lt;SUB&gt;i&lt;/SUB&gt;&amp;gt;=0). These are linear equations. Examples: The constraint that 9.17% of the observations are to have all items missing translates to x&lt;SUB&gt;1&lt;/SUB&gt;+x&lt;SUB&gt;2&lt;/SUB&gt;+...+x&lt;SUB&gt;30&lt;/SUB&gt;=0.9083 (=1-0.0917).&amp;nbsp;The constraint that 23.72% of the observations are to have item 5 missing, but not all items missing, translates to x&lt;SUB&gt;16&lt;/SUB&gt;+x&lt;SUB&gt;17&lt;/SUB&gt;+...+x&lt;SUB&gt;30&lt;/SUB&gt;=0.2372 (see first digit of 16, 17, ..., 30 in the binary system).&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Solve the resulting system of linear equations (SAS/IML?). There will be many free parameters in the solution. Think of reasonable values for these parameters (or specify more constraints in step 2).&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Compute the corresponding absolute frequencies from the solution obtained in step 3: If your dataset contains N individuals, determine n&lt;SUB&gt;1&lt;/SUB&gt;, ..., n&lt;SUB&gt;30&lt;/SUB&gt; by n&lt;SUB&gt;i&lt;/SUB&gt;=floor(x&lt;SUB&gt;i&lt;/SUB&gt;*N) or n&lt;SUB&gt;i&lt;/SUB&gt;=ceil(x&lt;SUB&gt;i&lt;/SUB&gt;*N) and similarly n&lt;SUB&gt;31&lt;/SUB&gt;=floor(0.0917*N) or n&lt;SUB&gt;31&lt;/SUB&gt;=ceil(0.0917*N) so that n&lt;SUB&gt;1&lt;/SUB&gt;+...+n&lt;SUB&gt;31&lt;/SUB&gt;=N.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Use PROC SURVEYSELECT with the &lt;A href="https://documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_surveyselect_syntax01.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en#statug.surveyselect.selectgroups" target="_blank" rel="noopener"&gt;GROUPS=&lt;/A&gt;(n&lt;SUB&gt;1&lt;/SUB&gt; ... n&lt;SUB&gt;31&lt;/SUB&gt;) option to assign the individuals randomly to the 31 groups (numbered 1, ..., 31).&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;In a DATA step, use the 1st, ..., 5th digit of the respective individual's group number in BINARY5. format (i.e. "00001", ..., "11111") to determine which of the items 5, 4, 3, 2, 1 need to be set to missing and perform this operation in a DO loop (1 to 5).&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
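&lt;P&gt;The steps above can be sketched as follows. This is a Python illustration, not SAS, and it rests on loudly stated assumptions: the frequencies in x are one hand-picked feasible solution of steps 2 and 3 (many others exist), N = 1000 is a made-up sample size, a plain shuffle stands in for PROC SURVEYSELECT, and the helper names (missing_items, allocate_counts, assign_groups) are hypothetical.&lt;/P&gt;

```python
import random

N_ITEMS = 5
ALL_MISSING_GROUP = 31          # "11111"
# Remaining marginal probabilities from the thread, items 1..5.
MARGINALS = [0.1865, 0.258, 0.2118, 0.229, 0.2372]

# ASSUMPTION: one arbitrary feasible solution of steps 2-3, chosen by hand.
# Keys are group numbers; groups not listed get relative frequency 0.
x = {
    0b00001: 0.1865,  # only item 1 missing
    0b00010: 0.1509,  # only item 2 missing
    0b00100: 0.1047,  # only item 3 missing
    0b01000: 0.1219,  # only item 4 missing
    0b10000: 0.1301,  # only item 5 missing
    0b10010: 0.1071,  # items 2 and 5 missing
    0b01100: 0.1071,  # items 3 and 4 missing
    ALL_MISSING_GROUP: 0.0917,
}

def missing_items(group):
    """Items (1..5) to delete for a group number, read from its 5-digit binary form."""
    digits = format(group, "05b")   # e.g. 18 becomes "10010"
    # The first digit refers to item 5, the last digit to item 1.
    return [N_ITEMS - pos for pos, d in enumerate(digits) if d == "1"]

# Step 2 check: the hand-picked solution reproduces each remaining marginal.
for k in range(1, N_ITEMS + 1):
    marginal = sum(f for g, f in x.items()
                   if g != ALL_MISSING_GROUP and k in missing_items(g))
    assert 1e-9 > abs(marginal - MARGINALS[k - 1])

def allocate_counts(freqs, n):
    """Step 4: integer counts summing to n (floor, then ceil for largest remainders)."""
    exact = {g: f * n for g, f in freqs.items()}
    counts = {g: int(v) for g, v in exact.items()}
    leftover = n - sum(counts.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    return counts

def assign_groups(counts, rng):
    """Step 5: random assignment of individuals to groups with fixed group sizes."""
    labels = [g for g, c in counts.items() for _ in range(c)]
    rng.shuffle(labels)
    return labels

N = 1000                              # hypothetical number of selected individuals
counts = allocate_counts(x, N)
groups = assign_groups(counts, random.Random(2019))

# Step 6: toy item scores (all 1s); None plays the role of a SAS missing value.
data = [[1] * N_ITEMS for _ in range(N)]
for row, group in zip(data, groups):
    for item in missing_items(group):
        row[item - 1] = None

n_missing = [sum(v is None for v in row) for row in data]
print("rows with no item missing  :", n_missing.count(0))
print("rows with all items missing:", n_missing.count(N_ITEMS))
print("share missing per item     :",
      [sum(row[k] is None for row in data) / N for k in range(N_ITEMS)])
```

&lt;P&gt;Because the group sizes are fixed before the random assignment, the resulting relative frequencies match the targets up to rounding instead of fluctuating from run to run.&lt;/P&gt;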
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Sep 2019 21:52:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Generating-an-array-of-random-bernoulli-variables-with-a-minimum/m-p/591315#M28920</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2019-09-24T21:52:06Z</dc:date>
    </item>
  </channel>
</rss>

