<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc means standard deviation too high? SAS 9.4 in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470511#M70885</link>
    <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc means data=sinwom  vardef=wdf ;
   var PregMonth PregBirth MarrMonth ShotGunMar;
   weight post_wt;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;-------------------&lt;/P&gt;&lt;P&gt;Edit: I'm sorry, my code is wrong; I missed your comment below.&lt;/P&gt;&lt;P&gt;&amp;gt;&lt;SPAN&gt;The variables in the statement below are all binary,&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 15 Jun 2018 08:01:40 GMT</pubDate>
    <dc:creator>amatsu</dc:creator>
    <dc:date>2018-06-15T08:01:40Z</dc:date>
    <item>
      <title>proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470508#M70883</link>
      <description>&lt;P&gt;When I run PROC MEANS, the standard deviations are very high.&lt;/P&gt;&lt;P&gt;The variables in the statement below are all binary, so the std should be sqrt(p*(1-p)), as shown in the last column of the table below. Instead they are orders of magnitude bigger. I get the same problem with continuous variables; Stata delivers much smaller std errors.&lt;/P&gt;&lt;P&gt;No one else seems to have this problem, and I've been having it for years, so it has to be something I have misunderstood about PROC MEANS. It might have something to do with the WEIGHT statement. The weights&amp;nbsp;&lt;SPAN&gt;post_wt&amp;nbsp;&lt;/SPAN&gt;range in size from 350 to 33500.&lt;/P&gt;&lt;P&gt;Please let me know how to get the correct std dev. I'm getting tired of porting my SAS data sets to Stata.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;J&lt;/P&gt;&lt;TABLE border="0" cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;Variable&lt;/TD&gt;&lt;TD&gt;N&lt;/TD&gt;&lt;TD&gt;Mean&lt;/TD&gt;&lt;TD&gt;Std Dev&lt;/TD&gt;&lt;TD&gt;Minimum&lt;/TD&gt;&lt;TD&gt;Maximum&lt;/TD&gt;&lt;TD&gt;sqrt(p(1-p))&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;PregMonth&lt;/TD&gt;&lt;TD&gt;296527&lt;/TD&gt;&lt;TD&gt;0.006651&lt;/TD&gt;&lt;TD&gt;5.973354&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0.081281&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;PregBirth&lt;/TD&gt;&lt;TD&gt;296527&lt;/TD&gt;&lt;TD&gt;0.003967&lt;/TD&gt;&lt;TD&gt;4.619279&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0.062856&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;MarrMonth&lt;/TD&gt;&lt;TD&gt;296527&lt;/TD&gt;&lt;TD&gt;0.006756&lt;/TD&gt;&lt;TD&gt;6.020052&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0.081917&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;ShotGunMar&lt;/TD&gt;&lt;TD&gt;296527&lt;/TD&gt;&lt;TD&gt;0.00106&lt;/TD&gt;&lt;TD&gt;2.391512&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0.032542&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;proc means data=sinwom; *noprint;;&lt;BR /&gt;var PregMonth PregBirth MarrMonth ShotGunMar;&lt;BR /&gt;*output out=temp mean = PregMonth PregBirth MarrMonth ShotGunMar;&lt;BR /&gt;weight post_wt;&lt;BR /&gt;*proc print;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;NOTE: There were 296527 observations read from the data set WORK.SINWOM.&lt;BR /&gt;NOTE: PROCEDURE MEANS used (Total process time):&lt;BR /&gt;real time 0.19 seconds&lt;BR /&gt;cpu time 0.29 seconds&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 03:25:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470508#M70883</guid>
      <dc:creator>john_knowles</dc:creator>
      <dc:date>2018-06-15T03:25:16Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470511#M70885</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc means data=sinwom  vardef=wdf ;
   var PregMonth PregBirth MarrMonth ShotGunMar;
   weight post_wt;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;-------------------&lt;/P&gt;&lt;P&gt;Edit: I'm sorry, my code is wrong; I missed your comment below.&lt;/P&gt;&lt;P&gt;&amp;gt;&lt;SPAN&gt;The variables in the statement below are all binary,&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 08:01:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470511#M70885</guid>
      <dc:creator>amatsu</dc:creator>
      <dc:date>2018-06-15T08:01:40Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470512#M70886</link>
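      <description>&lt;P&gt;Formula is sqrt(n*p*(1-p)), isn't it? (That form applies to a binomial count over n trials; for a single 0/1 variable the standard deviation is sqrt(p*(1-p)).)&lt;/P&gt;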
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have you tried PROC SURVEYMEANS? If you have weighted data, from surveys in particular, it may be more appropriate.&amp;nbsp;&lt;/P&gt;
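&lt;P&gt;A minimal sketch of what that could look like, reusing the data set and variable names from the original post. It assumes a simple design with no strata or clusters, because the thread does not describe the sample design. Note that STDERR requests the standard error of the mean, which is not the same quantity as the Std Dev column from PROC MEANS.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: STRATA and CLUSTER statements are omitted because */
/* the survey design is not described in the thread.             */
proc surveymeans data=sinwom mean stderr;
   var PregMonth PregBirth MarrMonth ShotGunMar;
   weight post_wt;   /* sampling weights */
run;&lt;/CODE&gt;&lt;/PRE&gt;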
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There's also a caution in the documentation for the PROC MEANS WEIGHT statement:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="xis-cautionGenText"&gt;&lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;CAUTION:&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV class="xis-cautionLeadin"&gt;Single extreme weight values can cause inaccurate results.&lt;/DIV&gt;
&lt;DIV id="p0ygab99611evwn1gkuczljlxaa0" class="xis-paraSimple"&gt;When one (and only one) weight value is many orders of magnitude larger than the other weight values (for example, 49 weight values of 1 and one weight value of 1×10&lt;SUP class="xis-superscript"&gt;14&lt;/SUP&gt;), certain statistics might not be within acceptable accuracy limits. The affected statistics are based on the second moment (such as standard deviation, corrected sum of squares, variance, and standard error of the mean). Under certain circumstances, no warning is written to the SAS log.&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&lt;A href="http://documentation.sas.com/?docsetId=proc&amp;amp;docsetVersion=9.4&amp;amp;docsetTarget=p0ctnx21fdgs7qn1jgolu1ihl7kf.htm&amp;amp;locale=en" target="_blank"&gt;http://documentation.sas.com/?docsetId=proc&amp;amp;docsetVersion=9.4&amp;amp;docsetTarget=p0ctnx21fdgs7qn1jgolu1ihl7kf.htm&amp;amp;locale=en&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;The detailed section on weights has more information that may help you narrow down the issue.&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;Can you replicate these results on someone else's machine, or provide a sample that reproduces the issue? You don't need to include any individual-level information, just the 0/1 values and their weights, so we can replicate the issue and debug if necessary.&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;Edit: Following &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/211799"&gt;@amatsu&lt;/a&gt;'s answer, the documentation page on weights has examples. In those examples, though, the weights increase the StdDev.&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&lt;A href="http://documentation.sas.com/?docsetId=proc&amp;amp;docsetTarget=n1xkqt7u5ylr2kn11174pq11od76.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en#n0osbs4cchlvc6n136b1dkaw19gr" target="_blank"&gt;http://documentation.sas.com/?docsetId=proc&amp;amp;docsetTarget=n1xkqt7u5ylr2kn11174pq11od76.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en#n0osbs4cchlvc6n136b1dkaw19gr&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV class="xis-paraSimple"&gt;&amp;nbsp;&lt;/DIV&gt;
</description>
      <pubDate>Fri, 15 Jun 2018 03:45:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470512#M70886</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-15T03:45:08Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470543#M70889</link>
      <description>&lt;P&gt;&lt;A href="http://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.2&amp;amp;docsetId=proc&amp;amp;docsetTarget=p0v0y1on1hbxukn0zqgsp5ky8hc0.htm&amp;amp;locale=en" target="_self"&gt;Be aware that &lt;/A&gt;&lt;A href="https://blogs.sas.com/content/iml/2013/09/13/frequencies-vs-weights-in-regression.html" target="_self"&gt;weights are not frequencies.&lt;/A&gt;&amp;nbsp;If these numbers represent the number of cases, you should use the FREQ statement instead of the WEIGHT statement.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you really have weights, then&amp;nbsp;&lt;A href="http://go.documentation.sas.com/?cdcId=pgmsascdc&amp;amp;cdcVersion=9.4_3.2&amp;amp;docsetId=proc&amp;amp;docsetTarget=p0v0y1on1hbxukn0zqgsp5ky8hc0.htm&amp;amp;locale=en" target="_self"&gt;the SAS documentation provides the formulas for weighted statistics.&lt;/A&gt;&amp;nbsp;Compare those formulas against&amp;nbsp;the formulas for the other software. My guess is that it is a divisor issue. SAS provides four possible divisors for a weighted statistic such as the&amp;nbsp;StdDev. By default (VARDEF=DF), the divisor is (n-1), which is the usual unweighted divisor. Try using VARDEF=WEIGHT on the PROC MEANS statement to use the sum of weights as the divisor.&amp;nbsp;&lt;/P&gt;
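&lt;P&gt;To make the divisor point concrete, here is a minimal sketch on made-up data (the data set toy and the variables x and w are hypothetical, not taken from the thread). As I read the documented formulas, the weighted variance is sum(w*(x - xbar_w)**2) / d, where d is n-1 under the default VARDEF=DF and sum(w) under VARDEF=WEIGHT. In this toy data the weighted mean is 1500/4000 = 0.375, so sqrt(p*(1-p)) is about 0.484.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data: x is a 0/1 indicator, w is a weight */
data toy;
   input x w;
   datalines;
0 400
0 900
1 500
0 1200
1 1000
;
run;

/* Default VARDEF=DF: weighted sum of squares divided by n-1 */
proc means data=toy n mean std;
   var x;
   weight w;
run;

/* VARDEF=WEIGHT: divisor is the sum of the weights */
proc means data=toy n mean std vardef=weight;
   var x;
   weight w;
run;

/* FREQ: each w is treated as a count of identical observations */
proc means data=toy n mean std;
   var x;
   freq w;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If those formulas hold, the first step should report a Std Dev near 15.3, which is sqrt(937.5/4), while the second and third should both land near 0.484. Keep in mind that FREQ uses only the integer part of each value, so it stands in for WEIGHT only when the numbers really are counts.&lt;/P&gt;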
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 10:43:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470543#M70889</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-06-15T10:43:56Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470611#M70894</link>
      <description>&lt;P&gt;This solved the problem. The std dev is now exactly as predicted in my post. Also, VARDEF=WEIGHT gave the same result.&amp;nbsp;Many thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 14:40:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470611#M70894</guid>
      <dc:creator>john_knowles</dc:creator>
      <dc:date>2018-06-15T14:40:45Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470613#M70895</link>
      <description>&lt;P&gt;You are right.&amp;nbsp;Using FREQ instead of WEIGHT gave the correct results. I wonder if back in SAS&amp;nbsp;5&amp;nbsp;when I first learned SAS, this distinction was not critical....&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 14:44:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470613#M70895</guid>
      <dc:creator>john_knowles</dc:creator>
      <dc:date>2018-06-15T14:44:09Z</dc:date>
    </item>
    <item>
      <title>Re: proc means standard deviation too high? SAS 9.4</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470614#M70896</link>
      <description>&lt;P&gt;Thanks for the suggestion. It looks like the problem was not the data set but my&amp;nbsp;incomplete understanding of the WEIGHT statement.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jun 2018 14:46:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/proc-means-standard-deviation-too-high-SAS-9-4/m-p/470614#M70896</guid>
      <dc:creator>john_knowles</dc:creator>
      <dc:date>2018-06-15T14:46:11Z</dc:date>
    </item>
  </channel>
</rss>

