<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Comparison of two empirical distributions via KS in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689291#M33223</link>
    <description>Then I'd argue your hypothesis is not well defined. Is it diff&amp;gt;0, diff&amp;gt;x%, or diff&amp;gt;45 units?&lt;BR /&gt;Right now you're using the 'default' hypothesis of 0, but it doesn't have to be 0...</description>
    <pubDate>Tue, 06 Oct 2020 18:05:16 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2020-10-06T18:05:16Z</dc:date>
    <item>
      <title>Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/688966#M33209</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;I’m testing whether two empirical distributions are identical. I have a group of people with two observations each for the variable ‘EKC’, ‘before’ and ‘after’ an intervention, and I’m using the K-S test to compare the two distributions. The observations come from a national survey, so each individual has a survey weight (the variable weight) used to produce national estimates. See the code below:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;ods graphics on;
proc npar1way data = dat edf;
   freq weight;
   class time;
   var ekc;
   ods output KS2Stats=ks;
run;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Notice in the table below that the two distributions are very similar (almost identical).&lt;/P&gt;
&lt;TABLE width="0"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;Percentiles&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;Before&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;After&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;Difference (After - Before)&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;% change (absolute value)&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;100% Max&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;3289.1&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;3279.5&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-9.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;99%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2443.7&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2436.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-7.1&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;95%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2180.5&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2173.5&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-6.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;90%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2047.3&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;2040.8&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-6.4&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;75% Q3&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1838.8&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1832.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-6.2&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;50% Median&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1623.3&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1617.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-5.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.3&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;25% Q1&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1427.8&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1422.7&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-5.2&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.4&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;10%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1271.8&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1266.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-4.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.4&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;5%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1187.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1182.7&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-4.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.4&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;1%&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1029.4&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;1024.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-4.6&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.4&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="93"&gt;
&lt;P&gt;&lt;STRONG&gt;0% Min&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;642.3&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;638.4&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="94"&gt;
&lt;P&gt;-3.9&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="73"&gt;
&lt;P&gt;0.6&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Nevertheless, the K-S test comparing the two samples suggests that the distributions are different (Pr &amp;gt; KSa is &amp;lt;.0001).&lt;/P&gt;
&lt;P&gt;I’m not sure whether the fact that the two empirical distributions are not independent (they come from the same group of individuals before and after an intervention) affects the test. If it does, can you please suggest a valid alternative test?&lt;/P&gt;
&lt;P&gt;Thanks a lot,&lt;/P&gt;
&lt;P&gt;A.G.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Oct 2020 18:22:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/688966#M33209</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-05T18:22:38Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/688982#M33210</link>
      <description>P-values measure statistical significance, not practical significance. The difference is measurable and statistically significant, but perhaps a 0.3% decrease is not what you were looking for? In this case the distribution has shifted, so it is different. &lt;BR /&gt;And remember that with a large N, small differences are easier to pick up and more likely to be statistically significant even if they're not practically significant.</description>
      <pubDate>Mon, 05 Oct 2020 19:27:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/688982#M33210</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-10-05T19:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689016#M33211</link>
      <description>&lt;P&gt;The apparent extreme sensitivity of the KS test here is due to the use of the FREQ statement. FREQ specifies a frequency, not a weight. When you say "x=10, freq=100", the procedure treats that as 100 independent measurements at 10, not as a single measurement with a sampling weight of 100. SAS does not provide a weighted KS test (if such a thing exists).&lt;/P&gt;
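&lt;P&gt;A minimal sketch of the FREQ behaviour, assuming a hypothetical toy data set (the data set name toy and all values are made up for illustration). Each time point has 3 rows whose weights sum to 3700, so with FREQ the procedure works with N = 7400 observations, while without FREQ it works with the 6 actual rows:&lt;/P&gt;
&lt;PRE&gt;/* Hypothetical toy data: 3 people per time point, each with a survey weight */
data toy;
   input time $ ekc weight;
   datalines;
before 1620 1000
before 1430 1500
before 1840 1200
after  1615 1000
after  1425 1500
after  1834 1200
;
run;

/* With FREQ, each row is counted WEIGHT times (N = 7400) */
proc npar1way data=toy edf;
   freq weight;
   class time;
   var ekc;
run;

/* Without FREQ, only the 6 actual rows are counted (N = 6) */
proc npar1way data=toy edf;
   class time;
   var ekc;
run;&lt;/PRE&gt;
&lt;P&gt;Comparing the N reported in the two outputs shows how a survey weight passed through FREQ inflates the apparent sample size, and with it the sensitivity of the test.&lt;/P&gt;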
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
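&lt;P&gt;Because these are weighted pre/post measurements on the same people, one design-based option is to compute each person's after-minus-before difference and test whether its weighted mean is zero. This is only a sketch and has not been run on your data. The data set wide (one row per person) and the columns ekc_before and ekc_after are hypothetical names, and STRATA and CLUSTER statements should be added if the survey design has them:&lt;/P&gt;
&lt;PRE&gt;/* Assumes a hypothetical data set WIDE with one row per person
   and columns ekc_before, ekc_after and weight */
data paired;
   set wide;
   diff = ekc_after - ekc_before;
run;

/* T requests the t value and p-value for H0: population mean of diff = 0 */
proc surveymeans data=paired mean stderr clm t;
   weight weight;
   var diff;
run;&lt;/PRE&gt;
&lt;P&gt;This tests a shift in the mean rather than equality of the whole distribution, so it answers a narrower question than the KS test.&lt;/P&gt;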
&lt;P&gt;Properly weighted statistics are provided by the SURVEYxxxx procs.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Oct 2020 21:55:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689016#M33211</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-10-05T21:55:05Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689148#M33219</link>
      <description>You're right, Reeza, statistical and practical significance are not the same. Having said that, in this case I'm shocked that the p-value for the KS test is &amp;lt;0.0001 even though the two distributions overlap almost perfectly when plotted together. I wouldn't expect such a small p-value. I should probably stick with my approach of comparing the distributions using % changes; it's far more meaningful to me in this case. &lt;BR /&gt;Thank you!</description>
      <pubDate>Tue, 06 Oct 2020 11:02:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689148#M33219</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-06T11:02:06Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689152#M33220</link>
      <description>When a statistical procedure is not available for weighted observations (i.e. there is no SURVEYxxxx proc for it), an alternative way to deal with that is to replicate observations in the dataset based on the survey weights. &lt;BR /&gt;I suspect there might be two issues here. As Reeza pointed out, the large sample size might be picking up statistical significance where there is no practical significance. Additionally, observations from the two groups are not independent. I'm not sure how sensitive KS is to this.&lt;BR /&gt;Thank you!</description>
      <pubDate>Tue, 06 Oct 2020 11:10:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689152#M33220</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-06T11:10:28Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689253#M33221</link>
      <description>You cannot see it well visually, but the curve has shifted. If you graph the densities you may see it more easily. &lt;BR /&gt;If these are pre-post measures, though, you usually analyze the difference in the scores and see if that's centered on 0.</description>
      <pubDate>Tue, 06 Oct 2020 15:45:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689253#M33221</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-10-06T15:45:08Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689274#M33222</link>
      <description>This is clearly one of those cases where there might be statistical significance but not practical significance. I ran an alternative analysis to compare the two means (paired comparison), and the difference turned out to be 'statistically' significant. &lt;BR /&gt;</description>
      <pubDate>Tue, 06 Oct 2020 17:16:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689274#M33222</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-06T17:16:53Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689291#M33223</link>
      <description>Then I'd argue your hypothesis is not well defined. Is it diff&amp;gt;0, diff&amp;gt;x%, or diff&amp;gt;45 units?&lt;BR /&gt;Right now you're using the 'default' hypothesis of 0, but it doesn't have to be 0...</description>
      <pubDate>Tue, 06 Oct 2020 18:05:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689291#M33223</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-10-06T18:05:16Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689306#M33224</link>
      <description>I'm not sure why you think my hypothesis is not well defined. I'm interested in 'zero' differences. Maybe I was not clear enough in my previous message. The K-S test and the paired test for the difference in means are consistent, and both yield 'statistical significance'. Can you please clarify why you think so?&lt;BR /&gt;Thanks.</description>
      <pubDate>Tue, 06 Oct 2020 18:49:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689306#M33224</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-06T18:49:24Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689317#M33225</link>
      <description>Your test is for a difference of 0, but you seem to want a minimum difference of X% or Y raw units, which is a different hypothesis. You can change your hypothesis to account for practical significance.&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Oct 2020 19:16:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689317#M33225</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-10-06T19:16:34Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison of two empirical distributions via KS</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689504#M33242</link>
      <description>Now I get what you mean, Reeza. That's exactly what I should do. Any advice on how to specify a difference other than zero for the K-S test when comparing two distributions in the NPAR1WAY procedure?</description>
      <pubDate>Wed, 07 Oct 2020 11:33:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-of-two-empirical-distributions-via-KS/m-p/689504#M33242</guid>
      <dc:creator>alexgonzalez</dc:creator>
      <dc:date>2020-10-07T11:33:10Z</dc:date>
    </item>
  </channel>
</rss>

