<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Comparing two means, independent samples, non-normal distribution in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467945#M24327</link>
    <description>&lt;P&gt;I don't get it &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;. Maybe our understanding of the nature of the data differs. How would the following&amp;nbsp;obs be paired?:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;physician caseId method timing
John           1 manual    1.0
John           2 manual    1.1
John           3 manual    1.2
John           4 machine   0.9
John           5 machine   1.3
John           6 machine   1.0&lt;/PRE&gt;</description>
    <pubDate>Wed, 06 Jun 2018 03:17:59 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2018-06-06T03:17:59Z</dc:date>
    <item>
      <title>Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467796#M24320</link>
      <description>&lt;P&gt;Hi, all&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm helping a colleague to evaluate some data. It pertains to two methods of capturing and correcting medical notes concerning patients:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. The physician dictates the notes to an automated voice-to-text program, receives the text version of the notes, and corrects them manually.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;2. The physician dictates the notes to a recording device, and the recording is then transcribed to a text file by a human at a keyboard. Again, the text file is then corrected manually by the physician.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Out of these processes come the word count of the resulting text file, and the time in seconds required by the physician to correct the errors in the file. The time is divided by the word count to result in a "seconds per word" statistic. We wish to compare the mean statistics for the two methods.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Although it is the same physician dictating both sets of notes, I'm treating them as independent samples. I'd appreciate feedback on whether this decision is correct.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using PROC UNIVARIATE indicates that the results from the manual process aren't distributed normally. Based on what I've read, the independent samples and lack of normality indicate that a Wilcoxon rank-sum test (PROC NPAR1WAY) is the correct one to indicate whether the mean correction times are significantly different.&lt;/P&gt;
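&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To make the rank-sum idea concrete, here is a minimal Python sketch of what such a test computes. It is not PROC NPAR1WAY itself: it uses the large-sample normal approximation with a tie correction and no continuity correction, so its numbers need not match SAS output exactly. The function names and the sample values are hypothetical, chosen only for illustration, and samples this small would normally call for an exact test.&lt;/P&gt;
&lt;PRE&gt;
```python
import math
from collections import Counter
from statistics import NormalDist


def midranks(values):
    """Rank the values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i != len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 != len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (W, z, p) where W is the sum of the ranks of sample x.
    """
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Tie correction: each group of t tied values reduces the variance.
    ties = sum(t**3 - t for t in Counter(pooled).values())
    var_w = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical seconds-per-word values, for illustration only.
manual = [1.0, 1.1, 1.2]
machine = [0.9, 1.3, 1.0]
w, z, p = rank_sum_test(manual, machine)
print(f"W = {w}, z = {z:.3f}, two-sided p = {p:.3f}")
```
&lt;/PRE&gt;
&lt;P&gt;Note that the test works on ranks, so strictly speaking it compares the two distributions (roughly, whether one method tends to give larger values) rather than the means themselves.&lt;/P&gt;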
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If I'm making any errors in this process, I would appreciate advice about how to approach it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;BR /&gt;&amp;nbsp; &amp;nbsp;Tom&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 18:40:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467796#M24320</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2018-06-05T18:40:39Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467797#M24321</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Although it is the same physician dictating both sets of notes, I'm treating them as independent samples. I'd appreciate feedback on whether this decision is correct.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;These seem paired to me.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Using PROC UNIVARIATE indicates that the results from the manual process aren't distributed normally. Based on what I've read, the independent samples and lack of normality indicate that a Wilcoxon rank-sum test (PROC NPAR1WAY) is the correct one to indicate whether the mean correction times are significantly different.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;This is one way to determine if the means are different. However, if the data is really paired, then you want to look at the distribution of the differences between method 1 and method 2 to see if it is normally distributed. Even if the differences are not normally distributed, the Central Limit Theorem sometimes comes to the rescue if you have enough observations, and you can treat their mean as approximately normal.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 18:47:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467797#M24321</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-06-05T18:47:24Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467802#M24323</link>
      <description>&lt;P&gt;IIRC the t-test is pretty robust to deviations from normality, as long as the two groups have the same distribution. There was a Twitter debate on this a few weeks ago with some really good references (yes, I'm that much of a data geek).&lt;/P&gt;
&lt;P&gt;I would probably also look at grouping the analysis by physician and/or transcriber to ensure that there are no big deviations depending on who's doing the coding versus the machine. That's more EDA though.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We found some interesting physician and&amp;nbsp;recorder counts that you can ask me about if we ever meet....one of the more memorable analysis projects of my career.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Fun project though &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 18:59:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467802#M24323</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-05T18:59:58Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467821#M24325</link>
      <description>&lt;P&gt;The&amp;nbsp;timings should be considered paired only if&amp;nbsp;the physician is having the same notes transcribed by both methods. Otherwise there is no possible pairing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The non-parametric comparison will protect you from the influence of extreme observations at almost no cost in terms of power. You should stratify the test&amp;nbsp;by physician.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 19:35:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467821#M24325</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-05T19:35:41Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467929#M24326</link>
      <description>&lt;P&gt;The stimulus is the same. The treatment differs. This is paired.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 01:40:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467929#M24326</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-06-06T01:40:33Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467945#M24327</link>
      <description>&lt;P&gt;I don't get it &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;. Maybe our understanding of the nature of the data differs. How would the following&amp;nbsp;obs be paired?:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;physician caseId method timing
John           1 manual    1.0
John           2 manual    1.1
John           3 manual    1.2
John           4 machine   0.9
John           5 machine   1.3
John           6 machine   1.0&lt;/PRE&gt;</description>
      <pubDate>Wed, 06 Jun 2018 03:17:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/467945#M24327</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-06T03:17:59Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/468019#M24340</link>
      <description>&lt;P&gt;Then I would say that &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15142"&gt;@TomKari&lt;/a&gt; has not given us enough information to determine if it is paired or not.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This layout also fits the description he gave in his original post:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;physician caseId method timing
John           1 manual    1.0
John           2 manual    1.1
John           3 manual    1.2
John           1 machine   0.9
John           2 machine   1.3
John           3 machine   1.0&lt;/PRE&gt;
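&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the data really had this paired layout, the analysis would start from the per-case differences. Here is a rough Python sketch of that step, using the made-up rows above. The variable names are hypothetical, and the paired t statistic is only one possible choice; a signed-rank test on the same differences would be the non-parametric alternative.&lt;/P&gt;
&lt;PRE&gt;
```python
import math
from statistics import mean, stdev

# Illustrative rows in the paired layout: each caseId appears under both methods.
rows = [
    ("John", 1, "manual", 1.0),
    ("John", 2, "manual", 1.1),
    ("John", 3, "manual", 1.2),
    ("John", 1, "machine", 0.9),
    ("John", 2, "machine", 1.3),
    ("John", 3, "machine", 1.0),
]

# Collect the two timings for each (physician, caseId) pair.
by_case = {}
for physician, case_id, method, timing in rows:
    by_case.setdefault((physician, case_id), {})[method] = timing

# One difference per case: machine timing minus manual timing.
diffs = [pair["machine"] - pair["manual"] for pair in by_case.values()]

# Paired t statistic: mean difference divided by its standard error.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
print("differences:", [round(d, 3) for d in diffs])
print("paired t statistic:", round(t_stat, 4), "on", len(diffs) - 1, "df")
```
&lt;/PRE&gt;
&lt;P&gt;With the unpaired layout, where every caseId is seen under only one method, this grouping step has nothing to match, which is exactly why the pairing question matters.&lt;/P&gt;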
&lt;P&gt;&amp;nbsp; &lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 11:48:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/468019#M24340</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-06-06T11:48:15Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/469035#M24399</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;,&amp;nbsp;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you all for your insightful comments. They are a great help for someone relatively inexperienced at this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, the details that I should have included.&lt;/P&gt;
&lt;P&gt;1. All of the notes are being dictated by a single physician.&lt;/P&gt;
&lt;P&gt;2. They are not duplicates; he dictated notes for some patients using the first method, and some notes using the second method.&lt;/P&gt;
&lt;P&gt;3. We don't know who the transcriber is, and at this point have no way to find out.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Based on your comments, I believe that the observations are not paired. Apologies for leaving out the pertinent details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Although Reeza indicates that a t-test is probably fine, I can't be sure that the distributions of the two samples are the same. I think I'll stick with the Wilcoxon rank-sum.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you again for all of your help!&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;Tom&lt;/P&gt;</description>
      <pubDate>Sun, 10 Jun 2018 16:03:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/469035#M24399</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2018-06-10T16:03:49Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing two means, independent samples, non-normal distribution</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/469038#M24400</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15142"&gt;@TomKari&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Hi,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;,&amp;nbsp;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you all for your insightful comments. They are a great help for someone relatively inexperienced at this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, the details that I should have included.&lt;/P&gt;
&lt;P&gt;1. All of the notes are being dictated by a single physician.&lt;/P&gt;
&lt;P&gt;2. They are not duplicates; he dictated notes for some patients using the first method, and some notes using the second method.&lt;/P&gt;
&lt;P&gt;3. We don't know who the transcriber is, and at this point have no way to find out.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Based on your comments, I believe that the observations are not paired. Apologies for leaving out the pertinent details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Although Reeza indicates that a t-test is probably fine, I can't be sure that the distributions of the two samples are the same. I think I'll stick with the Wilcoxon rank-sum.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you again for all of your help!&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;Tom&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Yes, now I agree the data is not paired. In the future, if possible, setting this experiment up as paired would be a better statistical approach. To do this, the doctor would have to dictate the notes for each patient twice, once with each method.&lt;/P&gt;</description>
      <pubDate>Sun, 10 Jun 2018 16:32:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparing-two-means-independent-samples-non-normal-distribution/m-p/469038#M24400</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2018-06-10T16:32:15Z</dc:date>
    </item>
  </channel>
</rss>

