Help with T-test code


08-08-2013 10:52 PM

I want to test for a significant difference between the values a) highlighted in red and blue, and b) highlighted in blue and green.

1) For instance, I would like to test the difference between Drug1-yes Drug2-no (9) and Drug1-no Drug2-yes (13).

2) Also, the difference between Drug1-yes Drug2-no (9) and Drug1-no Drug2-no (8).

Category    Drug1  Drug2  Average length of stay  Discharges
Hemoglobin  yes    no     9                       60
            no     yes    13                      40
            no     no     8                       80

Could anyone suggest which t-test should be used? If possible, please provide the code.

Thanks,


All Replies


Posted in reply to robertrao

08-09-2013 04:48 AM

Hi,

Here is my attempt. As I understand it, this is a case of comparing two proportions of a categorical variable (chi-square test), so all you need to do is reshape the data before using it to make inferences. The following is a possible solution for your first question only. If it meets your requirement, we can try the second question.

/* Drug1(Yes) VS Drug2(No) */
data drug1_yes_drug2_no(drop = drug1 drug2);
  retain drug average;
  set have;
  if drug1 = "yes" and drug2 = "no";
  array temp(1) $ drug;
  temp(1) = drug1;
  output;
  temp(1) = drug2;
  output;
run;

proc freq data = drug1_yes_drug2_no;
  tables drug / chisq;
run;

/* Drug1(No) VS Drug2(Yes) */
data drug1_no_drug2_yes(drop = drug1 drug2);
  retain drug average;
  set have;
  if drug1 = "no" and drug2 = "yes";
  array temp(1) $ drug;
  temp(1) = drug1;
  output;
  temp(1) = drug2;
  output;
run;

proc freq data = drug1_no_drug2_yes;
  tables drug / chisq;
run;

-Urvish


Posted in reply to UrvishShah

08-09-2013 07:36 AM

A t-test is not appropriate for categorical data. Urvish's approach of using PROC FREQ is much more applicable to this data set.

Steve Denham


Posted in reply to SteveDenham

08-09-2013 11:02 AM

Hi,

If you want to make inferences about Drug1 yes/Drug2 no vs. Drug1 no/Drug2 yes, then PROC TTEST is suitable.

Please have a look at the following modified code.

I have appended the two categories along with their averages and then run a two-sample t-test between them. I hope it meets the requirement.

data have;
  input drug1 $ drug2 $ average;
cards4;
yes no 9
no yes 13
. no 8
;;;;

/* Drug1(Yes) VS Drug2(No) */
data drug1_yes_drug2_no(drop = drug1 drug2);
  retain drug average;
  set have;
  if drug1 = "yes" and drug2 = "no";
  array temp(1) $ drug;
  temp(1) = drug1;
  output;
  temp(1) = drug2;
  output;
run;

/* Drug1(No) VS Drug2(Yes) */
data drug1_no_drug2_yes(drop = drug1 drug2);
  retain drug average;
  set have;
  if drug1 = "no" and drug2 = "yes";
  array temp(1) $ drug;
  temp(1) = drug1;
  output;
  temp(1) = drug2;
  output;
run;

data want;
  set drug1_yes_drug2_no drug1_no_drug2_yes;
run;

proc ttest data = want;
  class drug;
  var average;
run;

-Urvish


Posted in reply to UrvishShah

08-09-2013 11:15 AM

Hi,

Thanks for your time.

As I was indicating, the average is not my analysis variable. The analysis values are, for example, length of stay (LOS): since there are 60 discharges in the drug1 = "yes" and drug2 = "no" category, I will have 60 LOS values.

For drug1_yes_drug2_no vs. Drug1(No)/Drug2(Yes), I create a variable drug1-drug2 and use this variable in the CLASS statement of TTEST:

if drug1 is yes and drug2 is no then drug1-drug2=drug1-y-drug2-no;
if drug1 is no and drug2 is yes then drug1-drug2=drug1-no-drug2-yes;
run;

proc ttest data=have;
class drug1-drug2;
var Analysis-var;
run;

Will this work?

Similarly, I will do the same for the other comparison.

Also, I am not clear on why arrays are used here. Could you please explain?

Thanks

Solution


Posted in reply to robertrao

08-09-2013 12:28 PM

Some cleanup and it should be fine. Hyphens are to be avoided in variable names, especially where the name joins variables with similar alphabetic parts and differing numeric parts (in a variable list, SAS reads drug1-drug2 as the range drug1 through drug2, not as one name). So:

data have;
  set raw;
  if drug1='yes' and drug2='no' then drug1_drug2=1;
  if drug1='no' and drug2='yes' then drug1_drug2=2;
run;

proc ttest data=have;
  class drug1_drug2;
  var <analysis var>; /*whatever you are analyzing*/
run;

This will give results. Whether they are meaningful is another question. Length of stay is almost certainly skewed, so the t-test will not have optimal properties.

Steve Denham
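Given the remark above that length of stay is likely skewed, one common alternative is a rank-based comparison such as the Wilcoxon rank-sum test. The following is only a sketch, not something proposed in the thread: it assumes the dataset have and the class variable drug1_drug2 from the post above, and it uses a hypothetical numeric variable named los in place of the unspecified analysis variable.

```sas
/* Sketch only: los is a hypothetical name for the per-discharge
   length-of-stay variable; replace it with your analysis variable. */
proc npar1way data=have wilcoxon;
  class drug1_drug2;   /* 1 = Drug1 yes/Drug2 no, 2 = Drug1 no/Drug2 yes */
  var los;
run;
```

Rows where drug1_drug2 is missing (any other drug combination) should be left out of the comparison, as with PROC TTEST.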


Posted in reply to robertrao

08-11-2013 05:39 AM

Well, there are different ways to manipulate the data, so I thought an array would help to get the desired output data for the t-test.

Your code is also correct. From your very first post I was not clear about what you wanted to compare, so I coded it in a way that splits the data in two.

Anyway, you can also modify your code above so that the user can pass any analysis variable:

%macro ttest(analysis_var);
  proc ttest data = have;
    class drug1_drug2;
    var &analysis_var.;
  run;
%mend;

%ttest(discharge);

-Urvish


Posted in reply to UrvishShah

08-09-2013 09:37 AM

Hi Urvish,

The table I posted above is a consolidated summary report.

I want to run the t-test on the actual data, not on the averages as you mentioned.

Please help me understand this correctly.

Thanks
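The thread shows code only for the first comparison. A minimal sketch covering both comparisons on row-level data might look like the following. It assumes a dataset raw with one row per discharge and character variables drug1 and drug2, as in the accepted solution; the names have3, grp, and los are hypothetical and introduced here only for illustration.

```sas
/* Sketch only: have3, grp, and los are hypothetical names. */
data have3;
  set raw;
  length grp $8;
  if      drug1 = 'yes' and drug2 = 'no'  then grp = 'yes_no';
  else if drug1 = 'no'  and drug2 = 'yes' then grp = 'no_yes';
  else if drug1 = 'no'  and drug2 = 'no'  then grp = 'no_no';
run;

/* Comparison 1: Drug1 yes/Drug2 no vs. Drug1 no/Drug2 yes */
proc ttest data=have3;
  where grp in ('yes_no', 'no_yes');
  class grp;
  var los;
run;

/* Comparison 2: Drug1 yes/Drug2 no vs. Drug1 no/Drug2 no */
proc ttest data=have3;
  where grp in ('yes_no', 'no_no');
  class grp;
  var los;
run;
```

The WHERE statement restricts each run to two groups, since the CLASS variable in PROC TTEST must have exactly two levels.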