robertrao
Quartz | Level 8

I want to test for significant differences between the values (a) highlighted in red and blue, and (b) highlighted in blue and green.

1) For instance, I would like to test for a significant difference between Drug1-yes Drug2-no (9) and Drug1-no Drug2-yes (13).

2) I would also like to test for a significant difference between Drug1-yes Drug2-no (9) and Drug1-no Drug2-no (8).

Category     Drug1   Drug2   Average length of stay   Discharges
Hemoglobin   yes     no      9                        60
             no      yes     13                       4
                     no      8                        80

Could anyone suggest which t-test should be used? If possible, please provide the code.

Thanks,

1 ACCEPTED SOLUTION

Accepted Solutions
SteveDenham
Jade | Level 19

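Some cleanup and it should be fine. Hyphens are to be avoided in variable names, especially where the name joins variables with similar alphabetic parts and differing numeric parts: SAS reads drug1-drug2 as a numbered range of variables in a variable list, or as a subtraction in an expression. So:

data have;
   set raw;
   /* code the two groups to compare; all other rows stay missing and are left out by PROC TTEST */
   if drug1='yes' and drug2='no' then drug1_drug2=1;
   if drug1='no' and drug2='yes' then drug1_drug2=2;
run;

proc ttest data=have;
   class drug1_drug2;
   var <analysis var>; /*whatever you are analyzing*/
run;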

This will give results.  Whether they are meaningful is another question.  Length of stay is almost certainly skewed, so the t-test will not have optimal properties.
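
Since skewness is the concern here, one commonly used alternative is a rank-based test. The following is only a sketch under stated assumptions: the data set `raw` is simulated placeholder data (not the poster's data), `los` is a hypothetical name for the patient-level length-of-stay variable, and the group sizes merely echo the 60 and 4 discharges in the summary table. It runs the Wilcoxon rank-sum test with PROC NPAR1WAY on the same `drug1_drug2` class variable.

```sas
/* Simulated placeholder data, for illustration only: one row per discharge. */
/* The variable name los and all values are assumptions, not real data.      */
data raw;
   call streaminit(2024);
   length drug1 drug2 $3;
   drug1 = 'yes'; drug2 = 'no';
   do i = 1 to 60;
      los = ceil(exp(rand('normal', 2.0, 0.6)));   /* right-skewed stay, in days */
      output;
   end;
   drug1 = 'no'; drug2 = 'yes';
   do i = 1 to 4;
      los = ceil(exp(rand('normal', 2.4, 0.6)));
      output;
   end;
   drop i;
run;

/* Same grouping variable as in the code above */
data have;
   set raw;
   if drug1 = 'yes' and drug2 = 'no' then drug1_drug2 = 1;
   else if drug1 = 'no' and drug2 = 'yes' then drug1_drug2 = 2;
run;

/* Wilcoxon rank-sum test: compares the two groups without assuming normality */
proc npar1way data=have wilcoxon;
   class drug1_drug2;
   var los;
run;
```

With only 4 discharges in one group, either test will have little power, so the result should be read with caution.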

Steve Denham


7 REPLIES
UrvishShah
Fluorite | Level 6

Hi,

Here is my try. Based on my understanding, this is a case of comparing two proportions of a categorical variable (chi-square test), so all you need to do is manipulate the data before using it to make inferences. The following is a possible solution for only the first of the questions you posted. If it meets your requirement, we can then try the second question.

/* Drug1(Yes) VS Drug2(No) */
data drug1_yes_drug2_no(drop = drug1 drug2);
   retain drug average;
   set have;
   if drug1 = "yes" and drug2 = "no";
   array temp(1) $ drug;
   temp(1) = drug1;
   output;
   temp(1) = drug2;
   output;
run;

proc freq data = drug1_yes_drug2_no;
   tables drug / chisq;
run;

/* Drug1(No) VS Drug2(Yes) */
data drug1_no_drug2_yes(drop = drug1 drug2);
   retain drug average;
   set have;
   if drug1 = "no" and drug2 = "yes";
   array temp(1) $ drug;
   temp(1) = drug1;
   output;
   temp(1) = drug2;
   output;
run;

proc freq data = drug1_no_drug2_yes;
   tables drug / chisq;
run;

-Urvish

SteveDenham
Jade | Level 19

A t-test is not appropriate for categorical data. Urvish's approach of using PROC FREQ is much more applicable to this data set.

Steve Denham

UrvishShah
Fluorite | Level 6

Hi,

If you want to make the inference between Drug1-yes Drug2-no and Drug1-no Drug2-yes, then PROC TTEST is suitable.

Please have a look at the following modified code.

I have appended the two categories along with their averages and then run a two-sample t-test between them. I hope it meets the requirement.

data have;
   input drug1 $ drug2 $ average;
   cards4;
yes no 9
no yes 13
. no 8
;;;;

/* Drug1(Yes) VS Drug2(No) */
data drug1_yes_drug2_no(drop = drug1 drug2);
   retain drug average;
   set have;
   if drug1 = "yes" and drug2 = "no";
   array temp(1) $ drug;
   temp(1) = drug1;
   output;
   temp(1) = drug2;
   output;
run;

/* Drug1(No) VS Drug2(Yes) */
data drug1_no_drug2_yes(drop = drug1 drug2);
   retain drug average;
   set have;
   if drug1 = "no" and drug2 = "yes";
   array temp(1) $ drug;
   temp(1) = drug1;
   output;
   temp(1) = drug2;
   output;
run;

data want;
   set drug1_yes_drug2_no drug1_no_drug2_yes;
run;

proc ttest data = want;
   class drug;
   var average;
run;

-Urvish

robertrao
Quartz | Level 8

Hi,

Thanks for your time.

As I was indicating, the average is not my analysis variable. The analysis values are, for example, length of stay (LOS): since there are 60 discharges in the drug1 = "yes" and drug2 = "no" category, I will have 60 LOS values.

For drug1_yes_drug2_no vs. drug1_no_drug2_yes, I create a variable drug1-drug2 and use it in the CLASS statement of PROC TTEST:

if drug1 is yes and drug2 is no then drug1-drug2=drug1-y-drug2-no;
if drug1 is no and drug2 is yes then drug1-drug2=drug1-no-drug2-yes;
run;

proc ttest data=have;
class drug1-drug2;
var Analysis-var;
run;

Will this work?

I would do the same for the other comparison.

Also, I am not clear on why arrays are used here. Could you please explain?

Thanks

UrvishShah
Fluorite | Level 6

Well, there are different ways to manipulate the data, so I thought an ARRAY would help to get the desired output data to be used for the t-test.

Your code is also correct. Based on your very first post, I was not that clear about what you wanted to compare, so I coded it in a way that splits the data in two.

Anyway, you can also modify your code above so that the user can pass any analysis variable:

/* macro names cannot contain hyphens, and the name must match the %ttest call below */
%macro ttest(analysis_var);
   proc ttest data = have;
      class drug1_drug2;
      var &analysis_var.;
   run;
%mend ttest;

%ttest(discharge);

-Urvish

robertrao
Quartz | Level 8

Hi Urvish,

The table I posted above is a consolidated summary report.

I want to run the t-test on the actual data, not on the averages as you mentioned.

Please help me understand this correctly.

Thanks
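
For readers with patient-level data like that described in this post, both comparisons from the original question could be run roughly as follows. This is a minimal sketch, not a definitive implementation: `raw` is simulated placeholder data, `los` and `combo` are hypothetical variable names, and the group sizes (60, 4, 80) simply mirror the discharge counts in the summary table. A WHERE statement limits each PROC TTEST call to the two groups being compared, since the CLASS variable must have exactly two levels.

```sas
/* Simulated placeholder data, for illustration only: one row per discharge. */
/* All values and the names los and combo are assumptions, not real data.    */
data raw;
   call streaminit(42);
   length drug1 drug2 $3;
   do grp = 1 to 3;
      if grp = 1 then do; drug1 = 'yes'; drug2 = 'no';  n = 60; end;
      else if grp = 2 then do; drug1 = 'no'; drug2 = 'yes'; n = 4; end;
      else do; drug1 = 'no'; drug2 = 'no'; n = 80; end;
      do i = 1 to n;
         los = ceil(exp(rand('normal', 2, 0.5)));   /* right-skewed stay, in days */
         output;
      end;
   end;
   drop grp n i;
run;

/* One label per drug combination */
data have;
   set raw;
   length combo $8;
   if      drug1 = 'yes' and drug2 = 'no'  then combo = 'yes_no';
   else if drug1 = 'no'  and drug2 = 'yes' then combo = 'no_yes';
   else if drug1 = 'no'  and drug2 = 'no'  then combo = 'no_no';
run;

/* Comparison 1: Drug1-yes Drug2-no vs. Drug1-no Drug2-yes */
proc ttest data=have;
   where combo in ('yes_no', 'no_yes');
   class combo;
   var los;
run;

/* Comparison 2: Drug1-yes Drug2-no vs. Drug1-no Drug2-no */
proc ttest data=have;
   where combo in ('yes_no', 'no_no');
   class combo;
   var los;
run;
```

Because both comparisons reuse the Drug1-yes Drug2-no group, the two p-values are not independent, and a multiplicity adjustment may be worth considering.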
