Solved: Testing differences in means of scale data

lboyd · Posted 11-09-2017 10:47 AM

I merged two datasets, a pre and post, but due to a small post size compared to the pre, I am treating the data as two independent samples and am aiming to run a ttest:

data prepostindep;
set pre post;
run;

I need to test the difference between the mean of a scale that was created by:

meanscale1pre=mean(of q5 q6 q7a q8 q9);       /* data that came from pre dataset */
meanscale1post=mean(of q5p q6p q7ap q8p q9p); /* data that came from post dataset */

Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.

I tried running proc genmod but got this error:

ERROR: No valid observations due to invalid or missing values in the response, explanatory, offset, frequency, or weight variable.

This is possibly because there are missing values for both pre and post: post only has the variables that end in p (shown above), while pre only has the ones that do not. In addition, the sample size for post is about 50, whereas it is 400 for pre. That may change as we receive more posts in the future.

What's the best way to analyze this?

PaigeMiller · Posted 11-09-2017 11:44 AM

@lboyd wrote:

Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.

Given the stated sample sizes, the Central Limit Theorem does apply and you could use PROC TTEST. I would do a paired analysis on the 50 samples where you have both pre and post.
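A minimal sketch of that paired analysis, assuming both datasets carry a respondent identifier (here called `id`, which is hypothetical and not shown in the original post) so that pre and post rows can be matched:

```sas
/* Assumption: both datasets contain a matching key named id (hypothetical). */
proc sort data=pre;  by id; run;
proc sort data=post; by id; run;

data paired;
   merge pre (in=inpre) post (in=inpost);
   by id;
   if inpre and inpost;   /* keep only respondents present in both datasets */
   meanscale1pre  = mean(of q5 q6 q7a q8 q9);
   meanscale1post = mean(of q5p q6p q7ap q8p q9p);
run;

proc ttest data=paired;
   paired meanscale1pre*meanscale1post;
run;
```

The subsetting IF restricts the test to the roughly 50 respondents who have both measurements.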

--
Paige Miller

lboyd · Posted 11-09-2017 11:48 AM

I am also running a paired analysis, but I also want to run it as two independent samples. What would I run, or how would I transform the data?

PaigeMiller · Posted 11-09-2017 11:53 AM

I don't understand the question

--
Paige Miller

lboyd · Posted 11-09-2017 12:03 PM

I want to run an independent t-test using proc ttest; however, the data is set up in a way that will not allow that, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post has those same questions but from differently named variables. To conduct a proper proc ttest, I need a class statement, but here I have none; instead I have two variables, scale1pre and scale1post. It is simple to run this data using a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.

ballardw · Posted 11-09-2017 12:30 PM

@lboyd wrote:

I want to run an independent t-test using proc ttest; however, the data is set up in a way that will not allow that, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post has those same questions but from differently named variables. To conduct a proper proc ttest, I need a class statement, but here I have none; instead I have two variables, scale1pre and scale1post. It is simple to run this data using a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.

If you want to run a t-test, your data has to conform to the procedure's rules: ONE analysis variable, not 5. You may have to reshape the data so that it has a single response variable plus a classification variable. Here is an example that takes the paired-test data from the online PROC TTEST documentation and reshapes it for your test:

data pressure;
   input SBPbefore SBPafter @@;
   datalines;
120 128   124 131   130 131   118 127
140 132   128 125   140 141   135 137
126 118   130 132   126 129   127 135
;
run;

data forttest;
   set pressure;
   /* write two rows per input row: one per period, sharing one response variable */
   Period= 'Before' ; SBP =  SBPbefore; output;
   Period= 'After'  ; SBP =  SBPafter; output;
   keep Period SBP;
run;

proc ttest data=forttest;
   class period;
   var SBP;
run;
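
Applied to the pre and post datasets from the original question, the same idea might look like the sketch below. It assumes the two datasets are stacked with SET as in the first post; the names `period` and `scale1` are made up for illustration. The IN= dataset option flags which dataset each row came from, so a single response variable can be built from whichever set of questions is populated:

```sas
data prepostindep;
   set pre (in=inpre) post (in=inpost);
   length period $4;                 /* illustrative classification variable */
   if inpre then do;
      period = 'Pre';
      scale1 = mean(of q5 q6 q7a q8 q9);
   end;
   else do;
      period = 'Post';
      scale1 = mean(of q5p q6p q7ap q8p q9p);
   end;
   keep period scale1;
run;

proc ttest data=prepostindep;
   class period;
   var scale1;
run;
```

With groups of about 400 and 50, it may be worth reading the Satterthwaite row of the output as well as the Pooled row, since PROC TTEST reports both for a two-sample test.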

lboyd · Posted 11-09-2017 12:38 PM

This is exactly what I was looking for. Thank you!
