lboyd
Calcite | Level 5

I combined two datasets, a pre and a post, by stacking them. Because the post sample is much smaller than the pre sample, I am treating the data as two independent samples and want to run a t-test:

 

data prepostindep;
set pre  post;
run;

 

I need to test the difference between the mean of a scale that was created by:

meanscale1pre  = mean(of q5 q6 q7a q8 q9);       /* data that came from the pre dataset */
meanscale1post = mean(of q5p q6p q7ap q8p q9p);  /* data that came from the post dataset */

 

Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.

I tried running proc genmod but got this error:

ERROR: No valid observations due to invalid or missing values in the response, explanatory, offset, frequency, or weight variable.

 

This is possibly because both the pre and the post rows have missing values: post only has the variables that end in p (shown above), while pre only has the ones without it. In addition, the sample size for post is about 50, whereas it is 400 for pre. That may change as we receive more posts in the future.
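
One way to check whether that is the cause is to count missing values per source after stacking. This is only a sketch: the `check` dataset and the `source` flag are names made up for illustration, and it assumes the scale means are computed after the `set` statement.

```sas
/* Stack pre and post, flag where each row came from, and compute both scale means */
data check;
   length source $ 4;
   set pre (in=inpre) post;
   if inpre then source = 'Pre';
   else source = 'Post';
   meanscale1pre  = mean(of q5 q6 q7a q8 q9);
   meanscale1post = mean(of q5p q6p q7ap q8p q9p);
run;

/* Count non-missing (N) and missing (NMISS) values of each scale within each source */
proc means data=check n nmiss;
   class source;
   var meanscale1pre meanscale1post;
run;
```

If every Pre row has meanscale1post missing and every Post row has meanscale1pre missing, then no row has both values, and a model that uses both variables would have no complete observations to work with.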

 

What's the best way to analyze this?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@lboyd wrote:

I want to run an independent t-test using proc ttest, but the data are set up in a way that does not allow it, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post uses the same questions stored in differently named variables. A two-sample proc ttest needs a class statement, but I have no class variable; instead I have two variables, scale1pre and scale1post. It is simple to run this as a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.


If you want to run a two-sample TTest, your data have to conform to its rules: ONE analysis variable, not 5, plus a classification variable that identifies the group. You may have to reshape the data by building that single variable and adding the classification variable. Here is an example that takes the data from the paired test in the online TTEST documentation and modifies it for your test:

data pressure;
   input SBPbefore SBPafter @@;
   datalines;
120 128   124 131   130 131   118 127
140 132   128 125   140 141   135 137
126 118   130 132   126 129   127 135
;
run;

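data forttest;
   set pressure;
   length Period $ 6;   /* long enough for both labels */
   /* write two rows per input row: one per period, same analysis variable */
   Period = 'Before'; SBP = SBPbefore; output;
   Period = 'After';  SBP = SBPafter;  output;
   keep Period SBP;
run;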

proc ttest data=forttest;
   class period;
   var SBP;
run;
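
Applied to the original question, the same reshape might look like the sketch below. It assumes the `pre` and `post` datasets and the question variables named in the first post; `group` and `meanscale1` are names chosen here for illustration, not anything from the poster's data.

```sas
/* Stack pre and post into one long dataset with a single scale variable */
data prepostindep;
   length group $ 4;
   set pre (in=inpre) post;
   if inpre then do;
      group = 'Pre';
      meanscale1 = mean(of q5 q6 q7a q8 q9);       /* pre items */
   end;
   else do;
      group = 'Post';
      meanscale1 = mean(of q5p q6p q7ap q8p q9p);  /* post items */
   end;
   keep group meanscale1;
run;

/* Two-sample t-test: one analysis variable, one class variable with two levels */
proc ttest data=prepostindep;
   class group;
   var meanscale1;
run;
```

With group sizes as different as roughly 400 and 50, it may be worth reading the Satterthwaite (unequal variances) row of the output as well as the pooled one. PROC TTEST prints both, along with a test for equality of variances.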


6 REPLIES
PaigeMiller
Diamond | Level 26

Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.

Given the stated sample sizes, the Central Limit Theorem does apply and you could use PROC TTEST. I would do a paired analysis on the 50 samples where you have both pre and post.

 

 

--
Paige Miller
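
For reference, a paired analysis along those lines could be sketched as follows. It assumes that `pre` and `post` share a respondent identifier, called `id` here; that variable is hypothetical and is not mentioned anywhere in the thread. The `_sorted` and `paired` dataset names are also made up.

```sas
/* Sort copies of both datasets by the (hypothetical) respondent key */
proc sort data=pre  out=pre_sorted;  by id; run;
proc sort data=post out=post_sorted; by id; run;

/* Keep only respondents present in both, with both scale means on one row */
data paired;
   merge pre_sorted (in=inpre) post_sorted (in=inpost);
   by id;
   if inpre and inpost;
   meanscale1pre  = mean(of q5 q6 q7a q8 q9);
   meanscale1post = mean(of q5p q6p q7ap q8p q9p);
run;

/* Paired t-test on the within-respondent difference */
proc ttest data=paired;
   paired meanscale1pre*meanscale1post;
run;
```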
lboyd
Calcite | Level 5

I am also running a paired analysis, but I also want to run it as two independent samples. What would I run, or how would I transform the data?

PaigeMiller
Diamond | Level 26

I don't understand the question

--
Paige Miller
lboyd
Calcite | Level 5

I want to run an independent t-test using proc ttest, but the data are set up in a way that does not allow it, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post uses the same questions stored in differently named variables. A two-sample proc ttest needs a class statement, but I have no class variable; instead I have two variables, scale1pre and scale1post. It is simple to run this as a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.

lboyd
Calcite | Level 5
This is exactly what I was looking for. Thank you!
