I combined two datasets, a pre and a post, but because the post sample is small compared to the pre, I am treating the data as two independent samples and want to run a t-test:
data prepostindep;
set pre post;
run;
I need to test the difference between the mean of a scale that was created by:
meanscale1pre=mean(of q5 q6 q7a q8 q9); *data that came from pre dataset;
meanscale1post=mean(of q5p q6p q7ap q8p q9p); *data that came from post dataset;
Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.
I tried running proc genmod but got this error:
ERROR: No valid observations due to invalid or missing values in the response, explanatory, offset, frequency, or weight variable.
This is possibly because every row has missing values: the post rows only have the variables that end in p (shown above), while the pre rows only have the ones that don't. In addition, the sample size for post is about 50, whereas it's 400 for the pre. That may change as we receive more posts in the future.
What's the best way to analyze this?
Because it is the mean of scale data that I am testing between the two groups, the proc ttest statement doesn't work.
Given the stated sample sizes, the Central Limit Theorem does apply and you could use PROC TTEST. I would do a paired analysis on the 50 samples where you have both pre and post.
I am also running a paired analysis, but I also want to run it as two independent samples, what would I run or how would I transform the data?
I don't understand the question
I want to run an independent t-test using proc ttest, but the data is set up in a way that will not allow it, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post uses the same questions under different variable names. A proper independent-samples proc ttest needs a class statement, but I have no classification variable, only the two variables scale1pre and scale1post. It is simple to run this data as a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.
@lboyd wrote:
I want to run an independent t-test using proc ttest, but the data is set up in a way that will not allow it, since I am testing the difference between the means of two scales (as described above). Scale1pre is composed of 5 questions, while scale1post uses the same questions under different variable names. A proper independent-samples proc ttest needs a class statement, but I have no classification variable, only the two variables scale1pre and scale1post. It is simple to run this data as a paired t-test, but I am wondering how to treat the data as two independent samples. I hope I am describing this clearly; let me know if I am not.
If you want to run a t-test, your data has to conform to the procedure's rules: ONE analysis variable, not 5. You may have to reshape the data into a single analysis variable plus a classification variable. Here is an example that modifies the paired-test data from the online PROC TTEST documentation to fit your test:
data pressure;
   input SBPbefore SBPafter @@;
   datalines;
120 128   124 131   130 131   118 127
140 132   128 125   140 141   135 137
126 118   130 132   126 129   127 135
;
run;

data forttest;
   set pressure;
   Period = 'Before';
   SBP = SBPbefore;
   output;
   Period = 'After';
   SBP = SBPafter;
   output;
   keep Period SBP;
run;

proc ttest data=forttest;
   class Period;
   var SBP;
run;
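The same reshaping idea can be sketched for the pre/post data in the question. This is only a sketch under assumptions: the dataset names pre and post and the q5 ... q9p item names come from the original post, while period, meanscale1, inpre, and inpost are names made up here for illustration. The IN= data set option flags which input dataset each row came from, so a single scale variable and a classification variable can be built in the same step that stacks the data.

```sas
/* Sketch only: period, meanscale1, inpre and inpost are illustrative names. */
data prepostindep;
   set pre(in=inpre) post(in=inpost);   /* stack the two datasets */
   length period $4;
   if inpre then do;
      period = 'Pre';
      meanscale1 = mean(of q5 q6 q7a q8 q9);       /* items from the pre dataset */
   end;
   else if inpost then do;
      period = 'Post';
      meanscale1 = mean(of q5p q6p q7ap q8p q9p);  /* items from the post dataset */
   end;
   keep period meanscale1;
run;

/* Two-sample t-test: one analysis variable, one classification variable */
proc ttest data=prepostindep;
   class period;
   var meanscale1;
run;
```

With group sizes as unequal as roughly 400 and 50, the Satterthwaite row of the PROC TTEST output (the unequal-variance version) may be the safer one to read. If the post respondents are also in the pre data, the two groups are not truly independent, so treat this result with caution.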