Hello,
I have a pre- and post-survey data set in which each surveyed individual ranked 10 items in order (1-10, from least to greatest). There are about 500 individuals in total, with varying categorical personal attributes such as sex and education level. What would be the most appropriate method to compare the pre- and post-survey orderings, determine which of the ranked items had statistically significant changes in their ordering, and determine the significance of influential personal attributes? An example of the data set could be as follows:
data surv;
length prepost $ 4;
input id $ sex $ educ $ prepost $ apple banana blueberry cherry grape melon
orange pineapple rasberry strawberry;
cards;
1 M HS pre 1 2 3 4 5 6 7 8 9 10
1 M HS post 1 3 2 4 5 6 10 8 9 7
2 F BS pre 1 4 3 2 5 6 7 8 9 10
2 F BS post 1 3 2 4 5 6 8 7 9 10
3 M BS pre 2 1 3 4 5 7 6 8 9 10
3 M BS post 1 3 2 4 5 6 10 7 9 8
4 F HS pre 1 2 3 4 7 6 5 8 9 10
4 F HS post 1 5 2 3 4 6 10 7 8 9
;
(Edited: updated the last two rows to correct the data for ID = 4.)
Two basic things I would start with.
First, run PROC FREQ crossing each of the variables with the prepost variable and request chi-square tests. Something like:
proc freq data=surv; tables prepost *(apple banana blueberry cherry grape melon orange pineapple rasberry strawberry) / chisq ; run;
That would tell you whether the pre and post distributions are the same. Other TABLES options to consider are DEVIATION, which shows the difference between the observed and expected frequency for each cell, and JT (the Jonckheere-Terpstra test). You might look at the PLOTS= option as well.
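As a rough sketch of those options used together, assuming the surv data set from the question (the choice of PLOTS=FREQPLOT is only one possibility, and it needs ODS Graphics turned on):

```sas
/* Sketch only: same crosstabs as above, with the extra TABLES options. */
ods graphics on;

proc freq data=surv;
  tables prepost*(apple banana blueberry cherry grape melon
                  orange pineapple rasberry strawberry)
         / chisq        /* chi-square tests of association          */
           expected     /* expected cell frequencies                */
           deviation    /* observed minus expected, per cell        */
           jt           /* Jonckheere-Terpstra test                 */
           plots=freqplot;
run;

ods graphics off;
```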
I might also look at PROC TTEST to see if the means move for any of the "fruit" variables, but your data will need some restructuring into a before/after layout. See the TTEST documentation for an example.
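One way the restructuring could look is sketched below. It assumes the surv data set from the question with exactly one 'pre' and one 'post' row per id; the data set names surv_sorted and paired and the _pre, _post and d_ variable names are made up for illustration. Because the responses are ranks, the sketch also requests the signed rank test that PROC UNIVARIATE reports for each post-minus-pre difference, which may be a more comfortable fit than a test on means.

```sas
/* Sketch only: one row per subject, with pre and post ranks side by side. */
proc sort data=surv out=surv_sorted;
  by id;
run;

data paired;
  merge surv_sorted(where=(prepost='pre')
          rename=(apple=apple_pre banana=banana_pre blueberry=blueberry_pre
                  cherry=cherry_pre grape=grape_pre melon=melon_pre
                  orange=orange_pre pineapple=pineapple_pre
                  rasberry=rasberry_pre strawberry=strawberry_pre))
        surv_sorted(where=(prepost='post')
          rename=(apple=apple_post banana=banana_post blueberry=blueberry_post
                  cherry=cherry_post grape=grape_post melon=melon_post
                  orange=orange_post pineapple=pineapple_post
                  rasberry=rasberry_post strawberry=strawberry_post));
  by id;

  array pre_r{10}  apple_pre banana_pre blueberry_pre cherry_pre grape_pre
                   melon_pre orange_pre pineapple_pre rasberry_pre
                   strawberry_pre;
  array post_r{10} apple_post banana_post blueberry_post cherry_post grape_post
                   melon_post orange_post pineapple_post rasberry_post
                   strawberry_post;
  array chg{10}    d_apple d_banana d_blueberry d_cherry d_grape
                   d_melon d_orange d_pineapple d_rasberry d_strawberry;

  /* Change in rank for each item: post minus pre */
  do i = 1 to 10;
    chg{i} = post_r{i} - pre_r{i};
  end;
  drop i prepost;
run;

/* Paired t-tests, one per item */
proc ttest data=paired;
  paired apple_pre*apple_post           banana_pre*banana_post
         blueberry_pre*blueberry_post   cherry_pre*cherry_post
         grape_pre*grape_post           melon_pre*melon_post
         orange_pre*orange_post         pineapple_pre*pineapple_post
         rasberry_pre*rasberry_post     strawberry_pre*strawberry_post;
run;

/* Signed rank test on each difference (Tests for Location table) */
ods select TestsForLocation;
proc univariate data=paired;
  var d_:;
run;
```

The paired data set keeps sex and educ on each row, so the d_ variables can later be compared across those groups.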
What is the meaning of ID? It isn't unique to a pair of pre-post observations?
Thank you for your response/question. I hope the following clarifies my request.
ID represents a unique study subject. All subjects have pre- and post- data, that is, the survey was administered (pre) to each subject, there was an intervention, the survey was re-administered (post) to each subject.
basic questions:
1) overall, did the intervention have an impact, were the rankings affected?
2) where/what particular items had the greatest/most significant change?
3) are there influential personal variables that affect the ranking (pre, post, and/or the change)?
Thanks again