Hope this is the right location!
I have some data I'm trying to run t-tests on, but I have to get creative and back into the data to get the SD. I have the frequency distribution and N sizes, but not individual-level data. What I've done so far (in Excel, since I don't know how in SAS) is multiply the N size by the percentage that answered each question with a 1, 2, 3, 4, or 5 to determine how many people in each group responded a certain way to each question (Sheet 1), and then transposed it to the format I *THINK* SAS will want it in (Sheet 2). It looks something like this:
Question_Answer | Response | Group_1 | Group_2 | Group_3 |
Q0A1 | 1 | 50 | 200 | 50 |
Q0A2 | 2 | 100 | 0 | 25 |
Q0A3 | 3 | 25 | 25 | 50 |
Q0A4 | 4 | 75 | 25 | 25 |
Q0A5 | 5 | 50 | 25 | 50 |
Q1A1 | 1 | 0 | 50 | 25 |
Q1A2 | 2 | 100 | 50 | 100 |
Q1A3 | 3 | 100 | 75 | 0 |
Q1A4 | 4 | 25 | 75 | 25 |
Q1A5 | 5 | 75 | 25 | 50 |
From here I need to get the standard deviation of each question for each group, and I am at a total loss as to how to do that. I'm not sure if I should keep the Question column parsed out the way it is (Question 1 Answer 1, Question 1 Answer 2) or just repeat Question1 for five rows and let the Response column do the heavy lifting.
Any help would be greatly appreciated!
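For reference, a standard deviation can be computed from a frequency table without individual-level data. The sketch below assumes the 1-5 responses are treated as numeric scores (an assumption, and one that the replies further down question). The symbols x_k (response value), f_k (count for that response), and n (group size) are introduced here for illustration; the worked numbers use the Group_1 counts for Q0 from the table above.

```latex
n = \sum_{k=1}^{5} f_k, \qquad
\bar{x} = \frac{1}{n} \sum_{k=1}^{5} f_k \, x_k, \qquad
s = \sqrt{\frac{1}{n-1} \sum_{k=1}^{5} f_k \,(x_k - \bar{x})^2}

% Q0, Group_1: f = (50, 100, 25, 75, 50)
n = 300, \qquad
\bar{x} = \frac{875}{300} \approx 2.917, \qquad
s = \sqrt{\frac{572.917}{299}} \approx 1.384
```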
Is there a reason GROUP3 for Q0A1-5 doesn't add up to 300 like Group1/2? How should they add up here?
Standard deviation is usually used for continuous measurements, not categorical data. Can you explain a little more what you're looking to achieve here?
Yes!
Group 1 is 300 people, Group 2 is 275, Group 3 is 200.
So what I essentially want to do is determine whether the difference between the groups' responses to each question is statistically significant. I hope to accomplish this by running the following independent t-tests:
Group 1 vs Group 2
Group 2 vs Group 3
Group 3 vs Group 1
for each of the questions.
That, of course, requires the standard deviation.
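If the responses are treated as numeric scores, one way to get the per-group standard deviations in SAS is to let a FREQ statement expand the counts. This is a minimal sketch, not a definitive solution: the data set names `counts` and `long` and the variables `Group` and `Count` are made up for illustration, and the layout repeats the question on five rows rather than using Q0A1-style labels.

```sas
/* Counts copied from the table in the question; Question repeats on five rows */
data counts;
  input Question $ Response Group_1 Group_2 Group_3;
  datalines;
Q0 1 50 200 50
Q0 2 100 0 25
Q0 3 25 25 50
Q0 4 75 25 25
Q0 5 50 25 50
Q1 1 0 50 25
Q1 2 100 50 100
Q1 3 100 75 0
Q1 4 25 75 25
Q1 5 75 25 50
;
run;

/* Reshape to one row per question, response, and group (hypothetical names) */
data long;
  set counts;
  array g{3} Group_1-Group_3;
  do Group = 1 to 3;
    Count = g{Group};
    output;
  end;
  keep Question Response Group Count;
run;

proc sort data=long;
  by Question Group;
run;

/* FREQ treats each row as Count identical observations; zero counts are skipped */
proc means data=long n mean std;
  by Question Group;
  var Response;
  freq Count;
run;

/* One pairwise comparison as an example: Group 1 vs Group 2, per question */
proc ttest data=long;
  where Group in (1, 2);
  by Question;
  class Group;
  var Response;
  freq Count;
run;
```

Changing the WHERE clause to `Group in (2, 3)` or `Group in (1, 3)` gives the other two comparisons. With the default divisor of n-1, the Q0 standard deviation for Group 1 should come out at about 1.384.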
@maldito wrote:
Which, of course, requires the standard deviation to do.
I would have expected a chi-square test, since this is count/frequency data, rather than t-tests using the standard deviation. You can look at PROC FREQ for that.
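A chi-square test along those lines can be run directly on the counts with a WEIGHT statement. The following is only a sketch under assumptions: the data set names `counts` and `long` and the variables `Group` and `Count` are hypothetical, and the counts are the ones posted in the question.

```sas
/* Counts copied from the table in the question */
data counts;
  input Question $ Response Group_1 Group_2 Group_3;
  datalines;
Q0 1 50 200 50
Q0 2 100 0 25
Q0 3 25 25 50
Q0 4 75 25 25
Q0 5 50 25 50
Q1 1 0 50 25
Q1 2 100 50 100
Q1 3 100 75 0
Q1 4 25 75 25
Q1 5 75 25 50
;
run;

/* One row per question, response, and group, with the cell count in Count */
data long;
  set counts;
  array g{3} Group_1-Group_3;
  do Group = 1 to 3;
    Count = g{Group};
    output;
  end;
  keep Question Response Group Count;
run;

proc sort data=long;
  by Question;
run;

/* WEIGHT tells PROC FREQ that each row stands for Count respondents */
proc freq data=long;
  by Question;
  tables Group*Response / chisq;
  weight Count;
run;

/* Pairwise version: restrict to two groups, e.g. Group 1 vs Group 2 */
proc freq data=long;
  where Group in (1, 2);
  by Question;
  tables Group*Response / chisq;
  weight Count;
run;
```

The first PROC FREQ step tests all three groups at once for each question; the second mirrors the Group 1 vs Group 2 comparison, and the WHERE clause can be changed for the other pairs.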