Hope this is the right location!
I have some data I'm trying to run t-tests on, but I have to get creative and back into the data to get the SD. I have the frequency distribution and N sizes, but not individual-level data. What I've done so far (in Excel, since I don't know how in SAS) is multiply the N size by the percent that answered each question with a 1, 2, 3, 4, or 5 to determine how many people in each group responded a certain way to each question (Sheet 1), and then transposed it to the format I *THINK* SAS will want it in (Sheet 2). It looks something like this:
| Question_Answer | Response | Group_1 | Group_2 | Group_3 |
| --- | --- | --- | --- | --- |
| Q0A1 | 1 | 50 | 200 | 50 |
| Q0A2 | 2 | 100 | 0 | 25 |
| Q0A3 | 3 | 25 | 25 | 50 |
| Q0A4 | 4 | 75 | 25 | 25 |
| Q0A5 | 5 | 50 | 25 | 50 |
| Q1A1 | 1 | 0 | 50 | 25 |
| Q1A2 | 2 | 100 | 50 | 100 |
| Q1A3 | 3 | 100 | 75 | 0 |
| Q1A4 | 4 | 25 | 75 | 25 |
| Q1A5 | 5 | 75 | 25 | 50 |
From here I need to get the standard deviation of each question for each group, and I'm at a total loss as to how to do that. I'm not sure if I should keep the Question column parsed out the way it is (Question 1 Answer 1, Question 1 Answer 2) or just repeat Question1 for five rows and let the Response column do the heavy lifting.
Any help would be greatly appreciated!
Is there a reason Group_3 for Q0A1-5 doesn't add up to 300 like Group_1 does? How should the groups add up here?
Standard deviation is usually used for continuous measurements, not categorical data. Can you explain a little more what you're looking to achieve here?
Yes!
Group 1 is 300 people, Group 2 is 275, Group 3 is 200.
So what I essentially want to do is compare the responses to each question across groups and determine whether the differences are statistically significant. I hope to accomplish this by running the following independent t-tests:
Group 1 vs Group 2
Group 2 vs Group 3
Group 3 vs Group 1
For each of the questions.
Which, of course, requires the standard deviation to do.
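For reference, the mean and SD can be recovered from the counts alone, by treating each response value as repeated once per respondent. Below is a minimal Python sketch of that arithmetic (not SAS; in SAS, a FREQ statement in PROC MEANS applies the same weighting). The function names `summarize` and `welch_t` are made up for illustration, the counts are the Q0 rows from the table in the first post, and the unequal-variance (Welch) form of the t statistic is an assumption about which t-test is wanted.

```python
from math import sqrt

def summarize(counts):
    """Return (n, mean, sd) for responses 1..len(counts), given the count of each."""
    n = sum(counts)
    mean = sum(x * f for x, f in enumerate(counts, start=1)) / n
    # Frequency-weighted sum of squared deviations from the mean
    ss = sum(f * (x - mean) ** 2 for x, f in enumerate(counts, start=1))
    return n, mean, sqrt(ss / (n - 1))  # sample SD (n - 1 denominator)

def welch_t(counts_a, counts_b):
    """Welch t statistic and Satterthwaite degrees of freedom for two groups."""
    n1, m1, s1 = summarize(counts_a)
    n2, m2, s2 = summarize(counts_b)
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Counts of responses 1..5 for question Q0, copied from the table
q0 = {
    "Group_1": [50, 100, 25, 75, 50],
    "Group_2": [200, 0, 25, 25, 25],
    "Group_3": [50, 25, 50, 25, 50],
}

for name, counts in q0.items():
    n, mean, sd = summarize(counts)
    print(f"{name}: n={n} mean={mean:.3f} sd={sd:.3f}")

t, df = welch_t(q0["Group_1"], q0["Group_2"])
print(f"Group_1 vs Group_2: t={t:.3f} df={df:.1f}")
```

This treats the 1 to 5 responses as numeric scores, which is what a t-test on them assumes anyway. Repeating the question label on five rows with a separate Response column, as suggested in the first post, maps directly onto this layout.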
@maldito wrote:
Which, of course, requires the standard deviation to do.
I would have expected a chi-square test rather than t-tests using the standard deviation, since this is count/frequency data. You can look at PROC FREQ for that.
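To make the suggestion concrete, here is a small Python sketch of a Pearson chi-square test on one question for one pair of groups (in SAS the equivalent would be PROC FREQ with a WEIGHT statement for the counts and the CHISQ option on the TABLES statement). The helper names `chi_square` and `chi2_sf_even` are hypothetical, the counts are the Q0 rows for Group_1 and Group_2 from the first post, and the p-value uses a closed form that only holds for even degrees of freedom, which is the case here because five response levels always give a multiple of four.

```python
from math import exp, factorial

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts. Assumes every row and
    column total is positive, so all expected counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

def chi2_sf_even(x, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form requires a positive even dof")
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(dof // 2))

# Rows are responses 1..5 for Q0; columns are Group_1 and Group_2
q0_g1_vs_g2 = [
    [50, 200],
    [100, 0],
    [25, 25],
    [75, 25],
    [50, 25],
]

stat, dof = chi_square(q0_g1_vs_g2)
p_value = chi2_sf_even(stat, dof)
print(f"chi-square={stat:.2f} dof={dof} p={p_value:.3g}")
```

Unlike the t-test, this compares the whole response distribution between groups and ignores the ordering of the 1 to 5 scale.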