maldito
Calcite | Level 5

Hope this is the right location! 

 

I have some data I'm trying to run t-tests on, but I have to get creative and back into the data to get the SD. I have the frequency distribution and N sizes, but not individual-level data. What I've done so far (in Excel, since I don't know how in SAS) is multiply the N size by the percentage that answered each question with a 1, 2, 3, 4, or 5 to determine how many people in each group responded a certain way to each question (Sheet 1), and then transposed it to the format I *THINK* SAS will want it in (Sheet 2). It looks something like this:

 

Question_Answer  Response  Group_1  Group_2  Group_3
Q0A1             1         50       200      50
Q0A2             2         100      0        25
Q0A3             3         25       25       50
Q0A4             4         75       25       25
Q0A5             5         50       25       50
Q1A1             1         0        50       25
Q1A2             2         100      50       100
Q1A3             3         100      75       0
Q1A4             4         25       75       25
Q1A5             5         75       25       50

 

From here I need to get the standard deviation of each question for each group, and I'm at a total loss as to how to do that. I'm not sure if I should keep the Question column parsed out the way it is (Question 1 Answer 1, Question 1 Answer 2) or just repeat Question1 for five rows and let the Response column do the heavy lifting.

 

Any help would be greatly appreciated! 

3 REPLIES
Reeza
Super User

Is there a reason GROUP3 for Q0A1-5 doesn't add up to 300 like Group1/2? How should they add up here?

 

Standard deviation is usually used for continuous measurements, not categorical data. Can you explain a little more what you're looking to achieve here?

 



maldito
Calcite | Level 5

Yes! 

 

Group 1 is 300 people, Group 2 is 275, Group 3 is 200. 

 

So what I essentially want to do is determine the difference between the groups' responses to each of the questions and whether or not that difference is statistically significant. I hope to accomplish this by running the following independent t-tests:

 

Group 1 vs Group 2 

Group 2 vs Group 3

Group 3 vs Group 1

 

For each of the questions.

 

Which, of course, requires the standard deviation to do. 
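
For reference, the standard deviation can be recovered from the counts alone, because each response value (1 to 5) is simply repeated as many times as its count. The following is a minimal sketch of that arithmetic in Python rather than SAS. The dictionary layout, the function names `summarize` and `welch_t`, and the choice of the Welch (unequal-variance) form of the t statistic are all assumptions made for illustration, not something stated in this thread. It also treats the 1 to 5 responses as numeric scores, which is itself a judgment call.

```python
import math

# Counts copied from the table in this thread: for each question and group,
# the number of respondents choosing responses 1..5 (in that order).
counts = {
    "Q0": {"Group_1": [50, 100, 25, 75, 50],
           "Group_2": [200, 0, 25, 25, 25],
           "Group_3": [50, 25, 50, 25, 50]},
    "Q1": {"Group_1": [0, 100, 100, 25, 75],
           "Group_2": [50, 50, 75, 75, 25],
           "Group_3": [25, 100, 0, 25, 50]},
}
RESPONSES = [1, 2, 3, 4, 5]

def summarize(freqs, values=RESPONSES):
    """Return (n, mean, sample SD) for values weighted by frequency counts."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    ss = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs))
    return n, mean, math.sqrt(ss / (n - 1))

def welch_t(a, b):
    """Welch t statistic and approximate df from two (n, mean, sd) tuples."""
    (n1, m1, s1), (n2, m2, s2) = a, b
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# The three pairwise comparisons listed above, for every question.
PAIRS = [("Group_1", "Group_2"), ("Group_2", "Group_3"), ("Group_3", "Group_1")]
for question, groups in counts.items():
    stats = {g: summarize(f) for g, f in groups.items()}
    for g1, g2 in PAIRS:
        t, df = welch_t(stats[g1], stats[g2])
        print(f"{question} {g1} vs {g2}: t = {t:.2f}, df = {df:.1f}")
```

Note that this layout keeps the question and the response value as separate fields, which corresponds to the "repeat the question for five rows" option in the original post. A p-value would still need to be looked up from the t distribution with the reported degrees of freedom.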

Reeza
Super User

@maldito wrote:

 

 

Which, of course, requires the standard deviation to do. 


 

I would have expected a chi-square test rather than t-tests using the standard deviation, since this is count/frequency data. You can look at PROC FREQ for that.

https://stats.idre.ucla.edu/other/mult-pkg/whatstat/
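
As a rough illustration of what a chi-square test does with counts like these, here is a small sketch in Python (not SAS, and not a substitute for PROC FREQ). It computes the Pearson chi-square statistic for one pairwise comparison, Group 1 vs Group 2 on Q0, using the counts posted above. The function name `chi_square` and the handling of all-zero response columns are assumptions made for this example.

```python
def chi_square(row_a, row_b):
    """Pearson chi-square statistic and df for a 2 x k table of counts."""
    rows = [row_a, row_b]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    total = sum(row_totals)

    stat = 0.0
    used_cols = 0
    for j, col_total in enumerate(col_totals):
        if col_total == 0:
            continue  # nobody in either group chose this response; skip it
        used_cols += 1
        for i, row in enumerate(rows):
            # Expected count if both groups shared one response distribution.
            expected = row_totals[i] * col_total / total
            stat += (row[j] - expected) ** 2 / expected

    # df = (rows - 1) * (cols - 1); with two rows this is cols - 1.
    return stat, used_cols - 1

# Q0 counts for responses 1..5, copied from the table in this thread.
group_1 = [50, 100, 25, 75, 50]
group_2 = [200, 0, 25, 25, 25]

stat, df = chi_square(group_1, group_2)
print(f"Q0 Group_1 vs Group_2: chi-square = {stat:.2f}, df = {df}")
```

The statistic is then compared against the chi-square distribution with the reported degrees of freedom. The same function can be called for the other two group pairs and for Q1.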
