I am working with survey data. I have been asked to create a new variable which is the sum of all of the variables below, and then run a t-test so that we can use the standard deviation.
I have tried to use proc means and proc summary, but they are giving me the sums of the rows, not the sum of all the variables added up, so I can't quite figure out the right statements to use.
Thank you!
My example of proc summary code is here below, to show how I am trying to enter these variables
What factors affect your attitudes and thoughts about getting the COVID-19 vaccine? n (%)
Factor | Total (N = 7197), n | % | Hesitant = 1 (N = 281), n | % | Not hesitant = 0 (N = 6916), n | % | p-value |
Politics | 694 | 9.64% | 100 | 35.59% | 594 | 8.59% | 0.0001 |
The timeline in which the vaccines were developed and approved | 1383 | 19.22% | 207 | 73.67% | 1176 | 17.00% | 0.0001 |
The frequently changing messages around COVID-19 | 918 | 12.76% | 174 | 61.92% | 744 | 10.76% | 0.0001 |
Actions and opinions of my friends and family regarding the vaccine | 841 | 11.69% | 41 | 14.59% | 800 | 11.57% | 0.122 |
My trust in scientists | 5147 | 71.52% | 54 | 19.22% | 5093 | 73.62% | 0.0001 |
My trust in doctors | 4601 | 63.93% | 51 | 18.15% | 4550 | 65.79% | 0.0001 |
My trust in public health officials | 3890 | 54.05% | 133 | 47.33% | 3757 | 54.32% | 0.0211 |
My own reading and research on coronavirus (COVID-19) vaccines | 4396 | 61.08% | 169 | 60.14% | 4227 | 61.12% | 0.742 |
The country in which a vaccine is manufactured | 662 | 9.20% | 22 | 7.83% | 640 | 9.25% | 0.4179 |
The potential cost of a coronavirus (COVID-19) vaccine | 284 | 3.95% | 4 | 1.42% | 280 | 4.05% | 0.0267 |
Other factors | 734 | 10.2% | 84 | 28.89% | 650 | 9.4% | 0.0001 |
Total factors considered, mean (SD) | | | | | | | |
Try this:
data have;
vaccinefactors1 = 7;
vaccinefactors2 = 3;
vaccinefactors3 = 2.58;
vaccinefactors4 = 1.20;
vaccinefactors5 = 17;
vaccinefactors6 = 18;
vaccinefactors7 = 7;
vaccinefactors8 = 7;
vaccinefactors9 = 5;
vaccinefactors10= 7;
vaccinefactors11= 3;
output;
run;
data want;
LENGTH sum_vaccinefactors 8;
set have;
sum_vaccinefactors=sum(of vaccinefactors1-vaccinefactors11);
run;
/* end of program */
Koen
So you want to get the horizontal sum of all vaccine... variables in every observation?
Then you build the sum as already suggested.
DATA need;
set WORK.vaccine_hesitancy_data;
VaccineFactor = sum(of vaccinefactors1-vaccinefactors11);
run;
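One detail worth knowing about the SUM function used above: it skips missing values, whereas adding with the + operator returns missing as soon as any operand is missing. A minimal sketch of the difference, using three made-up variables x1-x3 as stand-ins for vaccinefactors1-vaccinefactors11 (the data set name demo and the values are hypothetical):

```sas
/* Hypothetical demo data: x1-x3 stand in for vaccinefactors1-vaccinefactors11 */
data demo;
    input x1 x2 x3;
    total_fn   = sum(of x1-x3);  /* SUM function ignores missing values       */
    total_plus = x1 + x2 + x3;   /* + gives missing if any operand is missing */
    datalines;
1 0 1
1 . 1
. . .
;
run;

proc print data=demo;
run;
```

In the second observation total_fn is 2 while total_plus is missing; in the third, both are missing because every input is missing. If unanswered items should count as 0 in your total, the SUM function is probably what you want, but that is a decision about your survey, not about SAS.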
@Guerraje wrote:
We have 11 choices a person could have selected as a reason for their vaccine hesitancy. We decided we want to get the sum of all of them now and create a new variable we could title it:
"vaccinefactor"
which is just the total of all the data.
The suggested code does exactly that; you only need to change the name of the target variable.
When running that code, I happen to get an error that my variable is not found. I am attaching my code and my log.
Thank you
The sum variable is in the new dataset, so you need to use that in the TTEST procedure.
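As a rough sketch of what that could look like, assuming the data step above that creates WORK.need with VaccineFactor has been run, and assuming your 0/1 hesitancy indicator is called hesitant (that variable name is a guess, since the attached code and log are not shown here):

```sas
/* Point PROC TTEST at the data set that contains the new sum variable */
proc ttest data=need;
    class hesitant;       /* hypothetical name for the 0/1 hesitancy group */
    var VaccineFactor;    /* sum created in the data step                  */
run;
```

The default output includes the mean and standard deviation of VaccineFactor for each group, which should cover the "mean (SD)" row of the table. If the DATA= option still names WORK.vaccine_hesitancy_data, the new variable will not be found, because it only exists in the data set created by the data step.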
@Guerraje wrote:
We have 11 choices a person could have selected as a reason for their vaccine hesitancy. We decided we want to get the sum of all of them now and create a new variable we could title it:
"vaccinefactor"
which is just the total of all the data.
Summing this coding sounds like an exercise in adding up football team names.
What exactly do some of the "numbers" mean? Even with a rating scale, adding up multiple responses does not mean that a total of 12 for one person is anything like a total of 12 for a different respondent.
It might not hurt to provide a very small example of something and show the expected results from that.
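Something along these lines would do as a small example. It assumes each vaccinefactors variable is coded 1 = selected and 0 = not selected, which is only a guess from the counts in the table; the three respondents and the data set name tiny are made up:

```sas
/* Hypothetical data: three respondents, eleven 0/1 indicators (assumed coding) */
data tiny;
    input id vaccinefactors1-vaccinefactors11;
    /* With 0/1 coding, the row sum is the number of factors selected */
    VaccineFactor = sum(of vaccinefactors1-vaccinefactors11);
    datalines;
1 1 1 0 0 0 0 0 0 0 0 0
2 0 0 0 0 1 1 1 1 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0
;
run;

proc print data=tiny noobs;
    var id VaccineFactor;
run;
```

Under that coding the expected result is VaccineFactor = 2, 4 and 0 for respondents 1, 2 and 3, i.e. a count of factors considered. If the variables hold anything other than 0/1, the total means something else and the concern above applies.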
I think that you may need to sum the variables in each row, such as in a data step:
DATA need;
   set WORK.vaccine_hesitancy_data;
   VacTot = sum(of vaccinefactors1-vaccinefactors11);
run;
HOWEVER, there is a concern with this data starting from a SURVEY. Does your data include weights for the observations? If so you may need to use another approach to use the weights properly after getting that total.
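If there are weights, one possible approach (a sketch only, not necessarily the right design for your survey) is to use the survey procedures on the data set with the total. The names svy_weight and hesitant below are hypothetical placeholders for your weight and 0/1 group variables; add STRATA and CLUSTER statements if your design has them:

```sas
/* Weighted group means and standard errors of the total */
proc surveymeans data=need mean stderr;
    var VacTot;
    domain hesitant;      /* hypothetical 0/1 hesitancy indicator */
    weight svy_weight;    /* hypothetical survey weight variable  */
run;

/* Weighted comparison of the two groups: the test on the hesitant
   coefficient plays the role of the two-sample t-test */
proc surveyreg data=need;
    class hesitant;
    model VacTot = hesitant / solution;
    weight svy_weight;
run;
```

PROC TTEST does not account for the sampling design, which is why the survey procedures are suggested here when weights are present.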