Hello,
I have 21 variables named HS04A through HS04U with possible values 1, 2, 8 and 9. For every variable I need to find out how many times the value 1 occurs and turn that into a percentage (the count divided by all observations). The variables are diseases, and the value 1 means the disease is present for that record (a person). So for every disease I need the percentage of all people in whom it occurs. Then I need to output a table where the 21 variables are the rows and the single column is the percentage. I have also included an image of the layout of the desired output. Is it possible to make something like that with proc tabulate? Or should I rather find out how to do it with proc sql? Thanks in advance for any advice.
Sounds like you have values from a survey with the question coded as 1=YES, 2=NO, with 8 and 9 being some type of missing category.
If you recode it into a nice BINARY 0/1 variable, then the MEAN is the proportion of YES answers, which is the PERCENTAGE once you multiply by 100 or apply a percent format.
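A minimal sketch of that recoding, assuming the 21 variables are numeric and sit next to each other in the data set (so the `HS04A--HS04U` list picks up exactly those variables). The input name `library.database` comes from this thread; the output name `work.binary` and the loop index `i` are made up for illustration. Values 2, 8, 9 and missing all become 0, which matches "divided by all observations" in the question.

```sas
/* Recode each disease flag to 0/1: 1 (present) stays 1, anything else becomes 0. */
data work.binary;                 /* hypothetical output data set name */
    set library.database;
    array hs {*} HS04A--HS04U;    /* assumes the variables are adjacent and numeric */
    do i = 1 to dim(hs);
        hs{i} = (hs{i} = 1);      /* a comparison evaluates to 1 (true) or 0 (false) */
    end;
    drop i;
run;

/* The mean of a 0/1 variable is the proportion of records with the value 1. */
proc means data=work.binary mean maxdec=3;
    var HS04A--HS04U;
run;
```

If 8 and 9 should be left out of the denominator instead, they could be set to missing rather than 0, since PROC MEANS ignores missing values.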
Hey Tom, thanks a lot for the idea, I tried it and it worked. I did some primitive recoding (not sure how efficient it is): instead of 1 I put 100, so the percentage comes out as, for example, 3,1 instead of 0,031, and then I ran proc means.
data work.subset6a;
    set library.database;
    keep pid HS04A--HS06U WGT;
    /* 1 (disease present) becomes 100, every other value becomes 0 */
    if HS04A ^= 1 then HS04A = 0; else HS04A = 100;
    if HS04B ^= 1 then HS04B = 0; else HS04B = 100;
    if HS04C ^= 1 then HS04C = 0; else HS04C = 100;
    /* ....etc. */
run;
proc means data=work.subset6a mean maxdec=1;
    var HS04A--HS04U;
    freq WGT;
run;
And the output looked like this:
To expand on Tom's response: with a binary 0/1 coding you can use, in a proc tabulate table statement,
var*mean*f=percent8.1
(for example) to get a percent, or, if you don't want % signs, a custom picture format.
This coding has added advantages (besides its use in logistic regression):
Sum= will give you the total Yes answers
N= gives you the number of responses
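As a rough illustration of the points above, here is a small self-contained sketch. The data set `work.toy` and its four rows are made up, and only three of the 21 variables are shown; the column labels are arbitrary.

```sas
/* Made-up 0/1 data for three of the disease flags (illustration only). */
data work.toy;
    input HS04A HS04B HS04C;
    datalines;
1 0 0
0 0 1
1 0 0
0 1 0
;
run;

/* One row per variable; columns for percent, count of Yes answers, and responses. */
proc tabulate data=work.toy;
    var HS04A HS04B HS04C;
    table HS04A HS04B HS04C,
          mean='Percent yes'*f=percent8.1
          sum='Yes answers'*f=8.
          n='Responses'*f=8.;
run;
```

Variables in the row dimension and statistics in the column dimension give the layout asked for in the question: each disease is a row, and the percentage is a column.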
Thanks. At first I recoded the data to hundreds and zeros to get the mean to display correctly, but after your advice that formatting can help, I found that proc template can change the output of proc means. I changed the format of the mean to percent and it worked, so I can keep my data recoded as ones and zeros.
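The post does not show the template code that was used. One commonly seen form, given here only as a sketch, edits the `Base.Summary` table template that proc means uses for its printed output; it is an assumption that this is what was done. The data set `work.toy` is made up for illustration.

```sas
/* Made-up 0/1 data (illustration only). */
data work.toy;
    input HS04A HS04B HS04C;
    datalines;
1 0 0
0 0 1
1 0 0
0 1 0
;
run;

/* Assumption: override the format of the Mean column in the PROC MEANS table template. */
proc template;
    edit Base.Summary;
        edit mean;
            format = percent8.1;
        end;
    end;
run;

proc means data=work.toy mean;
    var HS04A HS04B HS04C;
run;

/* Delete the customized copy so later PROC MEANS output uses the default template again. */
proc template;
    delete Base.Summary;
run;
```

The edited template stays in effect for every later proc means call until it is deleted, which is why the sketch ends with the delete step.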