Hello,
I have 21 variables named HS04A --> HS04U with possible values 1, 2, 8 and 9. For every variable I need to find out how many times the value 1 occurs and turn that into a percentage (divided by all observations). Basically the variables are diseases, and value 1 means the disease is present for that record (a person), so for every disease I need the percentage of all people who have it. Then I need to output a table where the 21 variables are rows and the single column is the percentage. I have also included an image of the desired output layout. Is it possible to make something like that with proc tabulate? Or should I rather find out how to do it with proc sql? Thanks in advance for any advice.
Sounds like you have values from a survey with the question coded as 1=YES and 2=NO, with 8 and 9 being some type of missing category.
If you recode it into a nice BINARY (0/1) variable then the MEAN is the PERCENTAGE.
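A minimal sketch of that recoding, assuming HS04A through HS04U are numeric and sit next to each other in the data set (the array name hs and the loop counter _i are made up for illustration; the data set and variable names are taken from elsewhere in this thread):

```sas
/* Recode every HS04 variable to 0/1: 1 stays 1, anything else (2, 8, 9, missing) becomes 0. */
data work.subset6a;
    set library.database (keep=pid HS04A--HS04U WGT);
    array hs {*} HS04A--HS04U;        /* assumes all variables in this range are numeric */
    do _i = 1 to dim(hs);
        hs{_i} = (hs{_i} = 1);        /* a comparison returns 1 when true, 0 when false */
    end;
    drop _i;
run;

/* The mean of a 0/1 variable is the proportion of records with the value 1. */
proc means data=work.subset6a mean maxdec=3;
    var HS04A--HS04U;
run;
```

Because 8 and 9 are turned into 0 rather than missing, the denominator is all observations, which matches what the question asks for. If those codes should be left out of the denominator, they would have to be set to missing instead.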
Hey Tom, thanks a lot for the idea, I tried it and it worked. I did some primitive recoding (not sure if it's efficient): instead of 1 I put 100, so the percentage comes out as, for example, 3,1 instead of 0,031, and then I ran proc means.
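data work.subset6a;
    set library.database;
    keep pid HS04A--HS06U WGT;
    /* Recode: 1 (disease present) becomes 100, everything else (2, 8, 9, missing) becomes 0 */
    if HS04A=1 then HS04A=100; else HS04A=0;
    if HS04B=1 then HS04B=100; else HS04B=0;
    if HS04C=1 then HS04C=100; else HS04C=0;
    /* ....etc. for the remaining variables */
run;

/* The mean of a 0/100 variable is the percentage of records with the value 1 */
proc means data=work.subset6a mean maxdec=1;
    var HS04A--HS04U;
    /* FREQ treats WGT as an integer count; non-integer values are truncated */
    freq WGT;
run;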
And the output looked like this:
To expand on Tom's response, with a binary 0/1 coding you can use this in a proc tabulate table statement:
var*mean*f=percent8.1 (for example, to get a percent), or if you don't want % signs, use a custom picture format.
This coding has added advantages (besides being usable in logistic regression):
Sum= will give you the total Yes answers
N= gives you the number of responses
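As a rough sketch of how that could look in proc tabulate, assuming work.subset6a already holds the variables recoded to 0/1 (not 0/100) and WGT is the frequency variable used earlier in the thread. Only the first three variables are written out here; the rest would be added to both lists the same way:

```sas
/* Variables as rows; mean (as a percent), sum and n as columns. */
proc tabulate data=work.subset6a;
    var HS04A HS04B HS04C;            /* extend the list through HS04U */
    freq WGT;                         /* each record counts WGT times */
    table HS04A HS04B HS04C,
          mean*f=percent8.1           /* share of 1s, shown as a percent */
          sum*f=8.                    /* total number of Yes answers */
          n*f=8.;                     /* number of responses */
run;
```

Putting the analysis variables in the row dimension and the statistics in the column dimension gives the layout asked for in the question: one row per disease, with the percentage as a column.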
Thanks. At first I tried recoding the data to hundreds and zeros to get my mean right, but after your advice that formatting can help, I found proc template for proc means, changed the format for the mean to percent, and it worked, so I can keep my data recoded as ones and zeros.
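For anyone finding this later, the template change was probably something along these lines. This is only a sketch: it assumes the proc means output uses the Base.Summary table template with a column named mean, and that work.subset6a holds the 0/1 recoded variables.

```sas
/* Override the display format of the Mean column in proc means output. */
proc template;
    edit base.summary;
        edit mean;
            format = percent8.1;
        end;
    end;
run;

proc means data=work.subset6a mean;
    var HS04A--HS04U;
    freq WGT;
run;

/* Delete the customized copy so later proc means runs use the default template again. */
proc template;
    delete base.summary;
run;
```

The edited template stays active for later proc means calls until it is deleted, which is why the last step is worth keeping.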