06-25-2017 04:43 AM
Hi guys. How can I determine the mean and average of the Total Cholesterol and Glucose fields, grouped by the age and gender of each individual, in the diabetes dataset? Should I sort it first? Any help and advice would be appreciated. Thanks
06-25-2017 11:12 AM - edited 06-25-2017 11:15 AM
Wouldn't you want to categorize age first, since it is a numeric variable, for example by converting it into a character variable? You could get the means from PROC MEANS as below. This code is untested, so please change the variable names to match your dataset. The AUTONAME option creates the mean variable names by concatenating the variable name with the statistic name, for example chol_mean and glucose_mean.
proc means data=have nway;
    class age gender;
    var chol glucose;
    output out=means mean= / autoname;
run;
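One way to categorize age before calling PROC MEANS is a DATA step that derives a character grouping variable. The sketch below is untested and rests on assumptions: the dataset name HAVE and the variables AGE, GENDER, CHOL and GLUCOSE follow the post above, while HAVE_GROUPED, AGE_GROUP, MEANS_BY_GROUP and the age cut-offs are made up for illustration.

```sas
/* Assumed names: HAVE, AGE, GENDER, CHOL, GLUCOSE.            */
/* The age bands (<40, 40-59, 60+) are arbitrary placeholders. */
data have_grouped;
    set have;
    length age_group $8;
    if missing(age) then age_group = ' ';   /* keep missing ages missing */
    else if age < 40 then age_group = '<40';
    else if age < 60 then age_group = '40-59';
    else age_group = '60+';
run;

/* Mean cholesterol and glucose for each age group and gender. */
/* NWAY keeps only the rows for the full age_group*gender      */
/* combination; no prior sorting is needed with CLASS.         */
proc means data=have_grouped nway;
    class age_group gender;
    var chol glucose;
    output out=means_by_group mean= / autoname;
run;
```

By default PROC MEANS leaves out observations with a missing CLASS value, so rows with a missing age would not appear in the output unless the MISSING option is added.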
06-25-2017 11:15 AM - edited 06-25-2017 11:39 AM
You could apply a format to the AGE variable to group it, as explained by @Jim_G in this thread:
Then you can use PROC MEANS or PROC SUMMARY on the data. If you use PROC SUMMARY with a CLASS statement, no sorting is necessary.
By the way, the "mean" is the same thing as the "average".
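As a rough sketch of the format approach (not the code from the linked thread), a user-defined format can group AGE without creating a new variable. The format name AGEGRP, the output name MEANS_BY_GROUP and the age ranges are assumptions chosen for illustration; HAVE, GENDER, CHOL and GLUCOSE are the names used earlier in this thread.

```sas
/* Hypothetical format: the name and the ranges are placeholders. */
proc format;
    value agegrp
        low -< 40 = 'Under 40'
        40 -< 60  = '40 to 59'
        60 - high = '60 and over';
run;

/* With a FORMAT statement, the CLASS variable AGE is grouped by */
/* its formatted value. No sorting is required.                  */
proc summary data=have nway;
    class age gender;
    format age agegrp.;
    var chol glucose;
    output out=means_by_group mean= / autoname;
run;

/* PROC SUMMARY prints nothing by default, so show the result. */
proc print data=means_by_group noobs;
run;
```

Changing the age bands then only means editing the PROC FORMAT step; the input data stay as they are.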
06-25-2017 01:16 PM
If you're using EG and its tasks, you can use the Summary Task.
Add age and gender to the GROUP variables and Cholesterol to the analysis variables.
If you're looking for code, here are some examples: