Hi guys. How can I determine the mean and average of the Total Cholesterol and Glucose fields, grouped by an individual's age and gender, from the diabetes dataset? Should I sort it first? Any help and advice will be appreciated. Thanks.
Diabetes File:
Wouldn't you want to categorize age first? Since it is a numeric variable, you could convert it into a character (categorical) variable. You could get the means from PROC MEANS as below. This is untested code, so please change the variable names to match your dataset. The AUTONAME option creates the mean variable names by concatenating the variable name with the statistic name, for example chol_Mean and glucose_Mean.
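proc means data=have nway;            /* NWAY keeps only the full age*gender combinations */
  class age gender;                   /* grouping variables; no prior sort is needed */
  var chol glucose;                   /* analysis variables */
  output out=means mean= / autoname;  /* writes chol_Mean and glucose_Mean to WORK.MEANS */
run;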
You could apply a format to the AGE variable to group it, as explained by @Jim_G in this thread:
Then you can use PROC MEANS or PROC SUMMARY on the data. If you use PROC SUMMARY with a CLASS statement, no sorting is necessary.
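Here is a minimal, untested sketch of the format approach, in case it helps. The dataset name (have) and the variable names (age, gender, chol, glucose) are carried over from the earlier reply and are assumptions about your data. The format name agegrp and the age cutoffs are made up for illustration, so adjust them to whatever grouping makes sense for you.

```sas
/* Hypothetical age bands; change the cutoffs to suit your analysis */
proc format;
  value agegrp
    low -< 40  = 'Under 40'
    40  -< 60  = '40 to 59'
    60 - high  = '60 and over';
run;

/* CLASS groups by the formatted value of age, so no sorting and no new variable are needed */
proc summary data=have nway;
  class age gender;
  format age agegrp.;
  var chol glucose;
  output out=means_by_group mean= / autoname;
run;

/* Look at the result: one row per age band and gender */
proc print data=means_by_group;
run;
```

Because the format is only applied inside the procedure, the original age values in the dataset stay untouched, and you can try different bands just by editing the PROC FORMAT step.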
By the way, the "mean" is the same thing as the "average". 😉
If you're using EG and its tasks, you can use the Summary Task.
Add age and gender to the GROUP variables and Cholesterol to the analysis variables.
If you're looking for code, here are some examples:
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset