Calculating Distribution on Grouped Data

10-03-2015 05:31 PM

I want to run a distribution analysis on the attached data file. The 'Age Group' column holds the age ranges in a population, and the next column gives the number of people in each age range. How can I run a distribution analysis to find the mean age and other distribution statistics? I can't use the age group as the frequency count because it is a string. Please help. Thank you!

Posted in reply to Charlie_K

10-03-2015 06:36 PM

Age Group isn't a frequency; it's a CLASS or BY variable. Add it to the BY or CLASS group, and add Population to the ANALYSIS section.

Posted in reply to Reeza

10-03-2015 08:52 PM

Thank you for the reply. If I add the age group to 'Group analysis by', it gives the population mean for each age group. How can I get the mean age or age group of the total population?

Posted in reply to Charlie_K

10-03-2015 10:31 PM

Do you have the age variable where it hasn't been grouped/categorized? That would be preferable.

Posted in reply to Reeza

10-03-2015 10:36 PM

Sadly, no. Is there any way I can achieve this with the data I have? Please help!

Posted in reply to Charlie_K

10-04-2015 03:05 AM

One approach would be to derive the midpoint of each age group and then take the mean of those midpoints. You would be calculating the mean of a variable that is not truly continuous, but it is probably the best you can do with the data you have.
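
As a rough illustration of the midpoint idea, here is a small Python sketch of the arithmetic (not SAS code, and not the poster's own method). The group labels and counts are made up, since the attached file is not shown, and weighting each midpoint by its group's population is an assumption taken from the later replies in this thread.

```python
# Hypothetical grouped data: (age-group label, population count).
# These numbers are invented for illustration; they are not from the attached file.
groups = [("0-9", 120), ("10-19", 150), ("20-29", 200), ("30-39", 130)]


def midpoint(label):
    """Return the midpoint of a 'low-high' range, e.g. '0-9' -> 4.5.

    Assumption: the midpoint is (low + high) / 2. If ages are recorded in
    completed years, (low + high + 1) / 2 (5.0 for '0-9') may fit better.
    """
    low, high = (int(part) for part in label.split("-"))
    return (low + high) / 2


def grouped_mean(rows):
    """Population-weighted mean of the group midpoints."""
    total = sum(count for _, count in rows)
    return sum(midpoint(label) * count for label, count in rows) / total


mean_age = grouped_mean(groups)
print(round(mean_age, 2))
```

The result is a single approximate mean age for the whole population, not one mean per group.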

Posted in reply to SASKiwi

10-04-2015 10:04 AM

Thank you for your reply! I am not sure how to carry out your solution. Assuming the age group is 0-9, should I take the mean age as 5? In that case, if I have 10 age groups (0-9, 10-19, 20-29, and so on), would I end up with 10 mean ages? I just want to identify one mean age or, in my case, the age group that the average age falls into.

Posted in reply to Charlie_K

10-04-2015 01:01 PM

Yeah, assign a numeric code to each group, e.g. 0-4 = 1, 5-9 = 2, etc. Then use PROC MEANS with the population as a WEIGHT variable. The value you get, e.g. 2.4, maps back to a coded group, in this case the 5-9 age group (code 2).
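
To make the arithmetic behind this concrete, here is a minimal Python sketch of a population-weighted mean of group codes. It is not SAS and does not reproduce PROC MEANS itself; the counts are invented, and rounding to the nearest code to pick the group is an assumption about how the result would be read.

```python
# Hypothetical data: (group code, age-group label, population count).
coded = [(1, "0-4", 50), (2, "5-9", 80), (3, "10-14", 40), (4, "15-19", 30)]


def weighted_mean_code(rows):
    """Mean of the group codes, weighted by each group's population."""
    total = sum(pop for _, _, pop in rows)
    return sum(code * pop for code, _, pop in rows) / total


def group_for(value, rows):
    """Map a mean code back to a label by taking the nearest code.

    Assumption: 'nearest code' is how the mean is interpreted.
    """
    labels = {code: label for code, label, _ in rows}
    nearest = min(labels, key=lambda code: abs(code - value))
    return labels[nearest]


mean_code = weighted_mean_code(coded)
mean_group = group_for(mean_code, coded)
print(mean_code, mean_group)
```

Because the codes are evenly spaced only when the groups have equal width, this reading is most defensible when every age group spans the same number of years.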

Posted in reply to Reeza

10-04-2015 07:26 PM

Hi,

I tried this with a different data set where I have individual ages and created age groups from them. I assigned 1, 2, 3, ... to the groups and calculated the mean of those group numbers. Then I calculated the actual mean age from the individual ages. When I compare the two, the actual mean age does not fall into the age group that the group-number mean points to.
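
One possible reason for a mismatch like this is that group codes discard where the ages sit inside each group. The Python sketch below uses four made-up ages, chosen deliberately to show the effect, with 10-year groups coded 1, 2, 3, ...; it is an illustration only and says nothing about the poster's actual data.

```python
# Hypothetical individual ages, picked so that the two answers disagree.
ages = [9, 9, 9, 19]


def group_code(age, width=10):
    """Code equal-width groups as 1 for 0-9, 2 for 10-19, and so on."""
    return age // width + 1


# Mean of the raw ages, and the group that mean falls into.
actual_mean = sum(ages) / len(ages)
actual_group = group_code(int(actual_mean))

# Mean of the group codes, rounded to the nearest code.
codes = [group_code(age) for age in ages]
mean_code = sum(codes) / len(codes)
coded_group = round(mean_code)

print(actual_mean, actual_group, mean_code, coded_group)
```

Here the actual mean lands in the second group while the coded mean rounds to the first, because most of the ages sit at the top edge of the first group.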

Posted in reply to Charlie_K

10-05-2015 10:35 AM

Did you weight the data by the population number?

Posted in reply to Reeza

10-05-2015 11:56 AM

Yes. I multiplied each group number assigned (1,2,3...) by the corresponding population of the group. Is that what you wanted me to do?

Posted in reply to Charlie_K

10-05-2015 07:28 PM

No, use the WEIGHT option in PROC MEANS, though I think you'll get similar results. I tested it and it comes close, within one category. I'm not aware of a different method, but the average of a categorical variable isn't a common request. Consider the median instead; that's easier to report and explain anyway.
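
For the median suggestion, the idea can be sketched in Python as finding the group in which the cumulative population first reaches half of the total. This is an illustration of the arithmetic under made-up counts, assuming the groups are listed in age order; it is not a reproduction of what PROC MEANS computes.

```python
# Hypothetical grouped data in age order: (age-group label, population count).
groups = [("0-9", 120), ("10-19", 150), ("20-29", 200), ("30-39", 130)]


def median_group(rows):
    """Return the label of the group that contains the median person."""
    total = sum(count for _, count in rows)
    running = 0
    for label, count in rows:
        running += count
        # The median falls in the first group whose cumulative
        # count reaches half of the total population.
        if running >= total / 2:
            return label
    raise ValueError("no groups supplied")


print(median_group(groups))
```

Unlike the coded mean, this answer is always one of the original age groups, which is part of why it is easier to report.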

Posted in reply to Reeza

10-05-2015 09:15 PM

I am sorry, but I am very new to SAS. How do I run PROC MEANS in Enterprise Guide?