## Calculating Distribution on Grouped Data

Occasional Contributor
Posts: 7

I want to run a distribution analysis on the attached data file. The 'Age Group' column holds the age ranges in a population, and the next column gives the number of people in each age range. How can I run the distribution analysis to find the mean age and other distribution statistics? I can't use the age group as a frequency count because it is a string. Please help. Thank you!

Super User
Posts: 23,980

## Re: Calculating Distribution on Grouped Data

Age Group isn't a frequency; it's a CLASS or BY variable. Add it to the BY or CLASS group, and add Population to the ANALYSIS section.
Occasional Contributor
Posts: 7

## Re: Calculating Distribution on Grouped Data

Thank you for the reply. If I add the age group to 'Group analysis by', it gives the population mean for each age group. How can I get the mean age or age group of the total population?

Super User
Posts: 23,980

## Re: Calculating Distribution on Grouped Data

Do you have the age variable where it hasn't been grouped/categorized? That would be preferable.
Occasional Contributor
Posts: 7

## Re: Calculating Distribution on Grouped Data

Sadly, no. Is there any way I can achieve this with the data I have? Please help!

Super User
Posts: 4,018

## Re: Calculating Distribution on Grouped Data

One approach would be to derive the midpoint of each age group and then take the mean of those midpoints. You will be calculating the mean of a variable that is not truly continuous, but it is probably the best you can do with the data you have.
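As a rough illustration of the midpoint idea, here is a minimal Python sketch. The group labels and counts are made up for illustration and are not the poster's data. It also assumes that ages are recorded in completed years, so a label such as "0-9" covers ages from 0 up to (but not including) 10 and has midpoint 5. Each midpoint is weighted by the population of its group.

```python
# Hypothetical grouped data, for illustration only (not the poster's file).
groups = ["0-9", "10-19", "20-29", "30-39"]
population = [120, 150, 200, 130]


def midpoint(label):
    """Midpoint of a 'lo-hi' label, assuming ages in completed years,
    so that '0-9' spans [0, 10) and its midpoint is 5."""
    lo, hi = (int(part) for part in label.split("-"))
    return (lo + hi + 1) / 2


def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_std(values, weights):
    """Standard deviation using the total weight as the divisor."""
    m = weighted_mean(values, weights)
    var = sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)
    return var ** 0.5


mids = [midpoint(g) for g in groups]
mean_age = weighted_mean(mids, population)
std_age = weighted_std(mids, population)
print(f"approximate mean age: {mean_age:.2f}, std: {std_age:.2f}")
```

The result is one approximate mean age for the whole population, not one mean per group. The standard deviation here divides by the total population; other software may use a different divisor by default, so the spread may not match exactly.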

Occasional Contributor
Posts: 7

## Re: Calculating Distribution on Grouped Data

Thank you for your reply! I am not sure how to carry out your solution. Assuming the age group is 0-9, should I take the mean age as 5? In that case, if I have 10 age groups (0-9, 10-19, 20-29 and so on), would I have 10 mean ages? I just want to identify one mean age or, in my case, the age group that the average age falls into.

Super User
Posts: 23,980

## Re: Calculating Distribution on Grouped Data

Yeah, assign some value to the groups, e.g. 0-4 = 1, 5-9 = 2, etc. Then use PROC MEANS with the population as a WEIGHT variable. The value you get, e.g. 2.4, points to the group with the nearest code, here the 5-9 age group.
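The coding approach can be sketched in Python as follows. This only mirrors the arithmetic of a population-weighted mean; it is not SAS code, and the counts are hypothetical. It assumes equal-width groups coded 1, 2, 3, and so on, in which case rounding the mean code to the nearest whole code gives the group that contains the implied mean age.

```python
import math

# Hypothetical coding and counts, for illustration only.
codes = {1: "0-4", 2: "5-9", 3: "10-14", 4: "15-19"}
population = {1: 100, 2: 300, 3: 150, 4: 50}

# Weighted mean of the codes: sum(code * count) / sum(count).
total = sum(population.values())
mean_code = sum(code * n for code, n in population.items()) / total

# Round half up to the nearest code (valid only for equal-width groups).
nearest = math.floor(mean_code + 0.5)

print(f"mean code: {mean_code:.2f} -> age group {codes[nearest]}")
```

Note that the mean code is on the code scale, not in years, so it should be compared with the group codes rather than with a mean age computed from individual ages.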
Occasional Contributor
Posts: 7

## Re: Calculating Distribution on Grouped Data

Hi,

I tried this with a different age data set where I have individual ages and created the age groups myself. I assigned 1, 2, 3, ... to the groups and calculated the mean of those group codes. Then I calculated the actual mean age from the individual ages. When I compare the two, the actual mean age does not fall into the age group indicated by the mean group code.

Super User
Posts: 23,980

## Re: Calculating Distribution on Grouped Data

Did you weight the data by the population number?
Occasional Contributor
Posts: 7

## Re: Calculating Distribution on Grouped Data

Yes. I multiplied each group number assigned (1,2,3...) by the corresponding population of the group. Is that what you wanted me to do?
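For reference, multiplying each code by its population gives only the numerator of a weighted mean; the sum of those products still has to be divided by the total population. Using $c_i$ for the code of group $i$ and $n_i$ for its population (symbols introduced here for illustration only):

$$\bar{c} = \frac{\sum_i n_i \, c_i}{\sum_i n_i}$$

With codes 1 and 2 and populations 100 and 300, for example, this gives $(100 \cdot 1 + 300 \cdot 2) / 400 = 1.75$.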

Super User
Posts: 23,980