Desktop productivity for business analysts and programmers

Calculating Distribution on Grouped Data

Occasional Contributor
Posts: 7

Calculating Distribution on Grouped Data

I want to run a distribution analysis on the attached data file. The 'Age Group' column holds the age ranges in a population, and the next column gives the number of people in each age range. How can I run the analysis to find the mean age and other distribution statistics? I can't assign Age Group to the Frequency count role because it is a string. Please help. Thank you!


[Attachment: DataFile.PNG]
Grand Advisor
Posts: 17,464

Re: Calculating Distribution on Grouped Data

Age Group isn't a frequency; it's a CLASS or BY variable. Add it to the BY or CLASS group, and add Population to the ANALYSIS section.
Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

Thank you for the reply. If I add the age group to 'Group analysis by', it gives the population mean for each age group. How can I get the mean age or age group of the total population?

Grand Advisor
Posts: 17,464

Re: Calculating Distribution on Grouped Data

Do you have the age variable where it hasn't been grouped/categorized? That would be preferable.
Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

Sadly, no. Is there any way I can achieve this with the data I have? Please help!

Respected Advisor
Posts: 3,069

Re: Calculating Distribution on Grouped Data

One approach would be to derive the midpoint of each age group and then take the mean of those midpoints. You will be calculating the mean of a variable that is not truly continuous, but it is probably the best you can do with the data you have.
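
A minimal sketch of the midpoint approach in SAS, for readers who want to try it. The data set name `age_dist`, the variable names, and the counts are hypothetical and are not taken from the attached file. It assumes ages are recorded in completed years, so the 0-9 group spans ages 0 up to (but not including) 10 and its midpoint is 5. The FREQ statement treats each row as `population` identical observations, so the reported mean is the population-weighted mean of the midpoints.

```sas
/* Hypothetical grouped data: the counts are made up for illustration */
data age_dist;
    input age_group :$8. population;
    /* Split a range such as "10-19" into its lower and upper bounds */
    lower = input(scan(age_group, 1, '-'), 8.);
    upper = input(scan(age_group, 2, '-'), 8.);
    /* Assumption: ages are in completed years, so 0-9 covers [0, 10) */
    midpoint = (lower + upper + 1) / 2;
    datalines;
0-9 120
10-19 150
20-29 200
30-39 130
;
run;

/* Each row counts as POPULATION observations of its midpoint */
proc means data=age_dist mean std min max;
    var midpoint;
    freq population;
run;
```

With these made-up counts the mean comes out at about 20.67, which falls in the 20-29 group. The standard deviation from grouped midpoints is only an approximation, since it ignores the spread of ages within each group.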

Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

Thank you for your reply! I am not sure how to carry out your solution. If the age group is 0-9, should I take its mean age to be 5? In that case, if I have 10 age groups (0-9, 10-19, 20-29, and so on), would I end up with 10 mean ages? I just want to identify one mean age, or in my case the age group that the average age falls into.

Grand Advisor
Posts: 17,464

Re: Calculating Distribution on Grouped Data

Yeah, assign a numeric code to each group, e.g. 0-4 = 1, 5-9 = 2, etc. Then use PROC MEANS with the population as a WEIGHT variable. The value you get, e.g. 2.4, points to the group with the nearest code, here the 5-9 age group.
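
The coding approach above could be sketched roughly as follows. The data set name `coded`, the variable names, and the counts are hypothetical, chosen only so that the arithmetic is easy to follow.

```sas
/* Hypothetical counts; group codes follow the 0-4 = 1, 5-9 = 2 scheme */
data coded;
    input group_code age_group :$8. population;
    datalines;
1 0-4 300
2 5-9 250
3 10-14 200
4 15-19 250
;
run;

/* Weighted mean of the group code, weighted by population */
proc means data=coded mean;
    var group_code;
    weight population;
run;
```

With these counts the weighted mean is (1*300 + 2*250 + 3*200 + 4*250) / 1000 = 2.4, which is closest to code 2, the 5-9 group.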
Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

Hi,

I tried this with a different age data set where I have individual ages and created the age groups myself. I assigned 1, 2, 3, ... to the groups and calculated the mean of those codes. Then I calculated the actual mean age from the individual ages. When I compare the two, the actual mean age does not fall into the group indicated by the mean of the group codes.

Grand Advisor
Posts: 17,464

Re: Calculating Distribution on Grouped Data

Did you weight the data by the population number?
Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

Yes. I multiplied each assigned group number (1, 2, 3, ...) by the corresponding population of the group. Is that what you wanted me to do?

Grand Advisor
Posts: 17,464

Re: Calculating Distribution on Grouped Data

No, use the WEIGHT option in PROC MEANS, though I think you'll get similar results. I tested it and it comes close, within one category. I'm not aware of a different method, but the average of a categorical variable isn't a common request. Consider the median instead; that's easier to report and explain anyway.
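
A rough sketch of the median suggestion, again with hypothetical names and made-up counts. It uses a FREQ statement rather than WEIGHT, on the assumption that the population values are whole-number counts of people, so each row stands for that many observations.

```sas
/* Hypothetical counts, same coding scheme: 0-4 = 1, 5-9 = 2, ... */
data coded;
    input group_code age_group :$8. population;
    datalines;
1 0-4 300
2 5-9 250
3 10-14 200
4 15-19 250
;
run;

/* Median group code, with each row repeated POPULATION times */
proc means data=coded median;
    var group_code;
    freq population;
run;
```

For these counts the 500th and 501st of the 1000 people both sit in code 2, so the median group is 5-9.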
Occasional Contributor
Posts: 7

Re: Calculating Distribution on Grouped Data

I am sorry, but I am very new to SAS. How do I run PROC MEANS in Enterprise Guide?
