# Median Age - Demographic definition

I need an efficient way to calculate the median age from summary tables. The method is explained in the link below and is pretty straightforward, but I'm looking for efficiency. I'm better at brute force than at efficiency ... or at least brain dead at the moment.

The correct answer for the data below is 39.8. I was also wondering if there was a way to do this using a Stat proc, but an obvious method didn't occur to me.

The methodology is in the question and first answer here:

http://stats.stackexchange.com/questions/139132/how-to-calculate-the-median-age-of-a-population

A quick description: find the two cumulative counts that surround the median (half the total population), then find the ages that correspond to those counts. Since age is on a scale, the actual median is somewhere between those two ages, so weight it based on where half the total falls between the two counts. Since it's a continuous scale, add 1 to scale up.
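As a rough sketch of that interpolation, assuming each reported age $a$ covers the one-year interval $[a, a+1)$ and that people are spread evenly within it (the symbols are notation introduced here, not taken from the linked answer):

$$
\text{median age} \approx a + \frac{N/2 - F_a}{n_a}
$$

Here $N$ is the total population, $a$ is the first age whose cumulative count exceeds $N/2$, $n_a$ is the count at age $a$, and $F_a$ is the cumulative count for all ages below $a$.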

Geography      Age      2010  Cumulative

## Re: Median Age - Demographic definition

Depending on how much precision you need, brute force can be fairly efficient:

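/* Brute force: expand each age group into pseudo-individuals
   (one per 1000 people), spread evenly across the year of age. */
data test / view=test;
  set pop;
  dumPop = round(pop/1000);
  do i = 0 to dumPop-1;
    dumAge = age + i / dumPop;
    output;
  end;
  keep dumAge;
run;

/* The median of dumAge approximates the median age. */
proc univariate data=test;
  var dumAge;
run;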
## Re: Median Age - Demographic definition

But more seriously, you can do the math:

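/* Half of the total population, taken from the last cumulative value. */
data medPop / view=medPop;
  set pop end=done;
  if done then do;
    medPop = cumulative / 2;
    output;
  end;
  keep medPop;
run;

/* Read pop until the cumulative count passes medPop, then
   interpolate linearly between age and age+1. */
data medAge;
  set medPop;
  do until(cumulative > medPop);
    set pop;
  end;
  medAge = (age*(cumulative-medPop) + (age+1)*(medPop - cumulative + pop)) / pop;
  output;
  stop;
  keep medAge;
run;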
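Expanding the medAge expression shows that it is ordinary linear interpolation within the one-year age group. Writing $a$ for age, $C$ for cumulative, $p$ for pop and $M$ for medPop (shorthand introduced here for readability, not part of the original post):

$$
\frac{a\,(C - M) + (a + 1)\,(M - C + p)}{p} = \frac{a\,p + (M - C + p)}{p} = a + \frac{M - (C - p)}{p}
$$

Since $C - p$ is the cumulative count below age $a$, the fraction is the share of the age-$a$ group needed to reach half the total population.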
## Re: Median Age - Demographic definition

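data have;
  input Geography $ Age _2010 Cumulative;
  cards;
;
run;

/* Split each age into ten points (age+0.1 ... age+1.0). */
data temp;
  set have;
  do i=0.1 to 1 by 0.1;
    new_age=age+i;
    output;
  end;
run;

/* Weight every point by the age group's count and take the median. */
proc means data=temp median;
  var new_age;
  freq _2010;
run;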
## Re: Median Age - Demographic definition

Reeza,

One note on the Median.  Statistically, it is not well defined when there are ties.  From the formal definition, any value between 39 and 40 is "correct".  The reference that you cite assumes that the distribution of the values between 39 and 40 is uniform.  That may be a reasonable assumption here, but less so elsewhere.

## Re: Median Age - Demographic definition

Thanks @Doc_Duke for the information.

I think the uniformity assumption for birthdates is probably fair.

## Re: Median Age - Demographic definition

And some of the issues around quantiles are why Procs Means, Summary and Tabulate have the QMETHOD and QNTLDEF options to clarify the calculation methods and results a bit.
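As an illustration only, here is the PROC MEANS step from earlier in the thread with those options written out explicitly. It assumes the same dataset temp and variables new_age and _2010 used above; QNTLDEF=5 and QMETHOD=OS are, to my knowledge, the defaults, so this should give the same result as leaving them off.

```sas
/* Sketch: state the quantile definition explicitly rather than
   relying on defaults. QNTLDEF accepts 1 through 5; QMETHOD=OS
   computes quantiles from order statistics. */
proc means data=temp median qntldef=5 qmethod=os;
  var new_age;
  freq _2010;
run;
```

Changing QNTLDEF to another value is a quick way to see how much the reported median depends on the definition when there are ties.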

## Re: Median Age - Demographic definition

@Ksharp - thanks for the reference and point!
Sometimes it's easy to accept the way things are and not question them.
