Check if median value is middle value in dataset or not

Sean_OConnor · Posted 04-20-2022 04:12 AM

Hi All,

I'm using the following PROC SUMMARY step to generate different median values:

proc summary data = Quantity_of_Properties nway completetypes;
	format class $class. year year. lea $lea.;
	class class year lea / mlf;
	var realprice;
	output out = Table_0_fil (drop=_TYPE_ /*_freq_*/)
		median = med_price;
run;

When I generate my median price values, I would like to go back into my dataset and count how many values in the distribution equal the headline median price value.

So, for example, if the median price where year=2010, class=A and lea=75 is 150,000, I would like to go back into my dataset and count how many records equal 150,000, so I can tell if the median price is actually the middle value where there is only one or a number of different values etc.

I hope this makes sense.

sbxkoenk · Posted 04-20-2022 06:44 AM

@Sean_OConnor wrote:

so I can tell if the median price is actually the middle value where there is only one or a number of different values etc.

Or zero.
The median can also be non-observed.

Check the output dataset of this code as proof:

proc summary data = sashelp.class nway completetypes;
	class sex / mlf;
	var weight;
	output out = Table_0_fil (drop=_TYPE_) 
	median= med_price;
run;

The female median is observed. The male median is not observed!

Koen

Patrick · Posted 04-20-2022 06:47 AM

This sure could be done, but WHY? Proc Summary will do the right thing, so what's the purpose of this? Plus, the median value might never exactly match an individual value in your source data, as it's the median.

Median means 50% of observations are below and 50% of observations are above the median value.
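To illustrate why a median may or may not coincide with an observed value, here is a minimal sketch on made-up data. The dataset work.demo, the variable names grp and price, and the values are all hypothetical. It assumes the default quantile definition of PROC SUMMARY (QNTLDEF=5), which averages the two middle values when a group has an even number of observations.

```sas
/* Hypothetical data: group A has 3 values (odd), group B has 4 (even) */
data work.demo;
    input grp $ price;
    datalines;
A 100
A 150
A 200
B 100
B 150
B 170
B 200
;
run;

/* One median per group, using the default quantile definition */
proc summary data=work.demo nway;
    class grp;
    var price;
    output out=work.demo_med (drop=_type_) median=med_price;
run;

/* Group A: the median is the middle value 150, which is an observed price. */
/* Group B: the median is the average of 150 and 170, i.e. 160, which is    */
/* not an observed price.                                                   */
proc print data=work.demo_med noobs;
run;
```

In group B half of the values still lie below the median and half above it, but no record carries the median itself.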

Sean_OConnor · Posted 04-20-2022 06:58 AM

Hi Patrick,
Potentially I need to do some further masking of the median values after I generate them, but before I do, I would like to know how many are exactly an individual value in the source data and how many are not.

sbxkoenk · Posted 04-20-2022 07:09 AM

proc summary data = sashelp.class nway completetypes;
	class sex / mlf;
	var weight;
	output out = Table_0_fil (drop=_TYPE_) 
	median= med_price;
run;

/* Keep every source row whose weight equals the median of its sex group */
PROC SQL noprint;
 create table work.Connor as
 select t1.* , /* t2._FREQ_ , */ t2.med_price
 from   sashelp.class as t1
      , Table_0_fil   as t2
 where     t1.sex = t2.sex
       AND t1.weight = t2.med_price ;
QUIT;
/* end of program */
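The query above returns the matching rows. If a count per group is wanted instead, including a zero for groups whose median is not observed, one possible variant is a left join from the medians table. This is only a sketch: the table names work.class_median and work.median_hits and the column n_equal_to_median are made up for illustration, and the MLF and COMPLETETYPES options are left out because sashelp.class has no multilabel format on sex.

```sas
/* Medians per group, as in the step above but self-contained */
proc summary data=sashelp.class nway;
    class sex;
    var weight;
    output out=work.class_median (drop=_type_) median=med_price;
run;

/* One row per group: the median and the number of source rows equal to it. */
/* count(c.weight) counts only matched rows, so an unobserved median gets 0. */
proc sql;
    create table work.median_hits as
    select m.sex,
           m.med_price,
           count(c.weight) as n_equal_to_median
    from work.class_median as m
         left join sashelp.class as c
           on  c.sex    = m.sex
           and c.weight = m.med_price
    group by m.sex, m.med_price;
quit;
```

For the real data, the join condition would need one equality per class variable (class, year and lea in the original question) plus the comparison of realprice with med_price.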

Koen

Patrick · Posted 04-20-2022 07:12 AM

Well... you could use output table Table_0_fil as hash table in a SAS data step and then output any record from the source table that matches the median value. Or alternatively use table Table_0_fil in a SQL inner join.
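A rough sketch of the hash table idea, using sashelp.class as a stand-in for the real data. The names work.class_median, work.median_hits_hash and n_equal_to_median are hypothetical, and this version keeps a running count per group rather than outputting the matching records.

```sas
/* Medians per group (stand-in for Table_0_fil) */
proc summary data=sashelp.class nway;
    class sex;
    var weight;
    output out=work.class_median (drop=_type_) median=med_price;
run;

data _null_;
    length n_equal_to_median 8;

    /* Hash keyed by group, holding the group's median and a running count */
    declare hash h(ordered: 'a');
    h.defineKey('sex');
    h.defineData('sex', 'med_price', 'n_equal_to_median');
    h.defineDone();

    /* Load one entry per group with the count initialised to zero */
    do until (eof_med);
        set work.class_median end=eof_med;
        n_equal_to_median = 0;
        h.add();
    end;

    /* Scan the source data; on an exact match, increment the stored count */
    do until (eof_src);
        set sashelp.class (keep=sex weight) end=eof_src;
        if h.find() = 0 and weight = med_price then do;
            n_equal_to_median = n_equal_to_median + 1;
            h.replace();
        end;
    end;

    /* Write the hash contents out: one row per group */
    h.output(dataset: 'work.median_hits_hash');
    stop;
run;
```

With several class variables, each of them would be listed in defineKey and defineData. The source table is read once and does not need to be sorted.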

PaigeMiller · Posted 04-20-2022 07:23 AM

@Sean_OConnor wrote:
Hi Patrick,
Potentially I need to do some further masking of the median values after I generate them, but before I do, I would like to know how many are exactly an individual value in the source data and how many are not.

In my opinion, this doesn't actually explain anything. I agree with @Patrick that there seems to be no mathematical logic behind finding out how many observations exactly match the median, and I don't see any benefit. Just because you CAN do a calculation doesn't mean you should do that calculation.

--
Paige Miller
