Sean_OConnor
Obsidian | Level 7

Hi All,

 

I'm using the following PROC SUMMARY step to generate median values for different groups:

 

proc summary data = Quantity_of_Properties nway completetypes;
	format class $class. year year. lea $lea.;
	class class year lea / mlf;
	var realprice;
	output out = Table_0_fil (drop=_TYPE_ /*_freq_*/)
	median = med_price;
run;

Once I have generated the median prices, I would like to go back into my dataset and count how many values in each distribution equal the headline median price.

So, for example, if the median price where year=2010, class=A and lea=75 is 150,000, I would like to go back into my dataset and count how many records equal 150,000, so I can tell if the median price is actually the middle value where there is only one or a number of different values etc.

 

I hope this makes sense. 

 

6 REPLIES
sbxkoenk
SAS Super FREQ

@Sean_OConnor wrote:

so I can tell if the median price is actually the middle value where there is only one or a number of different values etc.

 


Or zero.
The median is not necessarily an observed value: it may match no record in the data at all.

 

Check the output dataset of this code as proof:

proc summary data = sashelp.class nway completetypes;
	class sex / mlf;
	var weight;
	output out = Table_0_fil (drop=_TYPE_) 
	median= med_price;
run;

The female median is observed. The male median is not observed!
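A likely explanation, offered as a sketch rather than a definitive diagnosis: sashelp.class has 9 female and 10 male rows, and under the default quantile definition (QNTLDEF=5) an even group size with two different middle values yields their average, which need not appear in the data. If an observed value is required, the QNTLDEF= option on the PROC statement can be changed. The example below assumes QNTLDEF=3 (empirical distribution function without averaging) is acceptable for the analysis; the output names medians_observed and med_weight are made up for illustration.

```sas
/* Sketch: request a median definition that always returns an observed value. */
/* QNTLDEF=3 picks an order statistic instead of averaging two middle values. */
proc summary data=sashelp.class nway qntldef=3;
    class sex;
    var weight;
    output out=work.medians_observed (drop=_type_) median=med_weight;
run;
```

With this definition the reported median is, by construction, one of the values in each group, although it is no longer the midpoint of the two middle values when the group size is even.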

 

Koen

Patrick
Opal | Level 21

This sure could be done, but WHY? Proc Summary will do the right thing, so what's the purpose of this? Plus, the median might never exactly match an individual value in your source data.

The median is the value that splits the distribution in half: at most 50% of observations lie below it and at most 50% lie above it.

Sean_OConnor
Obsidian | Level 7
Hi Patrick,
Potentially I need to do some further masking of the median values after I generate them, but before I do, I would like to know how many are exactly an individual value in the source data and how many are not.
sbxkoenk
SAS Super FREQ
proc summary data = sashelp.class nway completetypes;
	class sex / mlf;
	var weight;
	output out = Table_0_fil (drop=_TYPE_) 
	median= med_price;
run;

PROC SQL noprint;
 create table work.Connor as
 select t1.* , /* t2._FREQ_ , */ t2.med_price
 from   sashelp.class as t1
      , Table_0_fil   as t2
 where     t1.sex = t2.sex
       AND t1.weight = t2.med_price ;
QUIT;
/* end of program */
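
The join above returns the matching rows themselves. To get the count asked for in the question, the same join can be aggregated per group. This is a minimal sketch on sashelp.class; the table names medians and median_counts and the columns n_obs and n_equal_median are hypothetical. It also assumes the class variable holds raw values: with format-grouped or MLF class variables, as in the original step, the join key would probably need to be the formatted value (for example via PUT()).

```sas
/* Medians per group, same idea as the step above */
proc summary data=sashelp.class nway;
    class sex;
    var weight;
    output out=work.medians (drop=_type_) median=med_price;
run;

/* One row per group: the median and the number of source rows equal to it. */
/* The left join keeps groups whose median matches no row (count = 0).      */
proc sql;
    create table work.median_counts as
    select m.sex,
           m.med_price,
           m._freq_        as n_obs,
           count(c.weight) as n_equal_median
    from work.medians as m
         left join sashelp.class as c
           on  c.sex    = m.sex
           and c.weight = m.med_price
    group by m.sex, m.med_price, m._freq_;
quit;
```

Because count(c.weight) ignores the missing values produced by unmatched rows, a group whose median is not an observed value gets 0 rather than disappearing from the result. The equality test is exact, so an averaged median is only counted when it coincides with a stored value.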

Koen

Patrick
Opal | Level 21

Well... you could load the output table Table_0_fil into a hash table in a SAS data step and then output any record from the source table that matches the median value. Or, alternatively, use table Table_0_fil in a SQL inner join.
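
The hash variant could look roughly like this. It is a sketch on sashelp.class, not a tested solution for the original data, and the output name median_matches is hypothetical.

```sas
/* Medians per group, as in the earlier replies */
proc summary data=sashelp.class nway;
    class sex;
    var weight;
    output out=work.Table_0_fil (drop=_type_) median=med_price;
run;

data work.median_matches;
    /* Add med_price to the program data vector without reading any rows */
    if 0 then set work.Table_0_fil (keep=sex med_price);

    /* Load the medians into a hash object once, keyed by the class variable */
    if _n_ = 1 then do;
        declare hash med (dataset: 'work.Table_0_fil');
        med.defineKey('sex');
        med.defineData('med_price');
        med.defineDone();
    end;

    set sashelp.class;

    /* find() returns 0 when the key exists and fills med_price for that group */
    if med.find() = 0 and weight = med_price then output;
run;
```

The resulting dataset holds only the source rows equal to their group's median, so a PROC FREQ on the class variable would give the counts; groups with no matching row simply do not appear.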

PaigeMiller
Diamond | Level 26

@Sean_OConnor wrote:
Hi Patrick,
Potentially I need to do some further masking of the median values after I generate them, but before I do, I would like to know how many are exactly an individual value in the source data and how many are not.

In my opinion, this doesn't actually explain anything. I agree with @Patrick that there seems to be no mathematical logic behind finding out how many observations exactly match the median, and I don't see any benefit. Just because you CAN do a calculation doesn't mean you should do that calculation.

--
Paige Miller
