Hi All,
I'm using the following proc summary step to generate median values for different groups:
proc summary data = Quantity_of_Properties nway completetypes;
format class $class. year year. lea $lea.;
class class year lea /mlf;
var realprice;
output out = Table_0_fil (drop=_TYPE_ /*_freq_*/)
median= med_price;
run;
Once I have generated my median price values, I would like to go back into my dataset and count how many values in the distribution equal the headline median price.
So, for example, if the median price where year=2010, class=A and lea=75 is 150,000, I would like to go back into my dataset and count how many records equal 150,000, so I can tell whether the median price is actually the middle value, and whether only one record or a number of different records have that value.
I hope this makes sense.
@Sean_OConnor wrote:
so I can tell if the median price is actually the middle value where there is only one or a number of different values etc.
Or zero.
The median can also be non-observed.
Check the output dataset of this code as proof:
proc summary data = sashelp.class nway completetypes;
class sex / mlf;
var weight;
output out = Table_0_fil (drop=_TYPE_)
median= med_price;
run;
The female median is observed. The male median is not observed!
Koen
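This could certainly be done, but WHY? Proc Summary will do the right thing, so what's the purpose of this? Plus, the median might never exactly match an individual value in your source data, because it's the median.
Median means at least 50% of observations are at or below, and at least 50% are at or above, the median value. With an even number of observations, the default quantile definition averages the two middle values, so the result need not be an observed value.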
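To make the even-count case concrete, here is a minimal sketch using made-up toy data (the dataset work.toy and variable x are hypothetical, not from the original question). With four values there is no single middle record, and under the default quantile definition the reported median is the average of the two middle values, which is not one of the inputs.

```sas
/* Toy data (hypothetical): four values, so there is no single middle value */
data work.toy;
  input x @@;
  datalines;
1 2 3 4
;
run;

/* With default settings the median is the average of 2 and 3, i.e. 2.5 */
proc means data=work.toy median;
  var x;
run;
```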
proc summary data = sashelp.class nway completetypes;
class sex / mlf;
var weight;
output out = Table_0_fil (drop=_TYPE_)
median= med_price;
run;
PROC SQL noprint;
create table work.Connor as
select t1.* , /* t2._FREQ_ , */ t2.med_price
from sashelp.class as t1
, Table_0_fil as t2
where t1.sex = t2.sex
AND t1.weight = t2.med_price ;
QUIT;
/* end of program */
Koen
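If the goal is a count per group rather than the matching rows themselves, one possible variation is sketched below. It assumes Table_0_fil (with sex and med_price) was created by the PROC SUMMARY step above; work.median_counts and n_at_median are hypothetical names. A left join from the summary table keeps groups whose median is not observed, so they show up with a count of 0 instead of disappearing.

```sas
/* Sketch: count, per class level, how many source rows equal the median. */
/* Assumes Table_0_fil (sex, med_price) already exists.                   */
proc sql;
  create table work.median_counts as
  select t2.sex,
         t2.med_price,
         /* count() ignores missing values, so unmatched groups get 0 */
         count(t1.weight) as n_at_median
  from Table_0_fil as t2
       left join sashelp.class as t1
         on  t1.sex    = t2.sex
         and t1.weight = t2.med_price
  group by t2.sex, t2.med_price;
quit;
```

For sashelp.class this should give a count of 1 for F and 0 for M, in line with the observed/non-observed result noted earlier. With formatted class variables and the MLF option, as in the original program, the class columns in the output hold formatted values, so the join keys would probably need the same formats applied (for example with put()) on the source side.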
Well... you could load the output table Table_0_fil into a hash table in a SAS data step and then output any record from the source table that matches the median value. Or, alternatively, use table Table_0_fil in a SQL inner join.
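A rough sketch of the hash table variant, using the sashelp.class example from earlier in the thread, might look like the following. It assumes Table_0_fil (with sex and med_price) already exists; work.at_median is a hypothetical output name. For the original data, all class variables (class, year, lea) would go into defineKey.

```sas
/* Sketch of the hash lookup approach; assumes Table_0_fil exists. */
data work.at_median;
  /* Put sex and med_price into the PDV without reading any rows */
  if 0 then set Table_0_fil (keep=sex med_price);
  if _n_ = 1 then do;
    /* Load the summary table once, keyed by the class variable */
    declare hash med (dataset: 'Table_0_fil');
    med.defineKey('sex');
    med.defineData('med_price');
    med.defineDone();
  end;
  set sashelp.class;
  /* find() returns 0 when the key exists and fills med_price */
  if med.find() = 0 and weight = med_price then output;
run;
```

The resulting table holds only the source rows whose value equals their group's median, so a group with a non-observed median contributes no rows.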
@Sean_OConnor wrote:
Hi Patrick,
Potentially I need to do some further masking of the median values after I generate them, but before I do, I would like to know how many exactly match an individual value in the source data and how many do not.
In my opinion, this doesn't actually explain anything. I agree with @Patrick that there seems to be no mathematical logic behind finding out how many observations exactly match the median, and I don't see any benefit. Just because you CAN do a calculation doesn't mean you should do that calculation.