proc univariate, median's option

New Contributor
Posts: 3

Hello,

Consider this basic dataset:

Var1  Var2
A     2
D     3
C     4

I'm using PROC UNIVARIATE to calculate the median of Var2, which is straightforward. But I also need to save the corresponding value of Var1 in the output dataset (in this case "D"). Is there an option in PROC UNIVARIATE to do that?

Thank you

Regular Contributor
Posts: 151

Re: proc univariate, median's option

But the median can sometimes fall between two values (e.g. the median of 1,2,3,4 is 2.5), or several rows can share the median value (e.g. the median of 1,2,2,2,3 is 2, which occurs three times).  How would you want to treat those cases?
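A quick way to check the two cases mentioned above is the MEDIAN function in a DATA step. This is only an illustration; the variable names even_case and tied_case are made up for the example.

```sas
data _null_;
   /* Four values: the median falls between 2 and 3, so it is not an observed value */
   even_case = median(1, 2, 3, 4);
   /* Five values: the median equals 2, a value that occurs in three rows */
   tied_case = median(1, 2, 2, 2, 3);
   /* Write both results to the log */
   put even_case= tied_case=;
run;
```

The log should show the same figures quoted in the post: 2.5 for the first list and 2 for the second.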

New Contributor
Posts: 3

Re: proc univariate, median's option

Hi Keith,

I was just thinking the same. In any case, for my purposes, multiple rows with the median value do not affect the quality of the results. I could simply choose one of them.

However I think that proc univariate does not have this kind of option. Does it?

Should that be the case, I imagine that I'd have to remerge my results with the original dataset. Am I right?

Thanks again

Regular Contributor
Posts: 151

Re: proc univariate, median's option

I don't believe such an option exists, so your suggestion is one way to go, although you'll have to deal with the situations I described.  Another method would be to sort the data by Var2, then loop through the observations until Var2 >= median and output that observation.
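The sort-and-loop idea could be sketched roughly as follows. The dataset names HAVE, MED, HAVE_SORTED and MEDIAN_ROW and the variable Var2_median are assumptions made for this example, not names from the thread.

```sas
/* Hypothetical input dataset holding the example data from the question */
data have;
   input Var1 $ Var2;
   datalines;
A 2
D 3
C 4
;
run;

/* Write the median of Var2 to a one-row dataset */
proc univariate data=have noprint;
   var Var2;
   output out=med median=Var2_median;
run;

/* Order the observations by the analysis variable */
proc sort data=have out=have_sorted;
   by Var2;
run;

/* Read the median once, then walk the sorted data and keep
   the first observation whose Var2 reaches the median */
data median_row;
   if _n_ = 1 then set med;
   set have_sorted;
   if Var2 >= Var2_median then do;
      output;
      stop;
   end;
run;
```

For the three-row example this keeps the row with Var1 = "D". When the median lies between two values, the sketch keeps the upper of the two middle rows; when several rows share the median, it keeps whichever of them comes first after sorting.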

New Contributor
Posts: 3

Re: proc univariate, median's option

OK, I think I'll go with the merge solution. I was looking for an option, but it's clear that I have to do some additional work...

Thanks for your useful advice
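One possible shape of the merge solution, shown here as a PROC SQL join rather than a DATA step MERGE. The names HAVE, MED, MEDIAN_ROWS and Var2_median are hypothetical.

```sas
/* Hypothetical input dataset holding the example data from the question */
data have;
   input Var1 $ Var2;
   datalines;
A 2
D 3
C 4
;
run;

/* Write the median of Var2 to a one-row dataset */
proc univariate data=have noprint;
   var Var2;
   output out=med median=Var2_median;
run;

/* Join the one-row median back to the original data and
   keep every row whose Var2 equals the median */
proc sql;
   create table median_rows as
   select h.Var1, h.Var2, m.Var2_median
   from have as h, med as m
   where h.Var2 = m.Var2_median;
quit;
```

This returns no rows when the median falls between two observed values and several rows when the median value is tied, so those cases still need a rule of their own.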

Super User
Posts: 19,789

Re: proc univariate, median's option

The median may not be a value in your dataset, or it may be multiple values. Something to consider.

Respected Advisor
Posts: 2,655

Re: proc univariate, median's option

That brings to mind: what about using PROC RANK?  If the number of observations N is odd, select the record with rank = floor(N/2) + 1.  If N is even, select the two records with ranks N/2 and N/2 + 1 and take the mean of their values.  I'm sure there is a fairly straightforward way to program this in a data step after ranking the values.

Steve Denham
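A rough sketch of the PROC RANK idea, assuming Var2 has no tied and no missing values so that the ranks run from 1 to N. The dataset and variable names (HAVE, RANKED, MIDDLE, MED_FROM_RANK, Var2_rank, Var2_median) are made up for the example.

```sas
/* Hypothetical input dataset holding the example data from the question */
data have;
   input Var1 $ Var2;
   datalines;
A 2
D 3
C 4
;
run;

/* Add the rank of Var2 as a new variable */
proc rank data=have out=ranked;
   var Var2;
   ranks Var2_rank;
run;

/* Keep the middle record when N is odd, or the two middle records when N is even */
data middle;
   set ranked nobs=n;
   upper = floor(n/2) + 1;
   lower = ifn(mod(n, 2) = 1, upper, upper - 1);
   if lower <= Var2_rank <= upper;
   drop lower upper;
run;

/* The mean of the kept values is the median */
proc means data=middle noprint;
   var Var2;
   output out=med_from_rank mean=Var2_median;
run;
```

The MIDDLE dataset still carries Var1, which is what the original question asked for. With tied values the ranks are no longer the integers 1 to N, so the selection rule would need to be adapted.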

SAS Super FREQ
Posts: 3,752

Re: proc univariate, median's option

Posted in reply to SteveDenham

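Generalizing Steve's suggestion, why not just sort and then print out the floor(N/2)+1 observation, like this:

/* Store the index of the middle observation, floor(N/2)+1, in a macro variable */
data _NULL_;
   if 0 then set sashelp.class nobs=n;   /* never executes; NOBS= is filled in at compile time */
   call symputx('MedIndex',floor(n/2)+1);
   stop;
run;

/* Sort by the analysis variable so that index points at the middle value */
proc sort data=sashelp.class out=class;
   by age;
run;

/* Print only the observation at that index */
proc print data=class(firstobs=&MedIndex obs=&MedIndex);
run;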
