proc univariate, median's option

06-25-2013 06:29 AM

Hello,

look at this basic dataset

Var1  Var2
A     2
D     3
C     4

I'm using PROC UNIVARIATE to calculate the median of Var2, which is straightforward so far. But I also need to save the corresponding value of Var1 in the output dataset (in this case "D"). Is there an option in PROC UNIVARIATE to do that?

Thank you

Posted in reply to Enomis

06-25-2013 08:37 AM

But the median can sometimes be between 2 values (e.g. 1,2,3,4 = 2.5), or there could be multiple rows that have the median value (e.g. 1,2,2,2,3 = 2). How would you want to treat those?

Posted in reply to Keith

06-25-2013 09:11 AM

Hi Keith,

I was just thinking the same. In any case, for my purposes, multiple matching rows do not affect the quality of the results; I could pick any one of them.

However I think that proc univariate does not have this kind of option. Does it?

Should that be the case, I imagine that I'd have to remerge my results with the original dataset. Am I right?
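A minimal sketch of the remerge idea, using sashelp.class (which ships with SAS) so it can be run as is. The names MED, AGE_MEDIAN and WANT are illustrative. Note that the join matches on exact equality, so it returns no rows when the median falls between two data values and several rows when the median value is repeated.

```sas
/* Step 1: write the median of AGE to a one-row dataset.            */
/* MED and AGE_MEDIAN are illustrative names, not from the thread.  */
proc univariate data=sashelp.class noprint;
   var age;
   output out=med median=age_median;
run;

/* Step 2: join the median back to the original rows.               */
/* Only rows whose AGE equals the median exactly are kept, so the   */
/* result may contain zero, one, or several observations.           */
proc sql;
   create table want as
   select c.*, m.age_median
   from sashelp.class as c, med as m
   where c.age = m.age_median;
quit;
```

If exactly one row is needed, a further step has to choose among the matches.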

Thanks again

Posted in reply to Enomis

06-25-2013 09:38 AM

I don't believe such an option exists, so your suggestion is one way to go, although you'll have to deal with the situations I described. Another method would be to sort the data by Var2, then step through the observations until Var2 >= median and output that observation.
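The sort-and-scan method could look roughly like the sketch below, built on the three-row example from the question. The dataset names HAVE, MED, SORTED and WANT and the variable Var2_median are assumptions made for illustration. When the median falls between two values, this keeps the first row above it; when the median value is repeated, it keeps the first such row in sort order.

```sas
/* Example data from the question; HAVE is an assumed name */
data have;
   input Var1 $ Var2;
   datalines;
A 2
D 3
C 4
;
run;

/* Median of Var2 in a one-row dataset */
proc univariate data=have noprint;
   var Var2;
   output out=med median=Var2_median;
run;

/* Sort so the rows can be scanned in ascending order of Var2 */
proc sort data=have out=sorted;
   by Var2;
run;

/* Keep the first observation whose Var2 is at or above the median */
data want;
   if _n_ = 1 then set med;   /* Var2_median is retained on every row */
   set sorted;
   if Var2 >= Var2_median then do;
      output;
      stop;                   /* one row is enough */
   end;
run;
```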

Posted in reply to Keith

06-25-2013 10:06 AM

OK, I think I'll go with the merge solution. I was looking for an option, but it's clear that I have to do some additional work...

Thanks for your useful advice

Posted in reply to Enomis

06-25-2013 11:11 AM

The median may not be a value in your dataset, or it may be multiple values. Something to consider.

Posted in reply to Reeza

06-26-2013 01:35 PM

That brings to mind: what about using PROC RANK? With N observations, select the record with rank = floor(N/2) + 1 if N is odd; if N is even, select both records with ranks floor(N/2) and floor(N/2) + 1 and take the mean of those two values. I'm sure there is a fairly straightforward way to program this in a data step after ranking the values.
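One possible way to program the ranking idea is sketched below. It assumes Var2 has no tied and no missing values, because PROC RANK assigns averaged ranks to ties by default and the target rank may then not exist. All dataset and variable names (HAVE, RANKED, WANT, Var2_rank, MED_FROM_RANK) are illustrative.

```sas
/* Example data from the question; HAVE is an assumed name */
data have;
   input Var1 $ Var2;
   datalines;
A 2
D 3
C 4
;
run;

/* Rank Var2 in ascending order; the rank is stored in Var2_rank */
proc rank data=have out=ranked;
   var Var2;
   ranks Var2_rank;
run;

/* Keep the middle record (odd N) or the two middle records (even N) */
data want;
   set ranked nobs=n;
   lower = floor(n/2);
   upper = floor(n/2) + 1;
   if mod(n, 2) = 1 then keep_row = (Var2_rank = upper);
   else keep_row = (Var2_rank = lower or Var2_rank = upper);
   if keep_row;
   drop lower upper keep_row;
run;

/* The mean of the kept rows is the median under the no-ties assumption */
proc means data=want noprint;
   var Var2;
   output out=med_from_rank mean=Var2_median;
run;
```

WANT still carries Var1, so the identifying value stays next to the median row or rows.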

Steve Denham

Posted in reply to SteveDenham

06-27-2013 08:24 AM

Generalizing Steve's suggestion, why not just sort and then print observation number floor(N/2)+1, like this:

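data _NULL_;
   /* NOBS= is filled in at compile time, so no observations are read */
   if 0 then set sashelp.class nobs=n;
   call symputx('MedIndex', floor(n/2)+1);
   stop;
run;

/* Sort by the analysis variable so that row &MedIndex is the middle row */
proc sort data=sashelp.class out=class;
   by age;
run;

/* Print only the middle observation */
proc print data=class(firstobs=&MedIndex obs=&MedIndex);
run;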