zeloc
Calcite | Level 5

Is there a SAS procedure to do a regression with a rank as the outcome variable?

Thanks!

7 REPLIES
Doc_Duke
Rhodochrosite | Level 12

If your sample size is reasonably large, you can just do regular regression using the ranks.  The ranks are asymptotically normal, so the inferences are valid.
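A minimal sketch of the idea of running ordinary least squares with ranks as the outcome, written in Python rather than SAS and using only the standard library. The helper names (`average_ranks`, `simple_ols`), the single predictor, and the simulated data are all assumptions made for illustration; nothing here comes from the original poster's dataset.

```python
import random
import statistics

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: 100 observations, one predictor, outcome stored as a rank.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(100)]
score = [2 * a + random.gauss(0, 1) for a in x]  # latent score, assumed
rank_y = average_ranks(score)                    # the rank is the outcome

intercept, slope = simple_ols(x, rank_y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

The slope is read as the expected change in rank per unit change in the predictor. Whether the usual standard errors can be trusted at a given sample size is a separate question that this sketch does not answer.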

zeloc
Calcite | Level 5

Thanks for your response. Are you saying to use a linear regression and invoke the central limit theorem? My sample size is only around 100, so I don't think it would qualify. If I don't rely on the central limit theorem, the outcome should be normally distributed, and if the outcome is a ranking, then a graph of its distribution would be a horizontal line and completely nonnormal. I'm not sure what you mean by "the ranks are asymptotically normal". Thanks.

zeloc
Calcite | Level 5

Any other thoughts? I think the suggestion was to do a linear regression and rely on the central limit theorem, but my sample size is not that large. Or would 100 be considered okay? Is there no SAS procedure to run a regression with ranks as the outcome?

Doc_Duke
Rhodochrosite | Level 12

In something like a t-test, 20 in a group is sufficiently large for the CLT to be used.  How large you might need depends on how many predictor variables you are considering.  Another option might be "robust regression", see

http://www2.sas.com/proceedings/sugi27/p265-27.pdf .

zeloc
Calcite | Level 5

My understanding is that the CLT is not related to the number of predictors but to the degree of non-normality of the data. For a dataset that is very nonnormal, a very large sample size may be needed, while for a mild departure from normality a much smaller sample may be enough. There is no specific cutoff number for linear regression.

A bigger dataset is needed when there are many predictors, but that is because of overfitting, not the CLT.
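To illustrate the point about non-normality, here is a rough simulation sketch in Python (standard library only). It estimates the skewness of the sampling distribution of a mean for a strongly skewed source (exponential) and a symmetric one (uniform, which is what untied ranks look like). The function names, sample sizes, and replication count are arbitrary choices for illustration, and it looks at a sample mean rather than a regression coefficient.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def skew_of_sample_means(draw, n, reps=2000):
    """Skewness of the means of `reps` samples, each of size n."""
    means = [statistics.fmean([draw() for _ in range(n)]) for _ in range(reps)]
    return skewness(means)

rng = random.Random(1)

def exp_draw():
    return rng.expovariate(1.0)  # strongly right-skewed source

def unif_draw():
    return rng.random()          # flat, symmetric source

skew_exp_5 = skew_of_sample_means(exp_draw, 5)
skew_exp_100 = skew_of_sample_means(exp_draw, 100)
skew_unif_5 = skew_of_sample_means(unif_draw, 5)

print(f"exponential, n=5:   {skew_exp_5:.3f}")
print(f"exponential, n=100: {skew_exp_100:.3f}")
print(f"uniform, n=5:       {skew_unif_5:.3f}")
```

Values nearer zero indicate a more symmetric, closer-to-normal sampling distribution. How close is close enough for a particular regression is a judgment call that this sketch does not settle.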

Ksharp
Super User

In my opinion, it is very hard to use linear regression here.

Linear regression starts by assuming that the residuals follow a known distribution, such as the normal.

But your data are ranks, which is to say you should use a nonparametric method instead.

Ksharp

zeloc
Calcite | Level 5

Hi Ksharp, yes, I was asking what method could be used instead of linear regression, and whether there is a particular regression aimed at ranks. However, I should still be able to use the CLT based on sample size and run it as a linear regression.
