Agate
Calcite | Level 5

I am using SAS Enterprise Guide version 6.100 (6.100.0.2870) (64-bit) ODA.

I am analyzing which variables influence the length of stay in hospital.

The dependent variable is DaysOfStay. This variable is not normally distributed.

Is there a way in SAS Enterprise Guide to normalize the distribution?

Mean: 55.80348
Median: 42.50000
Mode: 15.00000
Skewness: 4.529
Kurtosis: 27.100

Thanks

Best Regards

Agate

7 REPLIES
Anotherdream
Quartz | Level 8

Why do you need the variable to have a normal distribution to determine whether other variables influence it? Two variables can have any distribution and still influence each other.

Can you give further explanation?

Reeza
Super User

The assumption of normality for regression is for the errors, not the variables, though the assumption of normality matters for other tests.

1. You can standardize a variable to get a normal distribution.

2. You can use non-parametric methods if your data doesn't meet the assumptions (e.g., normality).

3. I would also look at a histogram of the data to assess normality, not just summary statistics; you may have an outlier problem you want to deal with.
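The histogram check in point 3 could be sketched as follows. This is a minimal example, not the only way to do it: the data set name work.hospital is a hypothetical placeholder, and only the variable name DaysOfStay comes from the question.

```sas
/* Sketch only: work.hospital is a hypothetical data set name; */
/* DaysOfStay is the variable named in the question.           */
proc univariate data=work.hospital normal;   /* NORMAL requests tests for normality */
    var DaysOfStay;
    histogram DaysOfStay / normal;           /* histogram with a fitted normal curve overlaid */
    qqplot DaysOfStay / normal(mu=est sigma=est);  /* Q-Q plot against a normal with estimated parameters */
run;
```

A long right tail in the histogram, or points bending away from the reference line in the Q-Q plot, would point to skewness or outliers that the summary statistics alone can hide.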

Testing the assumptions of linear regression

Agate
Calcite | Level 5

Thank you Reeza for the reply.

I do have to achieve normality for other tests.

Could you advise how to standardize data in order to get a normal distribution?

And which non-parametric methods could I use?

Thanks

Kind Regards

Agate

MohammadFayaz
Calcite | Level 5

Hi,

You can use a Box-Cox transformation with PROC TRANSREG in SAS to achieve normality. Judging by the summary statistics, though, a log transformation may work well for your data. One of the main problems with transformations is interpreting the results.
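A Box-Cox search with PROC TRANSREG might look roughly like this. Treat it as a sketch under assumptions: the data set work.hospital and the predictor Age are hypothetical, and the lambda grid is just one reasonable choice. Box-Cox requires a strictly positive response.

```sas
/* Sketch only: work.hospital and Age are hypothetical names.   */
/* DaysOfStay must be strictly positive for Box-Cox to apply.   */
proc transreg data=work.hospital;
    /* Search lambda from -2 to 2; lambda = 0 corresponds to the log transform */
    model boxcox(DaysOfStay / lambda=-2 to 2 by 0.25) = identity(Age);
run;
```

If the selected lambda is close to 0, that would support the log transformation suggested above.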

Non-parametric methods would also be useful, though with lower power.

But you can also fit the model with Generalized Linear Models (GLM). Length of stay in hospital is often modeled with a GLM approach.
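As one possible illustration of the GLM route, here is a sketch using PROC GENMOD. The gamma distribution with a log link is my assumption as a common choice for positive, right-skewed responses; it is not something stated in this thread. The data set work.hospital and the predictors Age and Sex are hypothetical.

```sas
/* Sketch only: work.hospital, Age and Sex are hypothetical names.      */
/* Gamma with a log link is an assumed choice for a positive,           */
/* right-skewed response; other distributions may fit your data better. */
proc genmod data=work.hospital;
    class Sex;                                   /* categorical predictor */
    model DaysOfStay = Age Sex / dist=gamma link=log;
run;
```

With this approach the response stays on its original scale, so there is no back-transformation step when interpreting the fitted means.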

Check the following paper to learn more about modeling length of stay:

http://www.ncbi.nlm.nih.gov/pubmed/9630132

The full PDF of the paper can be found by searching Google. SAS EG has a very good menu for GLM.

Agate
Calcite | Level 5

Hello,

As I mentioned earlier, the variable LOS is positively skewed.

I tried to solve the problem by applying a log transformation. I need to normalize the variable to meet the assumptions of multiple linear regression...

I ran the following code:

data FYP.LOS_OUTLIERS_LOG;
    set FYP.LOS_OUTLIERS_LOG;
    lny = log(DaysOfStay);   /* natural logarithm (base e) */
run;

but the new variable still seems to be skewed.

I wonder whether the code I ran is correct?
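For reference, the DATA step above is valid SAS: LOG returns the natural logarithm and gives a missing value for zero or negative inputs. One way to see how much the transformation helped is to compare the two variables side by side, as in this minimal sketch that uses the data set and variable names from the code above:

```sas
/* Compare the raw and log-transformed variables side by side. */
/* NMISS shows how many values became missing (LOG of 0 or a   */
/* negative number is missing).                                */
proc means data=FYP.LOS_OUTLIERS_LOG n nmiss mean median skewness kurtosis;
    var DaysOfStay lny;
run;
```

A log transformation usually reduces right skew, but it does not guarantee a symmetric result, so some remaining skewness in lny does not mean the code is wrong.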

Thanks

Best Regards

Agate

Reeza
Super User

Testing the assumptions of linear regression

Which one of the 4 assumptions are you violating?

Agate
Calcite | Level 5

Hello,

I am violating the assumption of normally distributed errors.
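Since this assumption concerns the errors rather than the response itself, one way to check it is to save the residuals from the fitted model and examine those. This is a sketch under assumptions: Age and NumDiagnoses are hypothetical numeric predictors, work.resid is a hypothetical output data set, and lny is the log-transformed variable created earlier in the thread.

```sas
/* Sketch only: Age and NumDiagnoses are hypothetical predictors; */
/* work.resid is a hypothetical output data set.                  */
proc reg data=FYP.LOS_OUTLIERS_LOG;
    model lny = Age NumDiagnoses;
    output out=work.resid r=residual p=predicted;  /* save residuals and fitted values */
run;
quit;

/* Test and plot the residuals, not the raw response */
proc univariate data=work.resid normal;
    var residual;
    qqplot residual / normal(mu=est sigma=est);
run;
```

If the residuals look roughly normal on the Q-Q plot, the assumption may be adequately met even though DaysOfStay itself is skewed.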

