I am using SAS Enterprise Guide version 6.100 (6.100.0.2870) (64-bit) ODA.
I am analyzing which variables influence the length of stay in hospital.
The dependent variable is DaysOfStay. This variable does not have a normal distribution.
Is there a way in SAS Enterprise Guide to normalize the distribution?
Mean | 55.80348 |
Median | 42.50000 |
Mode | 15.00000 |
Skewness | 4.529 |
Kurtosis | 27.100 |
Thanks
Best Regards
Agate
Why do you need the variable to have a normal distribution to determine whether other variables influence it? Two variables can have any distribution and still influence each other.
Can you give further explanation?
The assumption of normality for regression is for the errors, not the variables, though the assumption of normality matters for other tests.
1. You can transform a variable to bring it closer to a normal distribution (standardizing alone only rescales it and does not change its shape).
2. You can use non-parametric methods if your data doesn't meet the assumptions (e.g. normality).
3. I would also look at a histogram of the data to assess normality, not just the statistics. You may have an outlier problem you want to deal with.
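As a minimal sketch of the third suggestion, the step below draws a histogram with a fitted normal curve and requests normality tests. It assumes the data set name FYP.LOS_OUTLIERS_LOG and the variable DaysOfStay that appear later in this thread; adjust both to your own data.

```sas
/* Sketch only: data set and variable names are taken from the thread. */
proc univariate data=FYP.LOS_OUTLIERS_LOG normal;   /* NORMAL requests normality tests */
    var DaysOfStay;
    histogram DaysOfStay / normal;                  /* overlay a fitted normal curve */
    qqplot DaysOfStay / normal(mu=est sigma=est);   /* points far off the line suggest skew or outliers */
run;
```

The extreme observations table in the default output can help spot outliers before deciding on a transformation.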
Thank you Reeza for the reply.
I do have to achieve normality for other tests.
Could you advise how to standardize the data in order to get a normal distribution?
And which non-parametric methods could I use?
Thanks
Kind Regards
Agate
Hi,
You can use a Box-Cox transformation with PROC TRANSREG in SAS to move toward normality. Judging by the summary statistics, though, a log transformation may work well for your data. One of the main problems with transformations is interpreting the results.
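A minimal sketch of a Box-Cox search with PROC TRANSREG might look like the following. The predictors Age and NumDiagnoses are hypothetical placeholders, not variables mentioned in this thread, and the dependent variable must be strictly positive.

```sas
/* Sketch only: Age and NumDiagnoses are hypothetical predictors. */
proc transreg data=FYP.LOS_OUTLIERS_LOG;
    /* Search lambda from -2 to 2; CONVENIENT prefers a round value
       (such as 0, the log) when it lies within the confidence interval. */
    model boxcox(DaysOfStay / lambda=-2 to 2 by 0.25 convenient)
          = identity(Age NumDiagnoses);
run;
```

If the selected lambda is 0, that corresponds to the log transformation.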
Non-parametric methods are also an option, though they have lower power.
You can also fit the model with Generalized Linear Models (GLM). A GLM approach is often used for modeling length of stay in hospital.
Check the following paper to learn more about modeling length of stay:
http://www.ncbi.nlm.nih.gov/pubmed/9630132
The full PDF of the above paper can be found by searching on Google. SAS EG has a very good menu for GLM.
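For those who prefer code to the menu, one possible GLM for a positive, right-skewed outcome is a gamma model with a log link in PROC GENMOD. This is an assumption for illustration, not necessarily the model recommended in the paper above, and the predictors Age and NumDiagnoses are hypothetical.

```sas
/* Sketch only: gamma/log is one common choice for skewed positive outcomes;
   Age and NumDiagnoses are hypothetical predictors. */
proc genmod data=FYP.LOS_OUTLIERS_LOG;
    model DaysOfStay = Age NumDiagnoses / dist=gamma link=log;
run;
```

With a log link, exponentiated coefficients can be read as multiplicative effects on the mean length of stay, which avoids back-transforming the response.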
Hello,
As I mentioned earlier, the variable LOS is positively skewed.
I was trying to solve the problem by applying a log transformation. The reason I need to normalize the variable is to meet the assumptions of multiple linear regression.
I ran the following code:
data FYP.LOS_OUTLIERS_LOG;
    set FYP.LOS_OUTLIERS_LOG;
    /* Natural logarithm (base e); LOG returns missing for values <= 0 */
    lny = log(DaysOfStay);
run;
but the new variable still seems to be skewed.
I wonder if the code I ran is correct?
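One way to see how much the transformation helped is to compare the shape statistics of the original and transformed variables side by side. This is a minimal sketch using the data set and variable names from the code above.

```sas
/* Compare skewness and kurtosis before and after the log transformation. */
proc means data=FYP.LOS_OUTLIERS_LOG n nmiss mean median skewness kurtosis;
    var DaysOfStay lny;   /* NMISS on lny shows rows where LOG returned missing */
run;
```

A skewness for lny much closer to 0 than the original 4.529 would indicate the transformation is working, even if some skew remains.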
Thanks
Best Regards
Agate
Testing the assumptions of linear regression
Which one of the 4 assumptions are you violating?
Hello,
I am violating the assumption of normally distributed errors.
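Since this assumption concerns the residuals rather than the raw variable, a minimal sketch of checking it is to save the residuals from the regression and test those. The predictors Age and NumDiagnoses and the output data set work.los_resid are hypothetical names used only for illustration.

```sas
/* Sketch only: Age, NumDiagnoses, and work.los_resid are hypothetical. */
proc reg data=FYP.LOS_OUTLIERS_LOG;
    model lny = Age NumDiagnoses;
    output out=work.los_resid r=resid p=pred;   /* save residuals and predicted values */
run;
quit;

/* Normality tests and a Q-Q plot on the residuals, not on DaysOfStay itself */
proc univariate data=work.los_resid normal;
    var resid;
    qqplot resid / normal(mu=est sigma=est);
run;
```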