Hi. In SPSS, I can run a normality test on my dependent variable for each group and condition.
Is there a way to do this in SAS? Or do I always have to save a separate data set for each group and condition, using a WHERE statement, before running the normality test?
Thanks in advance!
Hi, I hope you find this helpful: Modeling a Data Distribution.
You may also like to explore other procs such as proc capability and proc model.
Thanks for your reply. That is the website I was looking for (but couldn't find), and I have also resolved my issue myself.
I found that adding a CLASS statement before the VAR statement breaks the normality test results down by the independent variables I am interested in.
This is the code I used:
proc univariate data= a1.param_variabil normal;
class Group Whole_1st;
var Time_param;
run;
But, thanks again for your help!
You're looking for what's called a BY-group analysis, which is slightly different from a CLASS statement.
In a CLASS statement all combinations are considered, versus a BY statement which runs each group independently.
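For example, if Var1 has 2 levels and Var2 has 3 levels and you include both, a BY statement produces 6 sets of analyses, whereas a CLASS statement can produce 12: overall (1), one for each level of Var1 (2), one for each level of Var2 (3), and one for each Var1*Var2 combination (6), so 1 + 2 + 3 + 6 = 12. That full breakdown is what PROC MEANS writes to its output data set; in PROC UNIVARIATE, CLASS accepts at most two variables and reports only the 6 combinations.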
You can use either PROC UNIVARIATE for parametric statistics or PROC NPAR1WAY for non-parametric statistics.
I see. Thanks a lot for your help again.
Could you please show me how to change my code to use a BY statement instead of a CLASS statement, so that I see the results for each level of the variables?
Also, I added a few lines of code to generate the normality plots, which I took from the website the responder above provided.
However, I do not understand what each part of the code does, nor how to interpret the resulting figure.
Could you please help? Thanks.
ods graphics on;
proc univariate data= a1.param_variabil1 normal;
class Group Whole_1st;
var Time_param;
probplot Time_param / normal(mu=est sigma=est)
square
odstitle = Title;
label Time_param = 'Time Parameter';
inset mean std / format=6.4;
run;
Change CLASS to BY, that’s all. Most procs take a BY statement; only a few support a CLASS statement.
Have you checked the documentation? If you have specific questions I’m happy to help, but I don’t feel up to providing a full tutorial on PROC UNIVARIATE. If you want tutorials, either take the free statistics e-course or search lexjansen.com for the many user-written papers.
Here’s the documentation for PROC UNIVARIATE, if you click on Syntax and select the statements in your code you can read the explanation of what each line and option do.
Thanks a lot! 🙂
When I changed CLASS to BY, it only generated results for one of the four conditions I was trying to explore.
Thus, "BY" was not a good solution.
Check your log - you likely didn't pre-sort your data so the PROC errored out instead of completing.
proc sort data=sashelp.class out=class;
by sex;
run;
proc means data=class n mean min max;
by sex;
var weight;
run;
proc means data=class n mean min max;
class sex;
var weight;
run;
Great help. I used the following code and it worked the way I wanted, providing normality test results for each condition.
It did not require me to save a separate data set for each condition I am interested in, so it was really convenient.
Thanks a lot!
proc sort data=a1.param_variabil out=a1.param_variabil1;
by SL Group Whole_1st;
run;
ods graphics on;
proc univariate data= a1.param_variabil1 normal;
BY Group Whole_1st SL;
var Time_param;
histogram / normal
ctext = blue;
/* probplot Time_param / normal(mu=est sigma=est)
square
odstitle = Title;
label Time_param = 'Time Parameter';
inset mean std / format=6.4; */
run;
I have since realized that I had an error message, so the above code did not run to the end and failed to give me the normality test results for each condition. I have three independent variables: SL (3 levels), Group (2 levels), and Whole_1st (2 levels).
Sorting plus a BY statement is probably a good solution, but perhaps it does not work well with many variables?
I am not sure whether I should have included only two of the three variables in the sort.
Could anyone please help? Thanks in advance.
ERROR: Data set A1.PARAM_VARIABIL1 is not sorted in ascending sequence. The current BY group has
Group = 2 and the next BY group has Group = 1.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
real time 2.30 seconds
cpu time 0.92 seconds
Your BY statement needs to be the same as in the SORT.
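For reference, a minimal sketch of the corrected code, reusing the data set and variable names posted above. The only change is that PROC SORT and PROC UNIVARIATE now list the BY variables in the same order; the commented-out plot statements are left out here.

```sas
/* Sort by the same variables, in the same order, as the BY statement below */
proc sort data=a1.param_variabil out=a1.param_variabil1;
    by Group Whole_1st SL;
run;

ods graphics on;

/* One normality analysis per Group x Whole_1st x SL combination */
proc univariate data=a1.param_variabil1 normal;
    by Group Whole_1st SL;
    var Time_param;
    histogram Time_param / normal;
run;
```

With 2 x 2 x 3 levels this should give 12 BY groups, each with its own Tests for Normality table. The number of BY variables is not the issue; the sort order just has to match.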
Yes, it worked. I cannot imagine I would figure out all these without your help. Thanks so much!
BTW, the distributions look normal to me, but the Shapiro-Wilk test is still significant (p-value = .001). Most of the distributions looked normal to me. Do you have any idea why?
Thanks in advance again
Customize your bins so that the histogram has fewer gaps and you can see the distribution better. Also, you can look at the Q-Q or P-P plot, which in my opinion makes normality easier to judge.
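A rough sketch of what that could look like, using sashelp.class so it runs anywhere. The bin midpoints (50 to 150 by 10) are an assumption chosen to cover the weights in that sample data set; pick values that suit the range of your own variable.

```sas
/* BY-group processing needs sorted input */
proc sort data=sashelp.class out=class;
    by sex;
run;

ods graphics on;

proc univariate data=class normal;
    by sex;
    var weight;
    /* Explicit midpoints give evenly spaced bins of your choosing */
    histogram weight / normal midpoints=50 to 150 by 10;
    /* Q-Q plot with a reference line from the estimated mean and std dev */
    qqplot weight / normal(mu=est sigma=est);
run;
```

If the points follow the reference line closely, the data are close to normal. Keep in mind that with a large sample, Shapiro-Wilk can flag departures too small to see in a histogram.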