nlpurumi
Obsidian | Level 7

Hi. In SPSS, I can run a normality test on my dependent variable for each group and condition.

Is there a way to do this in SAS? Or do I have to save a separate data set for each group and condition, using a WHERE statement, before running the normality test?

 

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Your BY statement needs to list the variables in the same order as the BY statement in your PROC SORT.

 

Miracle
Barite | Level 11

Hi, I hope you find this helpful: Modeling a Data Distribution.

You may also like to explore other procs such as proc capability and proc model.

nlpurumi
Obsidian | Level 7

Thanks for your reply. This is the website I was looking for (but couldn't find), although I also resolved my issue myself.

I found that adding a CLASS statement before the VAR statement splits the normality test results by the independent variables I am interested in.

This is the code I used:

proc univariate data= a1.param_variabil normal;
class Group Whole_1st;
var Time_param;
run;

 

But, thanks again for your help!

 

 

Reeza
Super User

You're looking for what's called a BY-group analysis, which is slightly different from using a CLASS statement.

 

With a CLASS statement, all combinations of the class variables can be considered, whereas a BY statement runs each group independently.

I.e., if Var1 has 2 levels and Var2 has 3 levels and you include both, a BY statement will produce 6 sets of analyses, whereas a CLASS statement (for example, in the output data set from PROC MEANS) can produce 12 analyses: overall (1), one for each level of Var1 (2), one for each level of Var2 (3), and one for each combination of Var1*Var2 (6), so 1 + 2 + 3 + 6 = 12. In PROC UNIVARIATE, the CLASS statement produces only the 6 cross-classified cells.
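As a rough illustration of the counting above, here is a minimal sketch using SASHELP.CLASS with PROC MEANS. The data set names (class_sorted, by_stats, class_stats) are made up for this example, and Sex and Age simply stand in for Var1 and Var2, so the counts differ from the 2 x 3 case.

```sas
/* BY processing requires data sorted by the BY variables */
proc sort data=sashelp.class out=class_sorted;
    by sex age;
run;

/* BY: one independent analysis per observed Sex/Age combination */
proc means data=class_sorted noprint;
    by sex age;
    var weight;
    output out=by_stats n=n mean=mean_weight;
run;

/* CLASS: the OUTPUT data set also holds the overall and one-way
   summaries; _TYPE_ identifies which class variables define each row */
proc means data=sashelp.class noprint;
    class sex age;
    var weight;
    output out=class_stats n=n mean=mean_weight;
run;

/* Count how many rows each _TYPE_ value contributes */
proc freq data=class_stats;
    tables _type_;
run;
```

In class_stats, _TYPE_=0 is the overall row and _TYPE_=3 holds the Sex*Age combinations. Adding the NWAY option to the PROC MEANS statement would keep only the latter.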

 

You can use PROC UNIVARIATE for parametric statistics or PROC NPAR1WAY for nonparametric statistics.

 




 

nlpurumi
Obsidian | Level 7

I see. Thanks a lot for your help again.

Could you please show me how to change my code to use a BY statement instead of a CLASS statement, so I can see the results for each level of the variables?

Also, I added a few lines of code to generate the normality plots, which I took from the website the responder above provided.

However, I don't understand what each statement does, nor how to interpret the resulting figure.

Could you please help? Thanks.

 

ods graphics on;

proc univariate data= a1.param_variabil1 normal;
class Group Whole_1st;
var Time_param;
probplot Time_param / normal(mu=est sigma=est)
square
odstitle = Title;
label Time_param = 'Time Parameter';
inset mean std / format=6.4;
run;

 

[Attached image: Normality test_SL1.png]

 

Reeza
Super User

Change CLASS to BY, that’s all. Most PROCs take a BY statement; only a few support a CLASS statement.

 

Have you checked the documentation? If you have specific questions I’m happy to help, but I don’t feel up to providing a full tutorial on PROC UNIVARIATE. If you want tutorials, either take the free statistics e-course or search lexjansen.com for the many user-written papers.

 

Here’s the documentation for PROC UNIVARIATE. If you click on Syntax and select the statements in your code, you can read the explanation of what each line and option does.

 

nlpurumi
Obsidian | Level 7

Thanks a lot! 🙂

nlpurumi
Obsidian | Level 7

When I changed CLASS to BY, it generated results for only one of the four conditions I was trying to explore.

Thus, BY was not a good solution.

 

 

Reeza
Super User


 

 


Check your log. You likely didn't pre-sort your data, so the PROC stopped with an error instead of completing.

 

proc sort data=sashelp.class out=class;
by sex;
run;

proc means data=class n mean min max;
by sex;
var weight;
run;

proc means data=class n mean min max;
class sex;
var weight;
run;


 

 

nlpurumi
Obsidian | Level 7

Great help. I was using the following code, and it worked the way I wanted (providing normality test results for each condition).

It did not require me to save a separate data set for each condition I am interested in, so it was really convenient.

 

Thanks a lot!

 

proc sort data=a1.param_variabil out=a1.param_variabil1;
by SL Group Whole_1st;
run;

 

ods graphics on;

proc univariate data= a1.param_variabil1 normal;
BY Group Whole_1st SL;
var Time_param;
histogram / normal
ctext = blue;
/* probplot Time_param / normal(mu=est sigma=est)
square
odstitle = Title;
label Time_param = 'Time Parameter';
inset mean std / format=6.4; */
run;

 

nlpurumi
Obsidian | Level 7

I realized I had an error message, so the code above did not run to the end and failed to provide the normality test results for each condition. I have three independent variables: SL (3 levels), Group (2 levels), and Whole_1st (2 levels).

Sorting and using BY is probably a good solution, but maybe it does not work well with many variables?

I am not sure whether I should include only two of the three variables when sorting.

Could anyone please help? Thanks in advance.

 

 

ERROR: Data set A1.PARAM_VARIABIL1 is not sorted in ascending sequence. The current BY group has
Group = 2 and the next BY group has Group = 1.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE UNIVARIATE used (Total process time):
real time 2.30 seconds
cpu time 0.92 seconds

Reeza
Super User

Your BY statement needs to list the variables in the same order as the BY statement in your PROC SORT.
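
For the code posted above, that could look like the following sketch. The data set and variable names are taken from the earlier posts. The only change is that the BY statement in PROC UNIVARIATE lists the variables in the same order as the BY statement in PROC SORT.

```sas
proc sort data=a1.param_variabil out=a1.param_variabil1;
    by SL Group Whole_1st;
run;

ods graphics on;

proc univariate data=a1.param_variabil1 normal;
    by SL Group Whole_1st;   /* same order as in PROC SORT */
    var Time_param;
    histogram Time_param / normal;
run;
```

With 3 x 2 x 2 levels this should give up to 12 BY groups, each with its own tests for normality.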

 

nlpurumi
Obsidian | Level 7

Yes, it worked. I can't imagine I would have figured all this out without your help. Thanks so much!

  

BTW, most of the distributions look normal to me, but the Shapiro-Wilk test is still significant (p-value = .001). Do you have any idea why?

 

 

Thanks in advance again.

[Attached image: SL=3 Group=1 Whole_1st=1.png]

Reeza
Super User

 

Customize your bins so the histogram has fewer gaps and shows the distribution better. You can also look at the Q-Q or P-P plot, which in my opinion makes normality easier to judge.
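
A minimal sketch of both suggestions, assuming the sorted data set and variables from earlier in the thread. The value NMIDPOINTS=20 is an arbitrary choice for illustration, so adjust it (or use MIDPOINTS= or ENDPOINTS= instead) to suit the data.

```sas
ods graphics on;

proc univariate data=a1.param_variabil1 normal;
    by SL Group Whole_1st;
    var Time_param;
    /* request a specific number of bins; 20 is only a guess */
    histogram Time_param / normal nmidpoints=20;
    /* Q-Q plot with a reference line from the estimated
       mean and standard deviation */
    qqplot Time_param / normal(mu=est sigma=est);
run;
```

Points that fall close to the reference line in the Q-Q plot are consistent with a normal distribution.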

