DmytroYermak
Lapis Lazuli | Level 10

Hi all,

 

I have a small sample of severity data for one disease:

 

data test;
  infile datalines dlm="," dsd;
  input severity @@;
datalines;                      
0,0,0,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,5,5,6,7,9,10,11
;
run;

The question is: how can I check whether the sample comes from a normally distributed population?

 

How can I see that 95% of the values are within mu +/- 2 sigma,

68% of the values are within mu +/- 1 sigma,

and the median is close to the mean?

 

Here is the code from a related topic, but how can I show the median, the mean, and 1, 2, and 3 sigma on the diagram?

 

proc univariate data=test;
var severity;
histogram severity / href=(2.0 1.0 7.0);
inset P10 median P90 / position=NE;
run;

 

PaigeMiller
Diamond | Level 26

@DmytroYermak wrote:

The question is: how can I check whether the sample comes from a normally distributed population?

 

How can I see that 95% of the values are within mu +/- 2 sigma,

68% of the values are within mu +/- 1 sigma,

and the median is close to the mean?

 

 


These two are not the same. Testing for normality is not the same as seeing what percent of the values are within mu ± 2 sigma, etc.

 

If you really want to test for normality, you might take a look at Q-Q plots in PROC CAPABILITY, and the NORMALTEST option in PROC CAPABILITY. 
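
A minimal sketch of that suggestion, assuming SAS/QC is licensed (PROC CAPABILITY is part of it) and using the `test` data set from the question. The MU=EST and SIGMA=EST settings, which base the reference line on the sample estimates, are an illustrative choice and not something stated in the post:

```sas
/* Normality tests plus a normal Q-Q plot for severity.      */
/* NORMALTEST requests the tests for normality table.        */
proc capability data=test normaltest;
  var severity;
  /* Reference line uses the sample mean and std deviation   */
  qqplot severity / normal(mu=est sigma=est);
run;
```

Points that bend away from the reference line, especially in the tails, suggest a departure from normality.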

--
Paige Miller
DmytroYermak
Lapis Lazuli | Level 10
I am reading "Primer of Biostatistics" (Glantz) and took the first task from it here. I understand that there are precise normality tests, but for the beginners it is 'number of values within sigmas'. Would that be possible with SAS?
PaigeMiller
Diamond | Level 26

@DmytroYermak wrote:
 for the beginners it is 'number of values within sigmas'

I object to this statement, as I don't see a need to do this computation 'number of values within sigmas' for two reasons:

 

  1. It does not really test normality
  2. There are better tests of normality that are already programmed and at your fingertips (in PROC UNIVARIATE using the NORMALTEST option, using the QQPLOT statement or the PROBPLOT statement)
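
The options named in item 2 could be combined roughly as follows. This is a sketch against the `test` data set from the question; the NORMAL option is the short alias of NORMALTEST, and the MU=EST and SIGMA=EST settings are an assumption made here for illustration:

```sas
/* Tests for normality plus two graphical checks */
proc univariate data=test normal;
  var severity;
  /* Histogram with a fitted normal curve overlaid */
  histogram severity / normal;
  /* Q-Q and probability plots against a normal    */
  /* distribution with estimated mean and sigma    */
  qqplot   severity / normal(mu=est sigma=est);
  probplot severity / normal(mu=est sigma=est);
run;
```

Small p-values in the "Tests for Normality" table, or plotted points that curve away from the reference line, both argue against normality.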

But, yes, SAS can produce such inferior information if you want it.

--
Paige Miller
ballardw
Super User

One way.

 

proc stdize data=test out=temp;
var severity;
run;

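/* Label each standardized value by the sigma band it falls in. */
/* Endpoints are assigned so that exactly -2, -1, 1 and 2 are   */
/* not sent to the OTHER category.                              */
proc format library=work;
value sig
-1 - 1          = 'within 1 sigma'
-2 -< -1, 1 <- 2 = 'between 1 and 2 sigma'
other           = 'more than 2 sigma';
run;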

proc freq data=temp;
   tables severity;
   format severity sig.;
   label severity='Standardized Severity';
run;

The default standardization method in PROC STDIZE is STD, which replaces each value with the number of standard deviations it lies from the mean of that variable.
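
For the visual part of the original question, a hedged sketch follows. It assumes the `test` data set from the first post; the data set names `stats` and `coverage` and the macro variable names are hypothetical, introduced only for this example. It computes the share of values within 1, 2 and 3 sigma of the mean, then draws a histogram with reference lines at the mean, the median and the sigma bands.

```sas
/* Mean, median and standard deviation in one row */
proc means data=test noprint;
  var severity;
  output out=stats mean=mu median=med std=sigma;
run;

/* Copy the statistics into macro variables for the plot */
data _null_;
  set stats;
  call symputx('mu',  mu);
  call symputx('med', med);
  call symputx('lo1', mu - sigma);
  call symputx('hi1', mu + sigma);
  call symputx('lo2', mu - 2*sigma);
  call symputx('hi2', mu + 2*sigma);
run;

/* Percent of observations within 1, 2 and 3 sigma of the mean */
data coverage;
  if _n_ = 1 then set stats(keep=mu sigma);
  set test end=last;
  n + 1;
  within1 + (abs(severity - mu) <= sigma);
  within2 + (abs(severity - mu) <= 2*sigma);
  within3 + (abs(severity - mu) <= 3*sigma);
  if last then do;
    pct1 = 100 * within1 / n;
    pct2 = 100 * within2 / n;
    pct3 = 100 * within3 / n;
    output;
  end;
  keep n pct1 pct2 pct3;
run;

proc print data=coverage noobs;
run;

/* Histogram with a normal curve and labeled reference lines */
proc sgplot data=test;
  histogram severity;
  density severity / type=normal;
  refline &mu  / axis=x label="mean"   lineattrs=(color=red);
  refline &med / axis=x label="median" lineattrs=(color=green);
  refline &lo1 &hi1 / axis=x label=("-1 sigma" "+1 sigma")
                      lineattrs=(pattern=dash);
  refline &lo2 &hi2 / axis=x label=("-2 sigma" "+2 sigma")
                      lineattrs=(pattern=dot);
run;
```

Because severity cannot be negative, the lower sigma lines may fall below zero and might not appear inside the plotted axis range. As noted earlier in the thread, these percentages are a descriptive check and not a formal test of normality.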

 
