Actual 1100 1300 1400 1500 1600 1100 1200 1600 2100 1300 1600 1300 1600 2200 2300 1700 1800 800 1400 900 2100 1400 1800 1900 1000 1800 1700 2100 800 1100 900 1600 1700 1400 1100 1200 1700 900 700 900 1300 700 1500 700 1300 1100 1700 1600 1800 2000 1500 2100
Demand 1500 2100 1600 1500 2000 1600 1200 2000 2200 2000 2200 2000 2000 2500 2500 2000 2000 1000 2000 1500 2500 1500 2500 2500 2000 2000 2500 2500 1500 1500 1400 2000 2000 2000 1500 1500 2500 1500 1500 1500 2500 1500 2000 1500 1500 2000 2000 2500 2500 2500 2500 2500
These are the data collected. Both series are sequential, from week 1 to week 52. How do I determine whether the data call for a parametric or a non-parametric test?
Try the Kolmogorov-Smirnov test for this.
The maximum difference between the cumulative distributions, D, is: 0.5000 with a corresponding P of: 0.000
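For readers who want to see where that D comes from, here is a minimal sketch in plain Python. It assumes the two rows at the top of the thread are the two samples being compared, and the function name `ks_two_sample_d` is made up for illustration. It computes only the D statistic (the largest gap between the two empirical cumulative distributions), not the P value.

```python
# Weekly values copied from the "Actual" and "Demand" rows at the top of the thread.
actual = [1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
          2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
          1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
          900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100]
demand = [1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
          2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
          2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
          1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500]


def ks_two_sample_d(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    n, m = len(a), len(b)
    d = 0.0
    # The gap can only change at an observed value, so checking those is enough.
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / n
        cdf_b = sum(v <= x for v in b) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


d_stat = ks_two_sample_d(actual, demand)
print(f"D = {d_stat:.4f}")  # prints D = 0.5000
```

The largest gap occurs at 1900: 45 of the 52 Actual values are at or below it, against only 19 of the 52 Demand values.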
52 data points were entered
Mean = 1440.
95% confidence interval for actual Mean: 1322. thru 1559.
Standard Deviation = 425.
High = 2.300E+03 Low = 700.
Third Quartile = 1.700E+03 First Quartile = 1.100E+03
Median = 1450.
Average Absolute Deviation from Median = 352.
KS finds the data is consistent with a normal distribution: P= 0.84 where the normal distribution has mean= 1449. and sdev= 422.6
KS is not particularly happy calling this data log normally distributed: P= 0.17 where the log normal distribution has geometric mean= 1332. and multiplicative sdev= 1.394
700. 700. 700. 800. 800. 900. 900. 900. 900. 1.000E+03 1.100E+03 1.100E+03 1.100E+03 1.100E+03 1.100E+03 1.200E+03 1.200E+03 1.300E+03 1.300E+03 1.300E+03 1.300E+03 1.300E+03 1.400E+03 1.400E+03 1.400E+03 1.400E+03 1.500E+03 1.500E+03 1.500E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.700E+03 1.700E+03 1.700E+03 1.700E+03 1.700E+03 1.800E+03 1.800E+03 1.800E+03 1.800E+03 1.900E+03 2.000E+03 2.100E+03 2.100E+03 2.100E+03 2.100E+03 2.200E+03 2.300E+03
52 data points were entered
Mean = 1948.
95% confidence interval for actual Mean: 1829. thru 2067.
Standard Deviation = 426.
High = 2.500E+03 Low = 1.000E+03
Third Quartile = 2.500E+03 First Quartile = 1.500E+03
Median = 2000.
Average Absolute Deviation from Median = 340.
KS says it's unlikely this data is normally distributed: P= 0.02 where the normal distribution has mean= 1923. and sdev= 419.7
KS says it's unlikely this data is log normally distributed: P= 0.00 where the log normal distribution has geometric mean= 1851. and multiplicative sdev= 1.288
1.000E+03 1.200E+03 1.400E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.600E+03 1.600E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.100E+03 2.200E+03 2.200E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03
KS finds the data is consistent with a normal distribution: P= 0.84 where the normal distribution has mean= 1449. and sdev= 422.6
KS is not particularly happy calling this data log normally distributed: P= 0.17 where the log normal distribution has geometric mean= 1332. <---- what do these 2 sentences mean?
Seems like data2 variable does not follow normal distribution.
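As a rough illustration of what the KS lines in the output are measuring, the sketch below computes the one-sample KS distance D between each series and a normal curve. It assumes the normal mean and sdev printed in the output above (1449 and 422.6 for the first series, 1923 and 419.7 for the second); the function names are hypothetical, and the tool that produced the output may compute things differently. Only D is computed here, not the P value. A small D means the data track the normal curve closely; a large D means they do not.

```python
import math

actual = [1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
          2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
          1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
          900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100]
demand = [1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
          2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
          2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
          1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500]


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_one_sample_d(data, mean, sd):
    """Largest gap between the empirical CDF of data and the normal CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mean, sd)
        # Compare the curve with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Mean and sdev below are the values reported in the KS output, not re-estimated here.
d_actual = ks_one_sample_d(actual, 1449.0, 422.6)
d_demand = ks_one_sample_d(demand, 1923.0, 419.7)
print(f"Actual vs normal: D = {d_actual:.3f}")
print(f"Demand vs normal: D = {d_demand:.3f}")
```

The Demand series piles up on a few values (1500, 2000, 2500), which produces large jumps in its empirical CDF and therefore a much larger D than the Actual series.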
So my data1 follows a normal distribution and data2 does not. Which test should I use, then, and what assumptions should I make?
Why do you want to compare two variables that do not follow the same distribution?
I have to prove that in order to justify the problem in my project. If these 2 variables do not follow the same distribution, is there any way to compare them?
Your data2 variable does not look like a random variable; it contains many values that are very close or identical to each other. It is evident that the two variables are different.
In that case, what test should I use to show a significant difference between these 2 data sets?
Okay, I would say just divide the data points of the two variables into 4 to 5 classes and run a chi-square test to see if there is an association between the two variables.
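A minimal sketch of the binning step in plain Python. The four class ranges are an assumption taken from the column headers of the Minitab table posted in this thread, and `class_counts` is a hypothetical helper name. Other class boundaries would be just as valid and would give different counts.

```python
actual = [1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
          2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
          1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
          900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100]
demand = [1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
          2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
          2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
          1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500]

# Inclusive (low, high) ranges; all values are multiples of 100, so the gaps
# between classes (e.g. 1001-1099) never receive an observation.
classes = [(600, 1000), (1100, 1500), (1600, 2000), (2100, 2600)]


def class_counts(values, classes):
    """Count how many values fall in each inclusive (low, high) class."""
    counts = [0] * len(classes)
    for v in values:
        for i, (low, high) in enumerate(classes):
            if low <= v <= high:
                counts[i] += 1
                break
        else:
            # Runs only when no class matched the value.
            raise ValueError(f"{v} does not fall in any class")
    return counts


actual_counts = class_counts(actual, classes)
demand_counts = class_counts(demand, classes)
print("Actual:", actual_counts)  # prints Actual: [10, 19, 17, 6]
print("Demand:", demand_counts)  # prints Demand: [1, 16, 18, 17]
```

The two rows of counts form the 2 x 4 contingency table that the chi-square test works on.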
what will be my hypothesis in chi square test?
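The hypothesis being tested is whether there is any association between the two classifications. The null hypothesis is that there is no association; the alternative is that there is one.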
Ok. Thanks a lot... I will proceed with that later...
If I show that the 2 variables are independent, what test should I use then?
And what if I show that they are dependent?
Please try it, then we can go from there.
Chi-Square Test for Association: Worksheet rows, Worksheet columns

Rows: Worksheet rows   Columns: Worksheet columns

       600-1000  1100-1500  1600-2000  2100-2600  All
1            10         19         17          6   52
           5.50      17.50      17.50      11.50
2             1         16         18         17   52
           5.50      17.50      17.50      11.50
All          11         35         35         23  104

Cell Contents: Count
               Expected count

Pearson Chi-Square = 12.910, DF = 3, P-Value = 0.005
Likelihood Ratio Chi-Square = 14.316, DF = 3, P-Value = 0.003
Here is the result that I obtained from Minitab.
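The Minitab figures can be checked by hand. The sketch below, in plain Python, recomputes the expected counts and both statistics from the observed counts in the table. Row 1 matches the binned Actual series and row 2 the binned Demand series. The helper `chi2_sf_df3` is a hypothetical name; it uses the closed-form upper tail of the chi-square distribution that holds only for 3 degrees of freedom, so it is not a general-purpose P-value routine.

```python
import math

# Observed counts from the Minitab table (rows 1 and 2, four classes each).
observed = [
    [10, 19, 17, 6],
    [1, 16, 18, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under "no association": row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

pairs = [(o, e) for obs_row, exp_row in zip(observed, expected)
         for o, e in zip(obs_row, exp_row)]

# Pearson statistic: sum of (observed - expected)^2 / expected over all cells.
pearson = sum((o - e) ** 2 / e for o, e in pairs)

# Likelihood-ratio statistic: 2 * sum of observed * ln(observed / expected).
likelihood_ratio = 2.0 * sum(o * math.log(o / e) for o, e in pairs if o > 0)

df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


p_pearson = chi2_sf_df3(pearson)
p_lr = chi2_sf_df3(likelihood_ratio)

print("Expected counts:", expected[0])
print(f"Pearson chi-square = {pearson:.3f}, DF = {df}, P = {p_pearson:.4f}")
print(f"Likelihood ratio chi-square = {likelihood_ratio:.3f}, DF = {df}, P = {p_lr:.4f}")
```

Both P values are well below 0.05, so at that level the hypothesis of no association between the series and the class would be rejected. Most of the Pearson statistic comes from the lowest and highest classes, where the observed counts (10 vs 1 and 6 vs 17) sit far from the expected 5.5 and 11.5.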