koksiang100
Calcite | Level 5

Actual 1100 1300 1400 1500 1600 1100 1200 1600 2100 1300 1600 1300 1600 2200 2300 1700 1800 800 1400 900 2100 1400 1800 1900 1000 1800 1700 2100 800 1100 900 1600 1700 1400 1100 1200 1700 900 700 900 1300 700 1500 700 1300 1100 1700 1600 1800 2000 1500 2100

Demand 1500 2100 1600 1500 2000 1600 1200 2000 2200 2000 2200 2000 2000 2500 2500 2000 2000 1000 2000 1500 2500 1500 2500 2500 2000 2000 2500 2500 1500 1500 1400 2000 2000 2000 1500 1500 2500 1500 1500 1500 2500 1500 2000 1500 1500 2000 2000 2500 2500 2500 2500 2500

These are the data I collected. Both series are sequential, from week 1 to week 52. How do I determine whether the data call for a parametric or a non-parametric test?

stat_sas
Ammonite | Level 13

Try the Kolmogorov-Smirnov test for this.
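For readers who want to see what the two-sample Kolmogorov-Smirnov statistic measures, here is a minimal sketch in plain Python. It computes only the statistic D, the largest vertical gap between the two empirical cumulative distribution functions; it does not compute a p-value. The function name `ks_two_sample_d` is made up for this illustration, and the data are the two series from the original post.

```python
from bisect import bisect_right

# Weekly values copied from the original post (52 weeks each).
actual = [
    1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
    2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
    1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
    900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100,
]
demand = [
    1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
    2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
    2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
    1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500,
]


def ks_two_sample_d(x, y):
    """Return D, the largest gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for v in set(xs) | set(ys):
        cx = bisect_right(xs, v)  # number of x values <= v
        cy = bisect_right(ys, v)  # number of y values <= v
        # |cx/n - cy/m| written over a common denominator to limit rounding error.
        d = max(d, abs(cx * m - cy * n) / (n * m))
    return d


d = ks_two_sample_d(actual, demand)
print(f"D = {d:.4f}")
```

On these two series the sketch prints `D = 0.5000`. A statistics package would additionally convert D into a p-value.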

koksiang100
Calcite | Level 5

The maximum difference between the cumulative distributions, D, is: 0.5000 with a corresponding P of: 0.000


Data Set 1:

52 data points were entered

Mean = 1440.

95% confidence interval for actual Mean: 1322. thru 1559.

Standard Deviation = 425.

High = 2.300E+03 Low = 700.

Third Quartile = 1.700E+03 First Quartile = 1.100E+03

Median = 1450.

Average Absolute Deviation from Median = 352.

KS finds the data is consistent with a normal distribution: P= 0.84 where the normal distribution has mean= 1449. and sdev= 422.6

KS is not particularly happy calling this data log normally distributed: P= 0.17 where the log normal distribution has geometric mean= 1332. and multiplicative sdev= 1.394

Items in Data Set 1:

700. 700. 700. 800. 800. 900. 900. 900. 900. 1.000E+03 1.100E+03 1.100E+03 1.100E+03 1.100E+03 1.100E+03 1.200E+03 1.200E+03 1.300E+03 1.300E+03 1.300E+03 1.300E+03 1.300E+03 1.400E+03 1.400E+03 1.400E+03 1.400E+03 1.500E+03 1.500E+03 1.500E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.600E+03 1.700E+03 1.700E+03 1.700E+03 1.700E+03 1.700E+03 1.800E+03 1.800E+03 1.800E+03 1.800E+03 1.900E+03 2.000E+03 2.100E+03 2.100E+03 2.100E+03 2.100E+03 2.200E+03 2.300E+03

Data Set 2:

52 data points were entered

Mean = 1948.

95% confidence interval for actual Mean: 1829. thru 2067.

Standard Deviation = 426.

High = 2.500E+03 Low = 1.000E+03

Third Quartile = 2.500E+03 First Quartile = 1.500E+03

Median = 2000.

Average Absolute Deviation from Median = 340.

KS says it's unlikely this data is normally distributed: P= 0.02 where the normal distribution has mean= 1923. and sdev= 419.7

KS says it's unlikely this data is log normally distributed: P= 0.00 where the log normal distribution has geometric mean= 1851. and multiplicative sdev= 1.288

Items in Data Set 2:

1.000E+03 1.200E+03 1.400E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.500E+03 1.600E+03 1.600E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.000E+03 2.100E+03 2.200E+03 2.200E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03
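The summary figures in this output can be reproduced with the Python standard library. The sketch below does so for Data Set 1 (the Actual series). The critical value `T_CRIT` is an assumption: it is a rounded two-sided 95% Student's t value for 51 degrees of freedom, hard-coded because the standard library has no t quantile function.

```python
import math
import statistics

# Data Set 1 (the "Actual" series from the original post).
actual = [
    1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
    2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
    1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
    900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100,
]

n = len(actual)
mean = statistics.mean(actual)
sd = statistics.stdev(actual)        # sample standard deviation (n - 1 denominator)
median = statistics.median(actual)

# Assumption: rounded two-sided 95% critical value of Student's t with 51 df.
T_CRIT = 2.008
half_width = T_CRIT * sd / math.sqrt(n)
ci_low, ci_high = mean - half_width, mean + half_width

print(f"n = {n}, mean = {mean:.0f}, sd = {sd:.0f}, median = {median:.0f}")
print(f"95% CI for the mean: {ci_low:.0f} to {ci_high:.0f}")
```

Rounded to whole numbers, the mean, standard deviation and median agree with the values listed for Data Set 1 above.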

Data Reference: 2786

KS finds the data is consistent with a normal distribution: P= 0.84 where the normal distribution has mean= 1449. and sdev= 422.6

KS is not particularly happy calling this data log normally distributed: P= 0.17 where the log normal distribution has geometric mean= 1332.

What do these two sentences mean?

stat_sas
Ammonite | Level 13

It seems that the data2 variable does not follow a normal distribution.
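To make the idea behind the one-sample check concrete, here is a rough sketch that measures how far the Demand series (data2) sits from a normal curve. It assumes the normal distribution is fitted with the sample mean and sample standard deviation, which may differ from what the tool quoted above does internally, and it returns only the KS distance D, not a p-value. The name `ks_normal_d` is hypothetical.

```python
import statistics
from statistics import NormalDist

# Data Set 2 (the "Demand" series from the original post).
demand = [
    1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
    2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
    2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
    1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500,
]


def ks_normal_d(sample):
    """KS distance between the sample's empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    # Assumption: fit the normal by sample mean and sample standard deviation.
    fitted = NormalDist(statistics.mean(xs), statistics.stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


d_demand = ks_normal_d(demand)
print(f"D (demand vs fitted normal) = {d_demand:.3f}")
```

The large steps in the empirical CDF at 1500, 2000 and 2500 are what push D up for this series. Turning D into a p-value depends on the tool, and estimating the mean and standard deviation from the same sample changes the reference distribution (the Lilliefors correction), so p-values from different packages need not agree.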

koksiang100
Calcite | Level 5

So my data1 follows a normal distribution and data2 does not. Which test should I use, then, and what assumptions should I make?

stat_sas
Ammonite | Level 13

Why do you want to compare two variables that do not follow the same distribution?

koksiang100
Calcite | Level 5

I have to prove that in order to justify the problem in my project. If these two variables do not follow the same distribution, is there any way to compare them?

stat_sas
Ammonite | Level 13

Your data2 variable does not seem to be a random variable; it contains many values that are very similar to each other. It is evident that the two variables are different.

koksiang100
Calcite | Level 5

In that case, which test should I use to show a significant difference between these two data sets?

stat_sas
Ammonite | Level 13

Okay, I would say just divide the data points of the two variables into 4 to 5 classes and run a chi-square test to see if there is an association between the two variables.
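The binning step suggested here could look like the following sketch. The class boundaries in `CLASSES` are an assumption chosen for illustration (four classes, each 500 wide); any sensible set of 4 to 5 classes would serve. Because every value in the data is a multiple of 100, the gaps between classes (for example between 1000 and 1100) contain no observations.

```python
# Weekly values copied from the original post (52 weeks each).
actual = [
    1100, 1300, 1400, 1500, 1600, 1100, 1200, 1600, 2100, 1300, 1600, 1300, 1600,
    2200, 2300, 1700, 1800, 800, 1400, 900, 2100, 1400, 1800, 1900, 1000, 1800,
    1700, 2100, 800, 1100, 900, 1600, 1700, 1400, 1100, 1200, 1700, 900, 700,
    900, 1300, 700, 1500, 700, 1300, 1100, 1700, 1600, 1800, 2000, 1500, 2100,
]
demand = [
    1500, 2100, 1600, 1500, 2000, 1600, 1200, 2000, 2200, 2000, 2200, 2000, 2000,
    2500, 2500, 2000, 2000, 1000, 2000, 1500, 2500, 1500, 2500, 2500, 2000, 2000,
    2500, 2500, 1500, 1500, 1400, 2000, 2000, 2000, 1500, 1500, 2500, 1500, 1500,
    1500, 2500, 1500, 2000, 1500, 1500, 2000, 2000, 2500, 2500, 2500, 2500, 2500,
]

# Assumed class boundaries, inclusive at both ends.
CLASSES = [(600, 1000), (1100, 1500), (1600, 2000), (2100, 2600)]


def class_counts(values):
    """Count how many values fall into each class."""
    return [sum(lo <= v <= hi for v in values) for lo, hi in CLASSES]


# One row per variable, one column per class: a 2 x 4 contingency table.
table = [class_counts(actual), class_counts(demand)]
for name, row in zip(("Actual", "Demand"), table):
    print(name, row)
```

With these boundaries the rows come out as `[10, 19, 17, 6]` for Actual and `[1, 16, 18, 17]` for Demand, which is the kind of 2 x 4 table a chi-square test of association takes as input.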

koksiang100
Calcite | Level 5

What will my hypothesis be in the chi-square test?

stat_sas
Ammonite | Level 13

The question is whether there is any association between the two classifications; the null hypothesis is that there is none.

koksiang100
Calcite | Level 5

OK, thanks a lot. I will proceed with that later.

If I show that the two variables are independent, which test should I use then?

And what if I show that they are dependent?

stat_sas
Ammonite | Level 13

Please try it, then we can go from there.

koksiang100
Calcite | Level 5

Chi-Square Test for Association: Worksheet rows, Worksheet columns

Rows: Worksheet rows   Columns: Worksheet columns

       600-1000  1100-1500  1600-2000  2100-2600  All
1            10         19         17          6   52
           5.50      17.50      17.50      11.50
2             1         16         18         17   52
           5.50      17.50      17.50      11.50
All          11         35         35         23  104

Cell Contents:      Count
                    Expected count

Pearson Chi-Square = 12.910, DF = 3, P-Value = 0.005
Likelihood Ratio Chi-Square = 14.316, DF = 3, P-Value = 0.003

Here is the result that I obtained from Minitab.
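The Pearson figures in this output can be checked by hand. The sketch below recomputes the expected counts, the Pearson chi-square statistic and its p-value from the observed counts in the table, using only the standard library. The helper `chi2_sf_df3` is written for this illustration and is valid only for 3 degrees of freedom, where the upper-tail probability has a closed form; a general solution would use a statistics package instead.

```python
import math

# Observed counts from the table above (rows 1 and 2, four classes).
observed = [
    [10, 19, 17, 6],
    [1, 16, 18, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


assert df == 3  # the closed form above is only valid for 3 degrees of freedom
p_value = chi2_sf_df3(chi2)

print("expected:", expected)
print(f"Pearson chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

A p-value this small means that, at the usual 5% level, the null hypothesis of no association between the row variable and the class is rejected.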
