Solved: P value explanation and covariate

Babloo · Posted 08-07-2016 06:27 AM

I would appreciate it if someone could help me with the following questions, which I was asked in an interview.

a) At a 95% confidence level we will reject the null hypothesis when the p-value is less than .05. How will you explain this statement to a layman?

b) How will you determine that errors are normally distributed for regression analysis?

c) Difference between covariate and covariance

Thanks in advance for your explanation.

KachiM · Posted 08-07-2016 12:06 PM

Something like this may help:

a) At a 95% confidence level we will reject the null hypothesis when the p-value is less than .05. How will you explain this statement to a layman?

The theoretical explanation is:

If the same population is sampled many times (repeated samples) and an interval estimate is computed on each occasion, the resulting intervals would bracket the true population parameter (the mean, say) on approximately 95% of occasions.

This boils down to: the procedure that builds the two limits has a probability of 0.95 of producing an interval that contains the true parameter. It is a statement about the procedure over repeated samples, not about any single interval already computed.

b) How will you determine that errors are normally distributed for regression analysis?

This is done post facto.

If the errors are from a Normal distribution, then the residuals of the fitted model should look approximately Normal. So, after fitting the model, we check the distribution of the residuals for Normality.

c) Difference between covariate and covariance

Covariance measures how two variables vary together: it is the average product of their deviations from their respective means.

A covariate is a variable that is believed to be related to the response variable of interest.
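
For reference, the standard textbook definition of covariance can be written as below. The symbols \mu_X and \mu_Y (the means of X and Y) are notation introduced here for illustration; they do not appear elsewhere in this thread.

```latex
\operatorname{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - \mu_X \mu_Y,
\qquad
\operatorname{Cov}(X, X) = \operatorname{Var}(X)
```

So the variance is the special case of the covariance of a variable with itself, whereas a covariate is not a quantity at all but a variable in the model.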

Babloo · Posted 08-07-2016 12:57 PM

Thank you for your explanations.

a) I would appreciate one more example to help me understand the p-value.

b) How do I check the residuals for normality after fitting the model?

KachiM · Posted 08-07-2016 01:43 PM

Hi Babloo,

Search for "Testing for Normality in SAS" and you will find all the goodness-of-fit tests. Don't forget to ask Google before posting questions.

I have said enough about the confidence interval. If you really want to learn it, you can run an experiment with repeated samples.
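
A minimal sketch of such a repeated-sampling experiment in SAS is given below. Everything specific in it is an assumption made for illustration: the population is taken to be Normal with mean 50 and standard deviation 10, and the dataset names (sim, ci), the seed, the number of samples (1000), and the sample size (30) are arbitrary.

```sas
/* Hypothetical population: Normal(mean=50, sd=10). Draw 1000 samples of n=30. */
data sim;
   call streaminit(2016);              /* arbitrary seed for reproducibility */
   do sample = 1 to 1000;
      do i = 1 to 30;
         x = rand('normal', 50, 10);
         output;
      end;
   end;
run;

/* One 95% confidence interval for the mean per sample */
proc means data=sim noprint alpha=0.05;
   by sample;
   var x;
   output out=ci lclm=lower uclm=upper;
run;

/* Flag the intervals that bracket the true mean (50) */
data ci;
   set ci;
   covers = (lower <= 50 <= upper);
run;

/* The mean of the 0/1 flag is the observed coverage proportion */
proc means data=ci mean;
   var covers;
run;
```

The reported mean of covers should come out close to 0.95, which is the repeated-sampling meaning of a 95% confidence interval.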

KachiM · Posted 08-07-2016 07:55 PM

I thought someone else would answer you. Here are my thoughts.

a) I would appreciate one more example to help me understand the p-value.

There are two concepts used for testing a hypothesis, namely the confidence interval and the p-value. Generally the former is preferred. We shall discuss the p-value now.

We start an investigation with a hypothesis, for example that the weight difference between two groups of children is about zero (one group received some special treatment; the other group is similar but did not receive it). We take a sample of children and divide them randomly into two equal groups for the experiment. Before we start the experiment we set a pre-defined probability value known as the significance level (alpha, say 0.05).

After conducting the experiment, we use the data to test our null hypothesis of no difference in weight between the groups. We summarize the evidence in a value computed from the sample, known as a test statistic. The evidence is then expressed as a probability, the p-value, ranging from 0 to 1: it is the probability of seeing a statistic at least as extreme as the one observed if the null hypothesis were true. If the p-value is large (larger than alpha), the data are consistent with the null hypothesis and we do not reject it; in other words, no significant difference in weight between the groups has been detected, at least with the sample size we started with. This does not prove that the null hypothesis is true. If the p-value is small (smaller than alpha), we reject the null hypothesis: a difference this large would be unlikely if the treatment had no effect, which suggests a significant difference in weight.
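
The experiment described above could be sketched in SAS roughly as follows. The data are simulated and entirely hypothetical: the group labels, the 25 children per group, and the assumed mean weights (21 versus 20, standard deviation 2) are made up for illustration only.

```sas
/* Hypothetical data: 25 children per group, weights simulated */
data weights;
   length group $7;
   call streaminit(807);               /* arbitrary seed */
   do group = 'Treated', 'Control';
      do child = 1 to 25;
         /* assumed true means: 21 for Treated, 20 for Control */
         weight = rand('normal', ifn(group = 'Treated', 21, 20), 2);
         output;
      end;
   end;
run;

/* Two-sample t test of H0: no difference in mean weight */
proc ttest data=weights alpha=0.05;
   class group;
   var weight;
run;
```

In the output, compare the p-value (the Pr > |t| column) with alpha = 0.05: a smaller value leads to rejecting the null hypothesis of no difference, a larger one does not.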

b) How do I check the residuals for normality after fitting the model?

Read the PROC UNIVARIATE documentation on testing for normality.

I trust that my answer is acceptable to you; if so, please do me the favor of marking it as the solution to close your post.
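
As a rough sketch of that workflow, the residuals can be saved from the regression and then passed to PROC UNIVARIATE. The example assumes the SASHELP.CLASS sample dataset and an arbitrary model (weight on height); the names resid and residual are chosen here for illustration.

```sas
/* Fit the regression and save the residuals */
proc reg data=sashelp.class noprint;
   model weight = height;
   output out=resid r=residual;        /* r= stores the residuals */
run;
quit;

/* Normality tests and a Q-Q plot of the residuals */
proc univariate data=resid normal;
   var residual;
   qqplot residual / normal(mu=est sigma=est);
run;
```

The NORMAL option requests the tests for normality (Shapiro-Wilk, Kolmogorov-Smirnov and others); small p-values there, or points that stray from the reference line in the Q-Q plot, suggest the residuals are not Normal.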

Ksharp · Posted 08-07-2016 10:19 PM

Here are my two cents:

a) At a 95% confidence level we will reject the null hypothesis when the p-value is less than .05. How will you explain this statement to a layman?

When the p-value is less than .05, you can reject H0. For example:
H0: The model fits the data
When P=0.01, you reject H0, that is, you conclude that the data are not consistent with H0.

Let D be a statistic that measures the distance from H0, and let D0 be the value of D computed from your data.
The p-value is the probability, assuming H0 is true, of getting a D greater than D0, i.e. P=P( D > D0 ).
So when P<0.05, it says that few values of D would be greater than D0 if H0 were true.
In other words, D0 is very large; hardly any other D would exceed it.
Therefore your data are far away from H0.

 
b) How will you determine that errors are normally distributed for regression analysis?

There is a NORMAL option in PROC UNIVARIATE that requests tests for normality.

 
c) Difference between covariate and covariance

Covariance measures how two variables vary together; it describes the direction of the linear relationship between these two variables.
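
As a small illustration, the covariance between two variables can be requested with the COV option of PROC CORR. The SASHELP.CLASS sample dataset and the choice of height and weight are assumptions made here only for the example.

```sas
/* COV adds the variance-covariance matrix to the PROC CORR output */
proc corr data=sashelp.class cov;
   var height weight;
run;
```

The off-diagonal entry of the printed matrix is the covariance of height and weight; the diagonal entries are their variances.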

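A covariate, I think, is a variable believed to be correlated with the response (dependent) variable Y. There is a so-called
analysis of covariance (ANCOVA), which extends ANOVA with a covariate.
proc glm;
 class C;                  /* C is the classification (group) variable */
 model Y=X C /solution;    /* X is the continuous covariate */
run;
quit;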
