Babloo
Rhodochrosite | Level 12

I would appreciate it if someone could help me with the following questions, which I was asked in an interview.

 

a) At the 95% confidence level we reject the null hypothesis when the p-value is less than .05. How would you explain this statement to a layman?

 

b) How will you determine that errors are normally distributed for regression analysis?

 

c) Difference between covariate and covariance

 

 

Thanks in advance for your explanation.

1 ACCEPTED SOLUTION

Accepted Solutions
KachiM
Rhodochrosite | Level 12

I thought someone else would answer you. Here are my thoughts.

 

a) I would appreciate one more example to help me understand the p-value.

 

There are two concepts used for testing a hypothesis, namely the confidence interval and the p-value. Generally the former is preferred. Let us discuss the p-value now. We start an investigation with a hypothesis, for example that the average weight is about the same in two groups of children (one group received a special treatment; the other group is similar but did not receive it). We take a sample of children and divide them randomly into two equal groups for the experiment. Before we start the experiment we fix a probability threshold known as the significance level (alpha, say 0.05). After conducting the experiment, we use the data to test our null hypothesis of no difference in weight between the groups. We summarize the evidence in the data with a value computed from the sample, known as a test statistic. The evidence is then expressed as a probability, the p-value, which ranges from 0 to 1: it is the probability of seeing a difference at least as large as the one we observed if the null hypothesis were true. If the p-value is large (larger than alpha), the data are consistent with the null hypothesis and we do not reject it; in other words, there is no significant difference in weight between the groups, so we have not shown that the special treatment has an effect, at least with the sample size we used. This does not prove that the null hypothesis is true. If the p-value is small (smaller than alpha), we reject the null hypothesis; this is suggestive of a significant difference in weight, for example a weight gain in the treated group.
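The two-group experiment described above can be tried out in SAS. The following is only a sketch on simulated data: the data set name, group labels, group size, means, standard deviation and seed are all assumptions made up for illustration, not values from this thread.

```sas
/* Simulated weights for two groups of 30 children (all values are assumed) */
data weights;
   call streaminit(2024);
   do group = 'Control', 'Treated';
      do child = 1 to 30;
         /* Treated children are simulated with a mean 1.5 units higher */
         weight = rand('NORMAL', 20 + 1.5*(group = 'Treated'), 3);
         output;
      end;
   end;
run;

/* Two-sample t test of H0: no difference in mean weight between the groups */
proc ttest data=weights alpha=0.05;
   class group;
   var weight;
run;
```

The p-value to compare with alpha is in the "Pr > |t|" column of the T-Tests table. Changing the simulated difference or the group size shows how the p-value reacts.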

b) How do I check the residuals for normality after fitting the model?

Read the PROC UNIVARIATE documentation on testing for normality.
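As a starting point, here is a minimal sketch of that workflow. It assumes the sashelp.class sample data set and a simple regression of weight on height, chosen only for illustration; the names resid_out and resid are hypothetical.

```sas
/* Fit the regression and save the residuals */
proc reg data=sashelp.class;
   model weight = height;
   output out=resid_out r=resid;   /* R= names the residual variable */
run;
quit;

/* Test and plot the residuals for normality */
proc univariate data=resid_out normal;   /* NORMAL requests the normality tests */
   var resid;
   histogram resid / normal;                  /* histogram with fitted Normal curve */
   qqplot resid / normal(mu=est sigma=est);   /* Q-Q plot against a fitted Normal */
run;
```

The "Tests for Normality" table reports the formal tests (Shapiro-Wilk and others). It is usually worth looking at the Q-Q plot as well, because with large samples the tests can flag trivial departures.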

 

I trust that my answer is acceptable to you; if so, please do me the favor of marking it as the solution to close your post.


5 REPLIES 5
KachiM
Rhodochrosite | Level 12

Something like this should help you:

 

a) At the 95% confidence level we reject the null hypothesis when the p-value is less than .05. How would you explain this statement to a layman?

The theoretical explanation is:

If the same population is sampled many times (repeated samples) and an interval estimate is made on each occasion, the resulting intervals would bracket the true population parameter (the mean, say) on approximately 95% of the occasions.

 

This boils down to: the procedure that produces the two limits captures the true population parameter with probability 0.95. The 0.95 describes the procedure over repeated samples; for any single computed interval, the parameter either lies within the two limits or it does not.

 

b) How will you determine that errors are normally distributed for regression analysis?

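This is done post facto, that is, after the model has been fitted.

The regression model assumes that the errors have a Normal distribution. The residuals are our estimates of those errors, so if the assumption holds the residuals should look approximately Normal. So, after fitting, we check the distribution of the residuals for Normality.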

 

c) Difference between covariate and covariance

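Covariance measures how two variables vary together: it is the average product of their deviations from their respective means.

A covariate is a variable that is thought to be related to the variable of interest and is included in the model to account for that relationship.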
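To see a covariance in practice, the sketch below asks PROC CORR for the covariance matrix of two variables. The sashelp.class data set and the choice of height and weight are assumptions for illustration only. The sample covariance it prints is sum((x - xbar)*(y - ybar)) / (n - 1).

```sas
/* COV adds the covariance matrix to the default correlation output */
proc corr data=sashelp.class cov;
   var height weight;
run;
```

The diagonal of the covariance matrix holds the variances of the two variables, and the off-diagonal entry is their covariance. Dividing the covariance by the product of the two standard deviations gives the correlation reported in the same output.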

 

Babloo
Rhodochrosite | Level 12
Thank you for your explanations.

a) I would appreciate one more example to help me understand the p-value.

b) How do I check the residuals for normality after fitting the model?

KachiM
Rhodochrosite | Level 12

Hi Babloo,

 

Search for "Testing for Normality in SAS" and you will find all the goodness-of-fit tests. Don't forget to ask Google before posting questions.

 

I have said enough about the confidence interval. If you really want to learn it, you can run an experiment with repeated samples.
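One way such a repeated-sampling experiment might look in SAS is sketched below. The population (Normal with mean 50 and standard deviation 10), the number of samples, the sample size and the seed are all assumptions chosen for illustration.

```sas
/* Draw 1000 samples of size 25 from an assumed Normal(50, 10) population */
data samples;
   call streaminit(12345);
   do rep = 1 to 1000;
      do i = 1 to 25;
         x = rand('NORMAL', 50, 10);
         output;
      end;
   end;
run;

/* One 95% confidence interval for the mean per sample */
proc means data=samples noprint alpha=0.05;
   by rep;
   var x;
   output out=ci lclm=lower uclm=upper;
run;

/* Flag the intervals that contain the true mean of 50 */
data ci;
   set ci;
   covered = (lower <= 50) and (50 <= upper);
run;

/* The mean of the 0/1 flag is the proportion of intervals that covered */
proc means data=ci mean;
   var covered;
run;
```

The reported mean of covered should come out close to 0.95, which is what the "95%" in a 95% confidence interval refers to.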

 

Ksharp
Super User
Here are my two cents:

a) At the 95% confidence level we reject the null hypothesis when the p-value is less than .05. How would you explain this statement to a layman?

When the p-value is less than .05, you can reject H0. For example:
H0: the model fits the data
When P=0.01, you reject H0, that is, you conclude that the data are not consistent with H0.

Suppose D is a statistic that measures the distance from H0, and D0 is the value of D computed from your data.
The p-value is the probability, calculated assuming H0 is true, of getting a D greater than D0, i.e. P = P(D > D0).
So P<0.05 means that, if H0 were true, few values of D would be greater than D0.
In other words, D0 is very large; hardly any other D would be greater than it.
Therefore your data are far away from H0.
Note: the statistic D measures the distance from H0.
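The tail probability P(D > D0) can be computed directly in a DATA step. This is only a sketch: it assumes D follows a chi-square distribution when H0 is true, and the observed value 6.63 and the 1 degree of freedom are hypothetical numbers, not taken from any real analysis.

```sas
data pvalue;
   d0 = 6.63;   /* hypothetical observed statistic */
   df = 1;      /* hypothetical degrees of freedom */
   /* SDF is the survival function: P(D > d0) under the assumed H0 distribution */
   p = sdf('CHISQUARE', d0, df);
   reject = (p < 0.05);   /* 1 if H0 is rejected at the 5% level */
   put d0= df= p= reject=;
run;
```

With these assumed inputs the p-value is roughly 0.01, so H0 would be rejected at the 5% level. Smaller values of d0 give larger p-values.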

 
b) How will you determine that errors are normally distributed for regression analysis?

PROC UNIVARIATE provides tests for normality (the NORMAL option); apply them to the residuals.

 
c) Difference between covariate and covariance

Covariance describes how two variables vary together. It measures the linear relationship between these two variables.

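A covariate, I think, is a variable believed to be correlated with the response (dependent) variable Y. There is a so-called
analysis of covariance (ANCOVA), which extends ANOVA:
proc glm; /* with no DATA= option, the most recently created data set is used */
 class C; /* C is the classification (group) variable */
 model Y=X C /solution; /* X is the continuous covariate */
run;
quit;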
