pstarr
Calcite | Level 5

This is the complete assignment:

 

1. Select a good model that can be effectively used to predict the response variable. In this step only numerical variables will be considered (for example, OUTLOOK, TYPEFOOD, SIZE and OWNER will not be considered now). Apply different model selection criteria (at least two) and compare the results.
2. Once you have determined a model, please add all categorical variables OUTLOOK, TYPEFOOD, SIZE and OWNER into the existing model. (Hint: dummy variables.) Please use the stepwise selection method to get the best model.
3. Conduct a regression diagnostic for the model you determined in step 2. (Hint: you need to check the following: equal variance, linearity, normality.) If some assumptions are not satisfied, please use a Box-Cox transformation on the response to get a model which will satisfy most of the assumptions (maybe not all of them) and can explain the data well.

 

Here is what I've done for parts 1 and 2:

 
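/* First block of the file: one set of variables per row */
data c1;
infile "rest.dat" firstobs=3 obs=281;
input row id outlook sales newcap value costgood wages ads typefood;
run;
/* Second block of the file: the remaining variables */
data c2;
infile "rest.dat" firstobs=284 obs=562;
input row seats owner ft pt size;
run;
/* One-to-one merge (no BY statement): rows are paired by position */
data complete; merge c1 c2;
run;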

proc reg data=complete;
model sales = newcap value costgood wages ads seats ft pt/selection=adjrsquare;
run;

proc reg data=complete;
model sales = newcap value costgood wages ads seats ft pt/selection=stepwise;
run;

data new;
set complete;
/* A comparison such as (outlook = 1) evaluates to 1 when true and 0 otherwise, */
/* so each dummy variable needs one assignment per row and no DO loop.          */
out1 = (outlook = 1);
out2 = (outlook = 2);
out3 = (outlook = 3);
out4 = (outlook = 4);
out5 = (outlook = 5);

t1 = (typefood = 1);
t2 = (typefood = 2);

own1 = (owner = 1);
own2 = (owner = 2);

sz1 = (size = 1);
sz2 = (size = 2);
run;

proc reg data=new;
model sales = value costgood ft pt out1 out2 out3 out4 out5 t1 t2 own1 own2 sz1 sz2/selection=stepwise;
run;

The final model seems to be: y = sales, predicted by x = value, ft, pt, out3, and t1.

This is the part that I'm confused about how to do in SAS:
 
3. Conduct a regression diagnostic for the model you determined in step 2. (Hint: you need to check the following: equal variance, linearity, normality.)
 
The last few SAS procedures that the teacher went through were PROC UNIVARIATE and Box-Cox, and there was some mention of PROC TRANSREG. I haven't been able to find a simple explanation of how to apply any of these, and the SAS documentation has left me more confused than before. Could someone point me in the right direction: what do I use to check equal variance, linearity, and normality?
2 REPLIES
PaigeMiller
Diamond | Level 26

To check equal variance, linearity, and normality, you need to plot the residuals from the fitted model. This should happen by default if you turn on ODS GRAPHICS before you run PROC REG.
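
For example, here is a minimal sketch that uses the final model from the question. The output data set name diag and the variable names resid and pred are illustrative choices, not anything from the original post, and the PROC UNIVARIATE step is an optional extra for a formal normality check.

```sas
ods graphics on;

/* Refit the final model and request the residual plots */
proc reg data=new plots(only)=(diagnostics residuals);
   model sales = value ft pt out3 t1;
   /* Save residuals and predicted values (diag, resid, pred are illustrative names) */
   output out=diag r=resid p=pred;
run;
quit;

/* Normality tests and a normal Q-Q plot for the saved residuals */
proc univariate data=diag normal;
   var resid;
   qqplot resid / normal(mu=est sigma=est);
run;
```

In the residual-by-predicted plot, a fan or funnel shape suggests unequal variance and a curved pattern suggests nonlinearity. Points that stray far from the reference line in the Q-Q plot suggest non-normal residuals.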

--
Paige Miller
Reeza
Super User
The examples in the documentation walk through enough for your homework.
https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_reg_examples01.htm&docsetVersion=...
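
For the Box-Cox step in part 3 of the assignment, a rough sketch with PROC TRANSREG might look like the following. It assumes sales is strictly positive (Box-Cox requires that), and the value 0.5 for lambda is only a placeholder to be replaced with whatever PROC TRANSREG reports for your data. The names trans and tsales are illustrative.

```sas
/* Search a grid of lambda values for the Box-Cox transformation of sales */
proc transreg data=new;
   model boxcox(sales / lambda=-2 to 2 by 0.25) = identity(value ft pt out3 t1);
run;

/* Apply the chosen transformation by hand */
data trans;
   set new;
   lambda = 0.5;   /* placeholder only: use the lambda reported by PROC TRANSREG */
   if lambda = 0 then tsales = log(sales);
   else tsales = (sales**lambda - 1) / lambda;
run;

/* Refit on the transformed response and look at the diagnostics again */
proc reg data=trans plots(only)=diagnostics;
   model tsales = value ft pt out3 t1;
run;
quit;
```

The idea is to compare the diagnostic plots before and after the transformation and keep the version that comes closer to satisfying the assumptions.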
