knighsson
Obsidian | Level 7

Hello,

Before relying on a linear regression, we need to run diagnostics. When I fit the model with PROC SURVEYREG, should I run the diagnostics (residual plots, etc.) with PROC REG or with PROC SURVEYREG itself? If I should use PROC SURVEYREG, how can I get a residual plot?

 

Thank you

 

6 REPLIES
ballardw
Super User

Which specific diagnostics are you wanting?

 

If the data come from a complex sample and require PROC SURVEYREG, then PROC REG is very likely not the tool to use for diagnostics. PROC REG would not properly apply the weights from a complex sample, so the residuals would very likely be incorrect.

Perhaps PROC SURVEYMEANS or PROC SURVEYFREQ would come into play.

knighsson
Obsidian | Level 7

Thank you so much!

I want to check the assumptions of linear regression, including residual normality, outliers, a linear relationship between the dependent variable and the independent variables, homogeneity of variance, and multicollinearity. Could you please tell me how to run those checks with PROC SURVEYREG?

 

Thank you!

PaigeMiller
Diamond | Level 26

You can output the residuals from PROC SURVEYREG using the OUTPUT statement, and then you can plot them to take care of "residual normality, outliers, linear relationship between dependent variable and independent variables, homogeneity". I think the COVB option in the MODEL statement would address multicollinearity.
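A minimal sketch of that workflow, not a definitive implementation: the data set name mysurvey and the variables y, x1, x2, stratum, psu and wt are hypothetical placeholders, as are the output names diag, pred and resid. The OUTPUT statement saves predicted values and residuals, and PROC SGPLOT then draws residuals against predicted values.

```sas
/* Hypothetical names throughout: mysurvey, y, x1, x2, stratum, psu, wt */
proc surveyreg data=mysurvey;
   strata stratum;          /* design strata */
   cluster psu;             /* primary sampling units */
   weight wt;               /* sampling weight */
   model y = x1 x2;
   /* Save predicted values and residuals alongside the input variables */
   output out=diag predicted=pred residual=resid;
run;

/* Residuals vs. predicted values, with a reference line at zero */
proc sgplot data=diag;
   scatter x=pred y=resid;
   refline 0 / axis=y;
run;
```

Curvature in this plot would hint at a nonlinear relationship, a funnel shape would hint at non-constant variance, and isolated points far from zero are candidate outliers.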

--
Paige Miller
knighsson
Obsidian | Level 7
Thank you so much, that would be very helpful!

knighsson
Obsidian | Level 7

Hello, one more question.

 

I used the following code to try to get the residuals:

 

ods graphics on;

PROC SURVEYREG DATA=nh.outcomes nomcar;
   STRATA sdmvstra;
   CLUSTER sdmvpsu;
   CLASS alpha16 age RIAGENDR PIR SDDSRVYR RIDRETH1;
   WEIGHT glucwt4yr;
   DOMAIN eligible;
   model BMXBMI = age RIAGENDR PIR SDDSRVYR RIDRETH1 EIEER totalcounts alpha16 / stb adjrsq clparm solution vadjust=none COVB;
   lsmeans alpha16 / lines adjust=tukey;
   output out=bmi p=predict r=residual;
run;
quit;

ods graphics off;

 

But could you please tell me how to plot the residuals? I tried PROC FORECAST and PROC SGPLOT, but they are not working:

 

proc forecast data=bmi
              out=pred outfull outresid;
   id seqn;
   var age RIAGENDR PIR SDDSRVYR RIDRETH1 EIEER totalcounts alpha16;
run;

proc sgplot data=pred;
   where _type_='RESIDUAL';
   needle x=seqn y=age RIAGENDR PIR SDDSRVYR RIDRETH1 EIEER totalcounts alpha16 / markers;
run;

PaigeMiller
Diamond | Level 26

@knighsson wrote:

 

proc sgplot data=pred;
   where _type_='RESIDUAL';
   needle x=seqn y=age RIAGENDR PIR SDDSRVYR RIDRETH1 EIEER totalcounts alpha16 / markers;
run;


The NEEDLE statement allows only one variable after Y=. But I doubt you really want a NEEDLE plot here. Residuals are usually examined with scatter plots, so you can use the SCATTER statement.


Regarding _type_='RESIDUAL': you need to look (with your own eyes) inside the data set named BMI that PROC SURVEYREG creates (it is not named PRED) and see how it is structured. That will tell you whether you need a WHERE statement and what it should say, and it will show you the variable names you can use. Essentially, if you look at BMI with your own eyes, you will see everything you need to code some sort of residual plot.
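As a rough sketch of those steps: the data set name bmi and the variables predict, residual and EIEER come from the PROC SURVEYREG code posted above. Plotting against EIEER and adding a histogram are illustrative choices, and the histogram is unweighted, so treat it only as an informal check. Whether a WHERE statement is needed depends on what the inspection of BMI shows.

```sas
/* Step 1: look at how the OUTPUT data set is structured */
proc contents data=bmi;
run;

proc print data=bmi(obs=10);
run;

/* Step 2: residuals vs. predicted values (one Y= variable per statement) */
proc sgplot data=bmi;
   scatter x=predict y=residual;
   refline 0 / axis=y;
run;

/* Step 3: residuals vs. a continuous predictor, to check linearity */
proc sgplot data=bmi;
   scatter x=EIEER y=residual;
   refline 0 / axis=y;
run;

/* Step 4: informal, unweighted look at the residual distribution */
proc sgplot data=bmi;
   histogram residual;
   density residual / type=normal;
run;
```

Each SCATTER statement takes a single X= and a single Y= variable, so plotting residuals against several predictors means one PROC SGPLOT step (or one statement) per predictor.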

--
Paige Miller

