How do we find the interaction between two variables when running SAS for the prediction equation

data PEC;
input energyconservationscale studentattitudesscale gender $;
interaction studentattitudesscale * gender $;
datalines;
50 30 male
50 46 Female
30 45 male
38 36 Female
41 30 male
15 15 Female
20 38 male
15 50 Female
45 37 male
25 37 Female
45 47 male
50 29 Female
22 36 male
48 36 Female
40 37 male
30 14 Female
38 48 male
12 38 Female
12 18 male
15 25 Female
10 36 male
45 29 Female
24 26 male
34 27 Female
49 25 male
28 50 Female
25 26 male
10 38 Female
10 10 male
50 30 male
;
proc corr;
var studentattitudesscale energyconservationscale gender;
proc reg;
model energyconservationscale = studentattitudescale gender/ stb scorr2 p r vif;
plot residual.*predicted.;
output out=res residual=resid;
predicted = preg;
proc univariate plot;
var resid;

 


Accepted Solutions
Solution
‎05-01-2016 11:28 PM
Super User
Posts: 19,822

Re: How do we find the interaction between two variables when running SAS for the prediction equatio

[ Edited ]

Your log has a lot of errors, and resolving them is the place to start.

You really should run your code in sections and make sure each section runs cleanly before proceeding. I think I mentioned this before, but it bears repeating.

So, to fix your code we need to fix the errors. In general, the changes include:

  1. Adding RUN after each proc/data step
  2. Adding DATA= so that each proc explicitly refers to its input data set
  3. Removing the categorical variable (gender) from places where only continuous variables are allowed
  4. Adding title statements to describe what you're doing in the output
  5. Note: I can't reproduce the graphs/plot statements because I'm using SAS UE, so I hope they work for you. I had to remove the plot statements to test the code, but I left them in for you. The code works without them.
  6. There are more errors, and they're itemized before each step.

 

*This step reads the data into PEC, which is stored in your WORK library. That means it's stored temporarily, and each time you launch SAS you'll need to re-run the code.;

*The errors in this step come from the interaction statement. It is not a valid SAS statement, and the log shows an error saying so.;

*I've added a new variable, gender_code, which dummy codes gender so that you can use it in the model and in the interaction term. PROC REG doesn't have a CLASS statement, otherwise I'd use that;

*Create the interaction term in the data step, because the MODEL statement in PROC REG does not support the V1*V2 syntax;

*Add a RUN at the end. It's not strictly required in this step, but it makes the code more readable;

 

data PEC;
input energyconservationscale studentattitudesscale gender $;
if gender='male' then gender_code=1;
else gender_code=0;
interaction_att_gender= studentattitudesscale*gender_code;
datalines;
50 30 male
50 46 Female
30 45 male
38 36 Female
41 30 male
15 15 Female
20 38 male
15 50 Female
45 37 male
25 37 Female
45 47 male
50 29 Female
22 36 male
48 36 Female
40 37 male
30 14 Female
38 48 male
12 38 Female
12 18 male
15 25 Female
10 36 male
45 29 Female
24 26 male
34 27 Female
49 25 male
28 50 Female
25 26 male
10 38 Female
10 10 male
50 30 male
;
run;
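
One thing that may be worth checking after this step (a sketch on my part, not one of the original fixes): the comparison gender='male' is case-sensitive, so a row typed as 'Male' would quietly end up with gender_code=0. A cross-tabulation of the two variables shows whether the coding came out as intended. It assumes the PEC data set created above.

```sas
/* Illustrative sanity check: each gender value should map to exactly one gender_code */
proc freq data=PEC;
  title 'Check dummy coding of gender';
  tables gender*gender_code / list missing;
run;
title;
```

If any gender value shows up against the wrong code, fix the IF condition (or the data) before running the regressions.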

 

 

*In this step you're testing for correlation between variables. Correlation is generally a measure between continuous variables, and gender is a character variable, so it cannot be in the VAR list. The log shows an error indicating this.;

*Remove gender;

*Add a RUN;

*Add a DATA= option to point explicitly to the input data set. This helps if you re-run things later on, and it makes sure you know exactly what is happening in your code;

*Add a title statement to help describe the output;

 

proc corr data=PEC;
title 'Correlation between continuous variables';
var studentattitudesscale energyconservationscale;
run;
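
If you do want gender in the correlation table, one option (my assumption, not something the original code did) is to use the numeric gender_code from the data step. The Pearson correlation between a 0/1 variable and a continuous variable is the point-biserial correlation. A minimal sketch:

```sas
/* Illustrative: gender_code is numeric, so PROC CORR accepts it in the VAR list */
proc corr data=PEC;
  title 'Correlations including dummy-coded gender';
  var studentattitudesscale energyconservationscale gender_code;
run;
title;
```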

 

 

*The next step is fixing the regression;

*Add RUN;

*Add QUIT;

*PROC REG is an interactive procedure, one of the few that needs a QUIT statement;

*Add DATA=;

*Remove the extra semicolon in the OUTPUT statement. PREDICTED= should be part of the same statement;

*Replace gender with the gender_code variable, as a numeric variable is required in the MODEL statement;

*Fix the spelling of studentattitudesscale in the MODEL statement. It was missing an s;

*Add a title statement;

*Add an ODS GRAPHICS ON statement if you want the graphs, since the procedure will generate a bunch of them. This is usually on by default so you may not need it;

*Some of these threw errors and others are good coding practice;

 

proc reg data=PEC;
title 'Regression model without interaction term';
model energyconservationscale = studentattitudesscale gender_code/ stb scorr2 p r vif;
plot residual.*predicted.;
output out=res residual=resid predicted = preg;
run;quit;
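
The comments for this step mention ODS GRAPHICS ON, but the step itself doesn't show it. Here is a minimal sketch of what that could look like, using the PLOTS= option on the PROC REG statement in place of the PLOT statement. The choice of DIAGNOSTICS and RESIDUALS is mine; check the PROC REG documentation for the plot requests your release supports.

```sas
/* Illustrative: request ODS diagnostic plots instead of using the PLOT statement */
ods graphics on;
proc reg data=PEC plots=(diagnostics residuals);
  title 'Regression model without interaction term (ODS plots)';
  model energyconservationscale = studentattitudesscale gender_code / stb vif;
run; quit;
title;
```

The diagnostics panel includes a residual-by-predicted plot, which covers what the PLOT statement was asking for.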

 

 

*Now add the interaction term to the model;

*Change the output data set to out=res2 so that the residuals from this model go to a different data set;

 

proc reg data=PEC;
title 'Regression model with interaction term';
model energyconservationscale = studentattitudesscale gender_code  interaction_att_gender/ stb scorr2 p r vif;
plot residual.*predicted.;
output out=res2 residual=resid predicted = preg;
run;quit;
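
To tie this back to the prediction equation, here is a hedged sketch of how the interaction model can be read and how the interaction term can be tested. The TEST statement and its label are my additions, and b0 to b3 are only shorthand for the values in the Parameter Estimates table.

```sas
/* Illustrative only. With gender_code = 1 for male and 0 for Female,
   the fitted prediction equation has the form
     predicted = b0 + b1*attitudes + b2*gender_code + b3*(attitudes*gender_code)
   Female (gender_code=0): predicted = b0 + b1*attitudes
   male   (gender_code=1): predicted = (b0 + b2) + (b1 + b3)*attitudes */
proc reg data=PEC;
  title 'Test of the interaction term';
  model energyconservationscale = studentattitudesscale gender_code interaction_att_gender;
  /* InteractionTest is just a label; the statement tests whether b3 = 0 */
  InteractionTest: test interaction_att_gender = 0;
run; quit;
title;
```

If the test gives no evidence that b3 differs from zero, the two groups share roughly the same slope and the simpler model without the interaction term may be enough.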

 

*And last but not least, PROC UNIVARIATE on the residuals from both models, with and without the interaction term;

*Add DATA=;

*Add RUN;

*Note that SAS produces a lot of analysis for residuals by default, so pay attention to the graphs it produces.;

 

 

proc univariate data=res plot;
title 'Residuals for model without interaction term';
var resid;
run;

proc univariate data=res2 plot;
title 'Residuals for model with interaction term';
var resid;
run;

*Erase title so it doesn't keep appearing;
title;
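
If you want a formal check of the residuals on top of the default output, the NORMAL option together with the HISTOGRAM and QQPLOT statements is one way to get it. This is a sketch, assuming the res2 data set written by the interaction model:

```sas
/* Illustrative: normality tests and plots for the residuals */
proc univariate data=res2 normal;
  title 'Normality checks for residuals (interaction model)';
  var resid;
  histogram resid / normal;
  qqplot resid / normal(mu=est sigma=est);
run;
title;
```

Swap in data=res to run the same checks for the model without the interaction term.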

 

 

Good luck.

If you have further questions or something doesn't work, please be as explicit as possible in your descriptions.


All Replies
Super User
Posts: 19,822

Re: How do we find the interaction between two variables when running SAS for the prediction equatio

[ Edited ]

Your interaction statement does not belong in your first data step. That first step only reads in the data.

You may want to consider adding comments to your code to clarify which part is doing what.

Interactions would be examined in the regression model: you would include the interaction as a term in the model, with an asterisk between the variables.

Variable1*variable2

Please use the code editor when including code in your posts. It's the i or notepad/man icon in the editor, just under the word Preview for me.

EDIT:

The variable1*variable2 method doesn't work in PROC REG; you have to explicitly create the term. I posted code with a lot of comments in a separate reply.
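
For reference, the asterisk syntax does work in procedures that have a CLASS statement. Here is a minimal sketch with PROC GLM, which is my substitution and not what the posted solution uses. It assumes the PEC data set with the character variable gender.

```sas
/* Illustrative alternative: PROC GLM handles the coding of gender and the interaction itself */
proc glm data=PEC;
  class gender;
  model energyconservationscale = studentattitudesscale gender
        studentattitudesscale*gender / solution;
run; quit;
```

PROC GLM picks its own reference level for a CLASS variable, so the individual coefficients may not match a hand-made 0/1 dummy in PROC REG, even though the fitted values should agree.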


Contributor
Posts: 46

Re: How do we find the interaction between two variables when running SAS for the prediction equatio

Hi Reeza,

 

Thank you so very much for all your help. Believe it or not, I have benefited under your tutelage. You exude a passion for your subject, which motivates me. Statistics has never been my forte, but you have not only provided me with answers but also challenged me to focus on each aspect of the code. I am a work in progress, but I believe that I will get there.

Again, thank you very much,

 

Josie
