jlima22
Calcite | Level 5

Part of my assignment reads as follows: 

Your goal is to create a regression model that can predict the income based on age in years. 

  1. Create a scatter plot for age vs. income.
    1. I believe I have done this (screenshot attached: Screen Shot 2021-01-22 at 8.15.43 PM.png).
  2. Fit a linear regression model in SAS
  3. How good is the overall fit of the income determination model? (Use both R² as well as the F statistics to justify your answer.)
  4. Compute the predicted income for a 27-year-old individual.

The second part of the assignment is giving me a hard time. I'm using the Linear Regression task but I'm confused as to how to set it up. Would age or income be the dependent variable? Would the remaining variable then be used as the classification variable/intercept?

 

 Any help would be greatly appreciated! 

mkeintz
PROC Star

predict the income based on age

One of these variables is the predictor, one the "predictee".  Which is which should be evident from the quoted phrase above.

 

Google "dependent variable" and you will likely have started your road to understanding regression.

PaigeMiller
Diamond | Level 26

In addition to what @mkeintz said, there are no CLASS variables here, only continuous variables. If you include CLASS variables in your model, you will get gibberish. So please look up what a CLASS variable is and is not.

--
Paige Miller
STAT_Kathleen
SAS Employee
The INCOME variable is the dependent variable and AGE is the independent variable. You can use PROC REG to estimate a linear regression model. For example:
/* fit the linear regression model, print the parameter estimates, and create an output data set with the predicted values */
PROC REG DATA=YOURDATA;
MODEL INCOME=AGE;
OUTPUT OUT=OUT P=PREDINC;
RUN;
QUIT;

/* print the output data set with the predicted values */
PROC PRINT DATA=OUT;
RUN;
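
For the last item in the assignment (predicted income for a 27-year-old), one common approach is to append an observation with AGE=27 and a missing INCOME before fitting. PROC REG leaves that row out of the estimation but still computes a predicted value for it. The sketch below is only an illustration: YOURDATA is the placeholder name used above, the data set names SCORE, BOTH and PRED are made up for this example, and it assumes AGE and INCOME are numeric variables.

```sas
/* SCORE, BOTH and PRED are hypothetical data set names used only for this sketch */

/* one extra row: the age to score, with the response left missing */
data score;
   age = 27;
   income = .;
run;

/* stack the extra row under the original data (YOURDATA is a placeholder) */
data both;
   set yourdata score;
run;

/* the row with missing INCOME is not used in the fit but still gets a predicted value */
proc reg data=both;
   model income = age;
   output out=pred p=predinc;
run;
quit;

/* show only the scored row */
proc print data=pred;
   where age = 27 and income = .;
run;
```

The default PROC REG output also includes the Analysis of Variance table (with the F value and its p-value) and the R-Square value, which are the pieces the assignment asks about for overall fit.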
