Importing Data/Variables and Simple Linear Regression

Occasional Contributor
Posts: 13

I was asked to do a simple linear regression where Y = Total SAT and X = Expenditure. I was given an Excel spreadsheet to import (attached). Here is my code:

LIBNAME MyLib "D:\Statistical Data Management\MyLib\";
     proc import datafile = "D:\Statistical Data Management\MyLib\Projects\Project #2\SAT.xlsx"
             out=TotalSATScore
             dbms=xlsx
             replace;
             sheet="Sheet1";
             getnames=yes;
run;
PROC UNIVARIATE;
            VAR Expenditure TotalSATScore ;
             HISTOGRAM/NORMAL;
TITLE1 'PROC UNIVARIATE FOR THE VARIABLE Total SAT Score and Expenditure';
TITLE2 'INCLUDES TEST OF NORMALITY';
RUN;
TITLE1;
TITLE2;

SYMBOL VALUE=DOT COLOR=BLACK;

 

PROC REG SIMPLE DATA=TotalSATScore;
            MODEL TotalSATScore=Expenditure;
            OUTPUT OUT=PREDRES P=PREDICTED R=RESIDUAL;
            PLOT TotalSATScore*Expenditure RESIDUAL.*Expenditure;
RUN;


/*Normality Check of Residuals */
PROC UNIVARIATE DATA=PredRes NORMAL;
            VAR RESIDUAL;
            HISTOGRAM/normal;
            PROBPLOT/normal(mu=est sigma=est);
            QQPLOT/normal(mu=est sigma=est);
TITLE 'Normality Check of Residuals';
RUN;
title;
quit;

 

However, my log says that the variable "TotalSATScore" is not found. My question: how do I fix that? How do I refer to the variable when its name has three words? Since I can't get past that, I do not have any output yet. If possible, could you check whether the rest of the code looks okay?


Accepted Solutions
Solution
2 weeks ago
PROC Star
Posts: 768

Re: Importing Data/Variables and Simple Linear Regression

1. It is never a good idea to use the same name for a data set and a variable. You have a data set called TotalSATScore and you are referring to a variable called TotalSATScore.

 

2. It is a good idea to specify the data set in each procedure instead of relying on the procedure to figure it out (without DATA=, SAS uses the most recently created data set). Therefore, write PROC UNIVARIATE DATA=TotalSATScore in your first PROC UNIVARIATE step.

 

3. Run the following code:

proc contents data=TotalSATScore; run;

Then see which variables the data set actually contains. If SAS tells you that a variable is not there, it is not there. Check the PROC CONTENTS output to see what the desired variable is actually called, if it is there at all. If it is not there, check your PROC IMPORT step for errors, or post your entire log for more information.
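Points 2 and 3 can be sketched as follows. This is a minimal sketch, assuming the PROC IMPORT step from the question ran without errors and created WORK.TotalSATScore. The variable name Total_SAT_Score is a hypothetical placeholder, not something confirmed by the thread; substitute whatever name PROC CONTENTS reports.

```sas
/* List the variables PROC IMPORT actually created, in column order. */
proc contents data=TotalSATScore varnum;
run;

/* Name the data set explicitly rather than relying on the last one created. */
/* Total_SAT_Score is a placeholder: use the name shown by PROC CONTENTS.    */
proc univariate data=TotalSATScore normal;
    var Expenditure Total_SAT_Score;
    histogram / normal;
run;
```

The VARNUM option only changes the order of the listing, which makes it easier to match the variables against the spreadsheet columns.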

 




Occasional Contributor
Posts: 13

Re: Importing Data/Variables and Simple Linear Regression

This was incredibly helpful, thank you so much!! Have a blessed day.

PROC Star
Posts: 768

Re: Importing Data/Variables and Simple Linear Regression

Anytime, glad to help.

 

You too, have a nice day.

Super User
Posts: 7,997

Re: Importing Data/Variables and Simple Linear Regression

Please refer to other posts on here for how to post and format code. What you have posted is totally unreadable: mixed case, no indentation, etc. Also, post test data in the form of a data step in the body of the post using the {i} code window; I am not risking my machine by downloading Excel files.
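For illustration, test data posted as a data step might look like the sketch below. The data set name, the variable names, and every value are made up for the example; none of them are taken from the attached spreadsheet.

```sas
/* Illustrative only: placeholder values, not the contents of SAT.xlsx. */
data work.sat_example;
    input Expenditure Total_SAT_Score;
    datalines;
4.4 1029
9.0 934
4.8 944
;
run;
```

Anyone reading the thread can paste a step like this into their own session and reproduce the problem without opening an attachment.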

 

As for your question, examine the dataset TotalSATScore: it lacks a variable called TotalSATScore.

Most likely this is because you are using Excel, the worst possible data source, and compounding that with PROC IMPORT, a procedure that guesses what the data is supposed to look like. Check the structure of the dataset that was created and you will see that the variable has a different name, or that there is some other issue.
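One common cause of this symptom, offered as an assumption rather than a diagnosis of the attached file: if the column header in the spreadsheet is three words separated by spaces (for example "Total SAT Score", a hypothetical header), PROC IMPORT will not create a variable named TotalSATScore. With the VALIDVARNAME=V7 system option, the spaces are typically replaced by underscores. A rough sketch under that assumption, with WORK.SAT as an illustrative data set name:

```sas
/* Request V7-style names so spaces in column headers become underscores. */
options validvarname=v7;

proc import datafile="D:\Statistical Data Management\MyLib\Projects\Project #2\SAT.xlsx"
        out=work.sat        /* illustrative name that differs from any variable name */
        dbms=xlsx
        replace;
    sheet="Sheet1";
    getnames=yes;
run;

/* Confirm the variable names before using them. */
proc contents data=work.sat;
run;

/* Total_SAT_Score is assumed; use the name PROC CONTENTS reports. */
proc reg data=work.sat;
    model Total_SAT_Score = Expenditure;
    output out=predres p=predicted r=residual;
run;
quit;
```

If the session runs with VALIDVARNAME=ANY instead, the header is kept as written and the variable has to be referenced as a name literal, for example 'Total SAT Score'n.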
