newmkka
Calcite | Level 5

An automaker is planning to advertise its newest car. The most important component of the advertisement is the car's fuel economy, measured in miles per gallon (MPG). Consequently, the marketing people hired by the automaker want to predict miles per gallon from three variables: the total weight of the car in kilograms (WEIGHT), the acceleration of the car in seconds (ACCELERATION), and the power in horsepower (POWER).

My sample data looks like this:

MPG   POWER   WEIGHT   ACCELERA
18    130     3504     12
15    165     3652     11.2
...   ...     ...      ...

Which PROC is suitable for answering the question below?

If you know a good answer, please help me.

Q. Using an appropriate PROC procedure, provide one plot which visualizes a degree of dependence between given variables (this is exactly the part I don't know).
a. What can you conclude based on the plot?
b. Are any regression assumptions (which can be visually examined) violated?
c. If so, which assumptions and for which variables? Is there any way to fix those problems?

Thanks in advance.

Okay, I figured this part out.

Can you help me with questions a, b, and c about the plot?

I will attach one JPG.

PROC SGSCATTER

sgscatter.png
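For readers who cannot see the attachment, a scatter plot matrix like the one described can be produced roughly as follows. This is only a sketch: the data set name `cars` is an assumption (the post does not name it), and the variable names are taken from the sample data header above.

```sas
/* Sketch only: "cars" is a hypothetical data set name. */
/* Variable names follow the sample data header.        */
proc sgscatter data=cars;
   /* Pairwise scatter plots, with a histogram of each variable on the diagonal */
   matrix mpg power weight accelera / diagonal=(histogram);
run;
```

Each off-diagonal cell shows one pair of variables, so curvature or uneven spread in the MPG panels is what questions b and c are asking about.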

4 REPLIES
Ksharp
Super User

My knowledge of statistical theory is limited. I hope some experts can give you more detailed information.

Q. Using an appropriate PROC procedure, provide one plot which visualizes a degree of dependence between given variables (this is exactly the part I don't know).

PROC CORR can do this job.
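A minimal sketch of that suggestion, assuming the data sit in a data set called `cars` (a made-up name) with the variable names from the original post. With ODS Graphics on, PLOTS=MATRIX gives a single scatter plot matrix alongside the table of Pearson correlations.

```sas
/* Sketch only: "cars" is a hypothetical data set name. */
ods graphics on;

proc corr data=cars plots=matrix(histogram);
   /* Correlation table plus one scatter plot matrix for all four variables */
   var mpg power weight accelera;
run;
```

The correlation table gives the numeric "degree of dependence", while the matrix plot shows whether each relationship is actually linear.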

a. What can you conclude based on the plot?

Power and weight have a high positive correlation, which means there is multicollinearity, so you need to drop one of them when you build a regression model.

MPG has a high negative correlation with both power and weight.

b. Are any regression assumptions (which can be visually examined) violated?

Multicollinearity is the main issue.

c. If so, which assumptions and for which variables? Is there any way to fix those problems?

Drop one of them (power or weight), so the model might be MPG = weight accelera.

I prefer to choose weight, not power, because power and MPG have more correlation than weight and MPG.
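Before dropping a predictor, the collinearity can also be checked numerically. The sketch below assumes a data set named `cars` (hypothetical) and fits the full model with variance inflation factors, then the reduced model suggested above.

```sas
/* Sketch only: "cars" is a hypothetical data set name. */
proc reg data=cars;
   /* Full model: VIF and TOL flag predictors that are nearly   */
   /* linear combinations of the others; COLLIN adds condition  */
   /* indices for the design matrix.                            */
   full:    model mpg = power weight accelera / vif tol collin;
   /* Reduced model with POWER dropped */
   reduced: model mpg = weight accelera / vif;
run;
quit;
```

A common rule of thumb treats a VIF above about 10 as a sign of serious collinearity, but that cutoff is a convention, not a formal test.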

Ksharp

ballardw
Super User

I would also note that the relationships between MPG and either Weight or Power appear to be non-linear.

I would also be tempted to examine this data in PROC G3D.
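A rough sketch of what that could look like, assuming SAS/GRAPH is licensed and the data set is called `cars` (a hypothetical name). The choice of which variables go on which axis is mine, not something stated in the thread.

```sas
/* Sketch only: "cars" is a hypothetical data set name. */
/* SCATTER syntax is y*x=z, so MPG is the vertical axis. */
proc g3d data=cars;
   scatter power*weight=mpg;
run;
quit;
```

Rotating or tilting the plot can make it easier to see whether MPG falls off along a curved surface instead of a plane.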

Reeza
Super User

There's more than one way to answer this question; it depends a bit on what they're looking for, and I'm not 100% sure.

Using an appropriate PROC procedure, provide one plot which visualizes a degree of dependence between given variables

For some reason, the answer that pops into my mind is PROC REG with some of the plot statements (CORR) in that procedure. PROC SGSCATTER may be appropriate as well, but it looks at the relationship between variables, and I'm not sure I'd classify that as 'degree of dependence'.


What can you conclude based on the plot?


Are any regression assumptions (which can be visually examined) violated?

My answer to this would be the normal distribution of the error terms. PROC REG has some diagnostic plots that let you determine whether it is violated.

Multicollinearity is also a valid suggestion.
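As a sketch of those diagnostics, assuming a data set named `cars` (hypothetical) and the variable names from the original post:

```sas
/* Sketch only: "cars" is a hypothetical data set name. */
ods graphics on;

proc reg data=cars plots(only)=(diagnostics residuals);
   /* DIAGNOSTICS: panel with residual-vs-predicted, Q-Q plot, */
   /* residual histogram, and related fit plots.               */
   /* RESIDUALS: residuals plotted against each regressor.     */
   model mpg = power weight accelera;
run;
quit;
```

In the panel, a curved Q-Q plot points to non-normal errors, and a bend or funnel shape in the residual plots points to nonlinearity or non-constant variance.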


If so, which assumptions and how to fix?

Fix multicollinearity by not including both terms.

Fix non-normal distribution of errors by transforming variables as required, i.e. a log transform for certain variables.
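One way such a transformation could be tried is sketched below. The data set name `cars`, the new variable names, and the choice to log both MPG and WEIGHT are all assumptions for illustration; which variables actually need transforming depends on the plots.

```sas
/* Sketch only: "cars", "cars_log", log_mpg, and log_weight are hypothetical names. */
data cars_log;
   set cars;
   log_mpg    = log(mpg);      /* natural log of the response   */
   log_weight = log(weight);   /* natural log of one predictor  */
run;

ods graphics on;

proc reg data=cars_log plots(only)=(diagnostics);
   /* Refit with POWER left out and check whether the residual plots improve */
   model log_mpg = log_weight accelera;
run;
quit;
```

Comparing the diagnostics panel before and after the transformation shows whether it actually helped.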

newmkka
Calcite | Level 5

Thanks to all!

These were my answers. Are they too bad? : )

a. POWER and WEIGHT have a linear, positive relationship. However, the other variables have nonlinear, negative relationships. Only MPG and ACCELERATION appear to have no relationship.

b. Normality and Linearity may be violated.

c. Normality and Linearity are violated for POWER and WEIGHT. In such cases, a nonlinear transformation of the variables might cure both problems.
