Interpreting Scatter Plot Matrix for sashelp.cars

07-11-2016 01:32 PM

Hi,

I am using the SAS Studio 3.5: Task Reference document to become more familiar with using a GUI for SAS 9.4. The dataset recommended for the example is sashelp.cars.

Within Tasks and Utilities > Tasks > Statistical Tasks > Data Exploration

I have created a Scatter Plot Matrix

With:

Continuous variables: Horsepower, MPG_City and MPG_Highway

Classification variables: Type and DriveTrain

On executing the generated code, scatter plot matrices are produced. I have attached the output.

What do these Scatter Plot Matrices signify and how can these be interpreted for this particular dataset and the analysis variables?

Also, for future reference, does SAS have any online documentation on how to interpret the graphs and charts for the examples in their documentation?

Cheers,

Sandesh.

Accepted Solutions

Solution

07-12-2016
12:35 AM

07-11-2016 02:47 PM

Click on the Code tab. That will show you the programming statements and procedures that are being executed. In this case, you will see a call to PROC SGSCATTER. Then you can Google something like

SAS 9.4 "PROC SGSCATTER" site:support.sas.com

and the link to the doc will be somewhere on the first page. Two helpful links:

For the scatter plot matrix, you can see the doc for the SGSCATTER procedure. Each cell contains a scatter plot. Most high schools discuss how to interpret a scatter plot. For these data, each dot is a vehicle. The color of the dot tells what kind of vehicle (sedan, SUV, truck,...) it is. For completeness:

- The first column and second row: This scatter plot shows MPG_City on the vertical axis and Horsepower on the horizontal axis. The plot shows that vehicles that have high horsepower (300 or more) tend to have low MPG_City (less than 20). Conversely, low-horsepower vehicles (about 100 HP) tend to have high MPG_City (30 mpg or more). The relationship between Horsepower and MPG_City does not appear to be linear.
- The first column and third row: This scatter plot shows MPG_Highway on the vertical axis and Horsepower on the horizontal axis. The interpretation is similar to the one above.
- The second column and third row: This scatter plot shows MPG_Highway on the vertical axis and MPG_City on the horizontal axis. Here there appears to be a nearly linear relationship, which indicates high correlation between the two measures of fuel economy.

The upper triangular cells in the matrix repeat the information in the lower triangular cells, except that the horizontal and vertical variables are flipped. If you want to use linear regression to predict a response variable (maybe MPG) from another variable (maybe Horsepower), you would find the plot that has the response variable on the vertical axis and the independent (explanatory) variable on the horizontal axis.
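To make the cells described above concrete, here is a minimal sketch of the kind of program that produces such a matrix, followed by a PROC CORR step that puts numbers on the relationships. It assumes the standard sashelp.cars data set and uses only the Type grouping; the code that the SAS Studio task actually generates may differ (for example, it may produce one matrix per classification variable).

```sas
/* Sketch: the three continuous variables from the question, */
/* with each point colored by vehicle Type.                  */
proc sgscatter data=sashelp.cars;
  matrix Horsepower MPG_City MPG_Highway / group=Type;
run;

/* Pearson correlations for the same variables, to quantify  */
/* what each scatter plot cell shows visually.               */
proc corr data=sashelp.cars;
  var Horsepower MPG_City MPG_Highway;
run;
```

A correlation close to 1 between MPG_City and MPG_Highway would match the nearly linear cell, while negative correlations with Horsepower would match the downward-sloping cells. Keep in mind that a Pearson correlation summarizes only the linear part of a relationship.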

All Replies


07-12-2016 12:44 AM

Thanks very much @Rick_SAS. It has been ages since high school, so I have very little memory of the statistics classes I took then. Your explanation is very comprehensive and helpful. The pointers to resources are much appreciated. I will keep them in mind for future data interpretation as well.

Have a nice day.

Cheers,

Sandesh.

07-12-2016 06:39 AM

Hi @Rick_SAS,

I have gone back and had a look at the charts with the interpretation you provided. One additional point that I wanted to make was, the segment in row 3, column 1 (Horsepower vs. MPG_City) has a thicker plot compared to that in row 2, column 1 (Horsepower vs. MPG_Highway).

I would interpret these charts as showing that a vehicle is more fuel efficient on the highway than a similar vehicle in the city. This confirms what we generally know about driving in the city vis-à-vis driving on the highway.

Cheers,

Sandesh.
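The highway-versus-city comparison can also be checked numerically rather than by eye. A minimal sketch, assuming the standard sashelp.cars data set; the choice of statistics is only an illustration:

```sas
/* Sketch: compare the level and spread of the two fuel-economy measures. */
proc means data=sashelp.cars n mean median std min max;
  var MPG_City MPG_Highway;
run;
```

The mean and median show which measure is higher overall, while the standard deviation describes how spread out the values are, which is what a "thicker" cloud of points in a scatter plot reflects.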

Monday - last edited yesterday

We can also build a scatter plot that involves more than two variables by plotting them in pairs. In the example below we consider three variables and draw a **scatter plot matrix**, which shows a scatter plot for each of the three pairs of variables.

**Example**

/* CARS1 is the poster's data set; it is assumed here to be a copy or subset of sashelp.cars. */
proc sgscatter data=CARS1;
  matrix horsepower invoice length / group=type;
  title 'Horsepower vs. Invoice vs. Length for car makers by types';
run;

When we execute the above code, we get the following output: