Hi,
I am using the SAS Studio 3.5: Task Reference document to become more familiar with using a GUI for SAS 9.4. The dataset recommended for the example is sashelp.cars.
Within Tasks and Utilities > Tasks > Statistical Tasks > Data Exploration
I have created a Scatter Plot Matrix
With:
Continuous variables: Horsepower, MPG_City and MPG_Highway
Classification variables: Type and DriveTrain
Executing the generated code produces scatter plot matrices. I have attached the output.
What do these Scatter Plot Matrices signify and how can these be interpreted for this particular dataset and the analysis variables?
Also, for future reference, does SAS have any online documentation on how to interpret the graphs and charts in its documentation examples?
Cheers,
Sandesh.
Accepted Solutions
Click on the Code tab. That will show you the programming statements and procedures that are being executed. In this case, you will see a call to PROC SGSCATTER. Then you can Google something like
SAS 9.4 "PROC SGSCATTER" site:support.sas.com
and the link to the doc will be somewhere on the first page. Two helpful links:
For the scatter plot matrix, you can see the doc for the SGSCATTER procedure. Each cell contains a scatter plot. Most high schools discuss how to interpret a scatter plot. For these data, each dot is a vehicle. The color of the dot tells what kind of vehicle (sedan, SUV, truck,...) it is. For completeness:
- The first column and second row: This scatter plot shows MPG_City on the vertical axis and Horsepower on the horizontal axis. The plot shows that vehicles that have high horsepower (300 or more) tend to have low MPG_City (less than 20). Conversely, low-horsepower vehicles (about 100 HP) tend to have high MPG_City (30 mpg or more). The relationship between Horsepower and MPG_City does not appear to be linear.
- The first column and third row: This scatter plot shows the MPG for Highway on the vertical axis and the Horsepower on the horizontal. The interpretation is similar to above.
- The second column and third row: This scatter plot shows MPG_Highway on the vertical axis and MPG_City on the horizontal. Here there appears to be a nearly linear relationship, which indicates high correlation between the two measures of fuel economy.
The upper triangular cells in the matrix repeat the information in the lower triangular cells, except that the horizontal and vertical variables are flipped. If you want to use linear regression to predict a response variable (maybe MPG) from another variable (maybe Horsepower), you would find the plot that has the response variable on the vertical axis and the independent (explanatory) variable on the horizontal axis.
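For reference, here is a minimal sketch of the kind of program the Code tab might show, followed by two optional steps that put numbers on the visual impressions described above. The code that the task actually generates is likely longer and may set other options. In particular, grouping by Type alone is an assumption; the thread does not confirm how the task handles the two classification variables. The PROC CORR and PROC REG steps are illustrative additions, not part of the task output.

```sas
/* Scatter plot matrix of the three continuous variables.         */
/* GROUP=Type colors the markers by vehicle type (an assumption   */
/* about how the task uses the classification variables).         */
proc sgscatter data=sashelp.cars;
  matrix Horsepower MPG_City MPG_Highway / group=Type;
run;

/* Pearson correlations for the same variables, to quantify the   */
/* nearly linear MPG_City vs. MPG_Highway relationship.           */
proc corr data=sashelp.cars;
  var Horsepower MPG_City MPG_Highway;
run;

/* Simple linear regression with the response (MPG_City) on the   */
/* vertical axis and the explanatory variable (Horsepower) on the */
/* horizontal axis, matching the first-column, second-row cell.   */
proc reg data=sashelp.cars;
  model MPG_City = Horsepower;
run;
quit;
```

Because the Horsepower vs. MPG_City relationship does not look linear in the plot, the PROC REG fit is best read as a first look rather than a final model.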
Thanks very much @Rick_SAS. It has been ages since high school so there's very little memory of the statistics classes I took then 🙂 Your explanation is very comprehensive and helpful. The pointers to resources are much appreciated. Will keep them in mind for future data interpretation as well.
Have a nice day.
Cheers,
Sandesh.
Hi @Rick_SAS,
I have gone back and had a look at the charts with the interpretation you provided. One additional point that I wanted to make was, the segment in row 3, column 1 (Horsepower vs. MPG_City) has a thicker plot compared to that in row 2, column 1 (Horsepower vs. MPG_Highway).
I would interpret these charts as showing that a vehicle is more fuel efficient on the highway than a similar vehicle in the city. This confirms what we generally know about driving in the city vis-à-vis driving on the highway.
Cheers,
Sandesh.
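One way to check the city-versus-highway observation numerically is to summarize both MPG variables and their per-vehicle difference. This is only a sketch: the data set name work.cars_mpg and the variable MPG_Diff are hypothetical names introduced for illustration and do not appear in the thread.

```sas
/* Compute the per-vehicle gap between highway and city mileage.  */
/* work.cars_mpg and MPG_Diff are hypothetical names.             */
data work.cars_mpg;
  set sashelp.cars;
  MPG_Diff = MPG_Highway - MPG_City;
run;

/* Summary statistics: a positive mean and median for MPG_Diff    */
/* would support the reading that vehicles are more fuel          */
/* efficient on the highway than in the city.                     */
proc means data=work.cars_mpg n mean median min max;
  var MPG_City MPG_Highway MPG_Diff;
run;
```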