Alica
Fluorite | Level 6

Hi, 

I am trying to create a scatterplot with my variables rNPS and pNPS.
Unfortunately, the generated scatterplot looks very strange and I can't explain why.

 

This is my code and the output scatterplot:

 

ODS Graphics on;
proc sgplot data=xwrk.nps_ebenen_vollständig; 
scatter x=pnps y=rnps;
run; 

Alica_1-1708000180831.png

I already checked the correlation between those two variables and there is a positive and significant correlation. 

 

Has anyone already had this problem and can help me?

 

Thanks a lot!

12 REPLIES
PaigeMiller
Diamond | Level 26

Both of your variables, rNPS and pNPS, take on only integer values, which is why the plot looks the way it does. The correlation is also hard to see because you may have 1000 data points at a single position on the plot. Because the values are all integers, those points are drawn exactly on top of one another, so it looks as if there is only one data point at that position.

 

A potential improvement is to use the JITTER option and the JITTERWIDTH option in the SCATTER statement in PROC SGPLOT.
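
For example, a minimal sketch of that suggestion, reusing the data set and variable names from the question. The JITTERWIDTH= value of 0.8 is an arbitrary illustrative choice, not a recommendation, so adjust it until the clusters are readable:

proc sgplot data=xwrk.nps_ebenen_vollständig;
   /* JITTER spreads coincident markers apart; 0.8 is an assumed, illustrative width */
   scatter x=pnps y=rnps / jitter jitterwidth=0.8;
run;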

--
Paige Miller
Rick_SAS
SAS Super FREQ

Try this:

proc sgplot data=xwrk.nps_ebenen_vollständig; 
   scatter x=pnps y=rnps / jitter;
run; 

The JITTER option will add a small random offset to the X and Y variables before plotting the marker. (It doesn't change your data, only the plot.)

 

Alica
Fluorite | Level 6

With the JITTER option I can now see small data points. I actually wanted to use the scatterplot to show a linear relationship between rNPS and pNPS, but because of the integer values the scatterplot is a bit confusing. Do you have any suggestions for showing the linear relationship graphically?

 

Alica_0-1708014119754.png

 

PaigeMiller
Diamond | Level 26

To me, this does show the relationship. There is the most "ink" on the diagonal, and much less off-diagonal.

 

But, this is a limitation of plotting data like this which are all integers; the linear relationship can be hard to see.

--
Paige Miller
Alica
Fluorite | Level 6

Alright, thank you 🙂 

Is it possible to change the colour of the data points to see the relationship a little better?

PaigeMiller
Diamond | Level 26

Yes, of course you can change the color, but may I recommend the TRANSPARENCY= option? It makes spots with lots of data darker than spots with only a little data. I don't know how well it will work with integer data, and you would still have to use the JITTER option, but please give it a try.


And try it without the JITTER option as well. Plotting integer data this way is something I have never tried, so trying a bunch of different variations of the plot might turn up one that you really like.
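
A rough sketch of what that could look like, using the data set from the original post. The transparency level, marker symbol, and color below are placeholder choices that have not been tried on these data:

proc sgplot data=xwrk.nps_ebenen_vollständig;
   /* Semi-transparent filled markers: positions with many observations appear darker */
   scatter x=pnps y=rnps / jitter transparency=0.9
           markerattrs=(symbol=CircleFilled color=navy);
run;

Removing JITTER from the SCATTER statement gives the non-jittered variation for comparison.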

--
Paige Miller
Rick_SAS
SAS Super FREQ

If you want a crude use of color, you can overlay the jittered markers and a regression line on a heat map of the density of the data. I don't have access to your data, but it might look something like this:

 

%let DSName = xwrk.nps_ebenen_vollständig; 
%let XVar = pNPS;
%let YVar = rNPS;

proc sgplot data=&DSName; 
   heatmap x=&XVar y=&YVar / xbinsize=1 ybinsize=1 colormodel=TwoColorRamp;
   reg x=&XVar y=&YVar / jitter;
run; 

Rick_SAS_0-1708015952211.png

 

Alica
Fluorite | Level 6

Thank you so much, guys!

That helped a lot 😊

Rick_SAS
SAS Super FREQ

Use the REG statement:

 

proc sgplot data=xwrk.nps_ebenen_vollständig; 
   reg x=pnps y=rnps / jitter;
run; 
Rick_SAS
SAS Super FREQ

Do you have additional questions? If not, please mark this thread as SOLVED.

tc
Lapis Lazuli | Level 10

Depending on your data, I've found bubble plots helpful to show that some points have more values than others.
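
One way to sketch this, assuming the data set and variables from the original post: count the observations at each (pNPS, rNPS) combination with PROC FREQ, then size the bubbles by that count. The output data set name work.nps_counts is made up for illustration.

/* Count how many observations fall on each pNPS/rNPS combination */
proc freq data=xwrk.nps_ebenen_vollständig noprint;
   tables pnps*rnps / out=work.nps_counts;   /* OUT= data set includes a COUNT variable */
run;

/* One bubble per combination; bubble size reflects the number of observations */
proc sgplot data=work.nps_counts;
   bubble x=pnps y=rnps size=count;
run;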

Discussion stats
  • 12 replies
  • 2346 views
  • 13 likes
  • 5 in conversation