Hi,
I am trying to create a scatterplot with my variables rNPS and pNPS.
Unfortunately, the generated scatterplot looks very strange and I can't explain why.
This is my code and the output scatterplot:
ODS Graphics on;
proc sgplot data=xwrk.nps_ebenen_vollständig;
scatter x=pnps y=rnps;
run;
I already checked the correlation between those two variables and there is a positive and significant correlation.
Has anyone already had this problem and can help me?
Thanks a lot!
Both of your variables, rNPS and pNPS, take on only integer values. This is why you get the appearance shown. In addition, the correlation doesn't show up visually because you may have 1000 data points at a given position on the plot. Because the values are all integers, those points are drawn exactly on top of one another, making it look as if there is only a single data point at that position.
A potential improvement is to use the JITTER option and the JITTERWIDTH option in the SCATTER statement in PROC SGPLOT.
Paige Miller
Try this:
proc sgplot data=xwrk.nps_ebenen_vollständig;
scatter x=pnps y=rnps / jitter;
run;
The JITTER option will add a small random offset to the X and Y variables before plotting the marker. (It doesn't change your data, only the plot.)
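If the default offset is too small or too large, the JITTERWIDTH= option mentioned above can be added alongside JITTER. This is only a sketch: it assumes a SAS release whose SCATTER statement supports JITTERWIDTH=, and the value 0.8 is an arbitrary illustration, not a recommendation.

```sas
/* Sketch: same plot as above, with a wider jitter.            */
/* JITTERWIDTH= has an effect only when JITTER is also given.  */
/* The value 0.8 is an assumed starting point; adjust to taste. */
proc sgplot data=xwrk.nps_ebenen_vollständig;
   scatter x=pnps y=rnps / jitter jitterwidth=0.8;
run;
```

Trying a few different widths should show which one separates the stacked points without blurring the integer grid too much.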
With the jitter option I can now see small data points. I actually wanted to use the scatterplot to show a linear relationship between rNPS and pNPS, but because of the integer values the scatterplot is a bit confusing. Do you have any suggestions as to what I can do to show the linear relationship graphically?
To me, this does show the relationship. There is the most "ink" on the diagonal, and much less off-diagonal.
But this is a limitation of plotting data that are all integers: the linear relationship can be hard to see.
Paige Miller
Alright, thank you 🙂
Is it possible to change the colour of the data points to see the relationship a little better?
Yes, of course you can change the color, but may I recommend the TRANSPARENCY= option? It makes spots with lots of data darker than spots with only a little data. I don't know how well it will work with integer data, and you would still have to use the JITTER option, but please give it a try.
Try it without the JITTER option as well. Plotting integer data this way is something I have never tried, so trying a bunch of different variations of the plot might turn up one that you really like.
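A minimal sketch of that combination, assuming filled markers and a transparency of 0.8 (both are illustrative choices, not tested against your data):

```sas
/* Sketch: jittered scatter plot with semi-transparent markers.  */
/* TRANSPARENCY= ranges from 0 (opaque) to 1 (fully transparent). */
/* Overlapping markers accumulate, so dense areas appear darker.  */
/* The color and the value 0.8 are assumptions; adjust as needed. */
proc sgplot data=xwrk.nps_ebenen_vollständig;
   scatter x=pnps y=rnps / jitter transparency=0.8
           markerattrs=(symbol=CircleFilled color=blue);
run;
```

The COLOR= setting inside MARKERATTRS= is also where the marker color can be changed.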
Paige Miller
If you want a crude use of color, you can overlay the regression line on a heat map of the density of the data. I don't have access to your data, but it might look something like this:
%let DSName = xwrk.nps_ebenen_vollständig;
%let XVar = pNPS;
%let YVar = rNPS;
proc sgplot data=&DSName;
heatmap x=&XVar y=&YVar / xbinsize=1 ybinsize=1 colormodel=TwoColorRamp;
reg x=&XVar y=&YVar / jitter;
run;
Thank you so much, guys!
That helped a lot 😊
Use the REG statement:
proc sgplot data=xwrk.nps_ebenen_vollständig;
reg x=pnps y=rnps / jitter;
run;
https://blogs.sas.com/content/iml/2012/07/02/create-a-contour-plot-in-sas.html
Do you have additional questions? If not, please mark this thread as SOLVED.
Depending on your data, I've found bubble plots helpful to show that some points have more values than others.
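One way this might look, as a rough sketch: count the observations at each (pNPS, rNPS) position with PROC FREQ, then let the bubble size reflect that count. The output data set name work.nps_counts is made up for illustration.

```sas
/* Sketch: count observations per (pnps, rnps) combination.      */
/* OUT= writes one row per observed combination with a COUNT     */
/* variable; work.nps_counts is a hypothetical data set name.    */
proc freq data=xwrk.nps_ebenen_vollständig noprint;
   tables pnps*rnps / out=work.nps_counts;
run;

/* Draw one bubble per combination, sized by the number of points. */
proc sgplot data=work.nps_counts;
   bubble x=pnps y=rnps size=count;
run;
```

Because each position is drawn once, no jittering is needed here, and larger bubbles mark the positions where many observations coincide.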