Hello, I came across this scatterplot when trying to figure out ways to draw a quadrant scatterplot. This link depicts exactly what I need. Does anyone know how to code it? Many thanks.
Accepted Solutions
Can't program against pictures.
You should provide a NUMERIC value for the boundary between low and high frequency (I have to assume a LOW FREQUENCY group exists, as your picture doesn't show any values for it) and for the boundary between Low Cost and High Cost. Those numbers would be used in REFLINE statements.
Your description sounds more like a SCATTER plot if you want one point for each ID. A BUBBLE plot includes a third variable that sets the size of each bubble, so individual points are differentiated by a third measure, such as perhaps your average claim.
Which variable is your ID? When you discuss specifics you want to make sure to use the variable names as we do not know your data.
See if this gets you started: The refline values should match the rule you used to assign high/low, but based on the value of the caseload or claim values.
proc sgplot data=have;
   scatter x=caseload y=claim / group=eth2 datalabel=<name of your id variable>;
   refline <boundary value between low/high claim> / axis=Y;
   refline <boundary value between low/high caseload> / axis=X;
run;
To place the TEXT of your quadrant labels you will need to provide EXACTLY 4 records, each holding the claim/caseload values for where the text should appear plus the text value itself; these are not points to plot. The text is likely a different variable than your current CAT, as you have that on every record. The idea is to have a single x/y pair with one piece of text to display.
Then a TEXT statement with the x and y variables and TEXT=cat
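A rough sketch of that label approach, in case it helps. Everything marked hypothetical is an assumption and not from the original data: the data set names quad_labels and plot_data, the variables label_x, label_y and quad_text, the cutoffs 50 and 5000, and the label coordinates. It also assumes a SAS release whose PROC SGPLOT supports the TEXT statement.

```sas
/* Hypothetical label data: exactly four rows, one per quadrant.      */
/* The coordinates are placeholders; pick values inside each quadrant. */
data quad_labels;
   infile datalines dlm='|';
   input label_x label_y quad_text :$30.;
   datalines;
25|2500|Low frequency, low cost
75|2500|High frequency, low cost
25|7500|Low frequency, high cost
75|7500|High frequency, high cost
;
run;

/* Stack the labels under the real data. The label rows have missing  */
/* caseload/claim, and the real rows have missing label_x/label_y,    */
/* so each plot statement only draws its own rows.                    */
data plot_data;
   set have quad_labels;
run;

proc sgplot data=plot_data;
   /* NOMISSINGGROUP keeps the label rows (missing eth2) out of the groups */
   scatter x=caseload y=claim / group=eth2 nomissinggroup;
   text x=label_x y=label_y text=quad_text;
   refline 5000 / axis=y;   /* hypothetical low/high claim cutoff    */
   refline 50   / axis=x;   /* hypothetical low/high caseload cutoff */
run;
```

Using separate label_x/label_y variables rather than reusing caseload and claim keeps the four label rows from being drawn as scatter markers.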
One way to convert a data set to data step code is described here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... It shows how to turn an existing SAS data set into data step code that can be pasted into a forum code box (opened with the </> icon) or attached as text, so that you show exactly what you have and we have something to test code against.
@kevsma wrote:
Hello, I came across this scatterplot when trying to figure out ways to draw a quadrant scatterplot. This link depicts exactly what I need. Does anyone know how to code it? Many thanks.
Yes.
With some snark out of the way: the question might have been better phrased as "How do I code this?"
The graph is a BUBBLE plot with data labels, with two vertical and one horizontal REFLINE. YAXIS and XAXIS statements control the appearance of the axes. PROC SGPLOT should handle that just fine.
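As a starting point, the pieces named above might fit together roughly like this. This is only a sketch: the data set name have, the variables x_var, y_var, size_var, grp_var and id_var, the reference line values, and the axis labels are all hypothetical placeholders, since no data has been posted yet.

```sas
proc sgplot data=have;
   /* One bubble per record; size_var drives the bubble size */
   bubble x=x_var y=y_var size=size_var /
          group=grp_var datalabel=id_var transparency=0.4;
   /* Two vertical reference lines and one horizontal (made-up values) */
   refline 30 70 / axis=x lineattrs=(pattern=dash);
   refline 5000  / axis=y lineattrs=(pattern=dash);
   /* Axis appearance: labels and grid lines are illustrative only */
   xaxis label="X measure" grid;
   yaxis label="Y measure" grid;
run;
```

If there is no third measure to size the bubbles by, swapping the BUBBLE statement for SCATTER (without SIZE=) gives one marker per record.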
Provide some example data and we can help walk you through it.
That'd be great, thanks @ballardw !!
Below is a snippet of the sample dataset I am dealing with:
uci: unique identifier of the data
eth2: hispanic or white
svscd: type of services each UCI is receiving, categorical
claim: total spending
rank_caseload and rank_avg are the two ranking variables based on caseload and svsavgclaim
cat: 4 categories I created.
I want to draw a bubble plot with four quadrant sections (identified by the "cat" column where 4 categories are included: high-frequency, low-cost; high-frequency, high-cost; low-frequency, low-cost; and low-frequency, high-cost). I want the x-axis to be caseload values and the y-axis to be claim values or vice versa. I'd want the two ethnicities: Hispanic and White to be represented using different colors. Ideally each dot would represent an ID, so that I can see how dots are clustering around what section for the two ethnicity groups. Hope that makes sense.
Thanks again!