Hello community.
I want to generate a scatter plot with custom markers that are specified in an attribute map. I have IDs, each with 1-2 values that have different cat1 and cat2 values. For each point, I want to assign the marker symbol based on cat1 and the fill color based on cat2. Cat1 takes one of 5 predetermined values, each with its own symbol. Cat2 takes one of 2 values, coded as red or blue. So my attrmap looks like this:
ID | markersymbol | fillcolor
1 | circlefilled | red
1 | circlefilled | blue
2 | diamondfilled | red
2 | starfilled | blue
I understand that this won't work because each ID has conflicting attributes, but how do I avoid that, and how should I map the attributes?
I can't remap the ID to include cat2, because this scatter plot has to be drawn on top of a double waterfall plot.
In the example below, I used the class data set just to drive the example; but the technique should transfer to your use case. The real trick here is to overlay a SERIES plot instead of a SCATTER plot, because the attributes in that plot can independently address an attributes map. The key is to set the line thickness to 0 so that only the markers appear.
Two attribute maps are in the data set. ID1 is used for the bar chart and ID2 is used for the SERIES plot. When the bar chart references ID1, the MARKERSYMBOLs in that map are ignored, so the symbol values there do not matter. ID2 is referenced through MSATTRID= for the marker symbol group variable (GROUPMS=), so the FILLCOLORs in the ID2 map are ignored. For the marker KEYLEGEND, I set the type to MARKERSYMBOL so that only the unique symbol shapes are shown. You can set an appropriate title in each legend.
Let me know if you have any questions about this technique.
/* Summarize mean weight for each age/sex combination */
proc summary data=sashelp.class nway;
  class age sex;
  var weight;
  output out=testdata mean=weight;
run;

/* One data set holding two attribute maps: ID1 (bars) and ID2 (markers) */
data attrmap;
  length fillcolor $ 9 markersymbol $ 15;
  input ID $ value $ fillcolor $ markersymbol $;
  cards;
ID1 F lightpink circlefilled
ID1 M lightblue circlefilled
ID2 11 red circlefilled
ID2 12 red homedownfilled
ID2 13 blue diamondfilled
ID2 14 green squarefilled
ID2 15 orange trianglefilled
ID2 16 purple starfilled
;
run;

proc sgplot data=testdata dattrmap=attrmap;
  /* Bars take their fill colors from map ID1 */
  vbarparm category=age response=weight / group=sex groupdisplay=cluster
           attrid=ID1 name="bar";
  /* Zero-thickness lines leave only the markers; symbols come from map ID2 */
  series x=age y=weight / lineattrs=(thickness=0) group=sex name="scat"
         groupms=age msattrid=ID2 groupdisplay=cluster markers;
  keylegend "bar" / position=bottomleft;
  keylegend "scat" / position=bottomright type=markersymbol;
run;
First, I just want to be sure everyone is clear about the attribute structure. The ID column is used to specify the map in the data set. The VALUE column is used to specify the data values that are mapped to visual attributes in the attributes map. These two columns are REQUIRED. The visual attribute columns can be specified as needed.
For your case, the solution to your problem depends on the plot type you are binding to the attributes map. That is because some plot overlays can bind to more than one attributes map, but most cannot. Can you give us more information about the plot itself?
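As a minimal sketch of the structure described above, the example below builds an attribute map with only the two required columns (ID and VALUE) plus one visual attribute column, and binds it to a bar chart. The data set name demo_attrmap and the map ID SEXMAP are made up for illustration, and sashelp.class is used only because it is readily available.

```sas
/* Hypothetical attribute map: the data set name and map ID are illustrative. */
data demo_attrmap;
  length ID $ 8 value $ 8 fillcolor $ 9;
  input ID $ value $ fillcolor $;
  datalines;
SEXMAP F lightpink
SEXMAP M lightblue
;
run;

/* ATTRID= selects the map by its ID; the VALUE rows are matched
   against the values of the GROUP= variable. */
proc sgplot data=sashelp.class dattrmap=demo_attrmap;
  vbar age / response=weight stat=mean group=sex groupdisplay=cluster
       attrid=SEXMAP;
run;
```

Because the map is selected by ID, several maps can live in one data set, each bound to a different plot statement.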
Apologies for the confusion. Of course, Id was meant to be VALUE, and all of these rows are preset with ID=ID1.
We have a double waterfall plot in which each subject has two bars, one for each value, colored blue or red (cat2 has 2 values).
On top of these bars, we need to show scatter points. These are different values for the same subjects, each with its corresponding cat2 and one of 5 possible cat1 values.