Hi All,
I have an issue where the box plot is not spreading out the data points.
Please let me know if you have any suggestions.
The goal is to spread the points horizontally within each box instead of having them overlap.
data example;
input subjid $ parcat2 $ parcat2n aval;
datalines;
SUBJID PARCAT2 PARCAT2N AVAL
002 E 5 23.94
002 A 1 26.33
002 D 4 26.31
002 B 2 23.99
002 C 3 25.91
003 E 5 25.32
003 A 1 26.46
003 D 4 25.83
003 B 2 27.34
003 C 3 24.94
004 A 1 26.12
004 B 2 24.19
004 E 5 23.65
004 C 3 25.55
004 D 4 25.16
008 D 4 26.89
008 E 5 27.84
008 C 3 27.26
008 A 1 27.84
008 B 2 27.36
009 B 2 28.72
009 C 3 28.12
009 A 1 28.73
009 D 4 29.58
009 E 5 29.64
010 D 4 27.58
010 E 5 26.61
010 C 3 26.29
010 A 1 28.36
010 B 2 27.32
011 C 3 23.03
011 D 4 21.77
011 B 2 22.76
011 E 5 23.35
011 A 1 24.55
012 A 1 30.75
012 B 2 30.95
012 E 5 32.1
012 C 3 31.39
012 D 4 31.9
013 E 5 30.45
013 A 1 30.87
013 D 4 32.17
013 B 2 30.15
013 C 3 31.65
014 D 4 24.66
014 E 5 24.15
014 C 3 25.15
014 A 1 25.52
014 B 2 24.79
015 A 1 28.94
015 B 2 29.96
015 E 5 29.95
015 C 3 28.66
015 D 4 27.13
016 B 2 30.56
016 C 3 29.4
016 A 1 31.16
016 D 4 30.04
016 E 5 30.62
017 D 4 28.08
017 E 5 26.49
017 C 3 28.54
017 A 1 27.94
017 B 2 27.96
018 B 2 31.48
018 C 3 31.7
018 A 1 32.24
018 D 4 31.49
018 E 5 31.32
019 D 4 28.91
019 E 5 27.85
019 C 3 28.35
019 A 1 28.91
019 B 2 29.21
021 A 1 25.7
021 B 2 27.7
021 E 5 26.25
021 C 3 25.93
021 D 4 26.48
023 C 3 27.12
023 D 4 27.52
023 B 2 27.32
023 E 5 26.43
023 A 1 27.91
024 A 1 29.19
024 B 2 28.31
024 E 5 28.4
024 C 3 29.99
024 D 4 28.81
025 C 3 30.94
025 D 4 31.61
025 B 2 31.61
025 E 5 30.88
025 A 1 31.47
027 C 3 29.87
027 D 4 29.55
027 B 2 28.88
027 E 5 29.58
027 A 1 29.62
028 C 3 30.12
028 D 4 28.7
028 B 2 28.98
028 E 5 28.19
028 A 1 29.32
030 B 2 31.32
030 C 3 30.79
030 A 1 30.96
030 D 4 30.5
030 E 5 30.77
031 E 5 30.64
031 A 1 30
031 D 4 29.74
031 B 2 30.2
031 C 3 30.23
032 E 5 26.61
032 A 1 26.26
032 D 4 26.66
032 B 2 27.46
032 C 3 27.91
036 B 2 26.85
036 C 3 28.25
036 A 1 26.97
036 D 4 27.93
036 E 5 28.34
037 B 2 33.96
037 C 3 34.95
037 A 1 33.63
037 D 4 34.35
037 E 5 34.78
038 E 5 27.13
038 A 1 26.31
038 D 4 26.73
038 B 2 25.68
038 C 3 26.53
039 B 2 29.28
039 C 3 28.33
039 A 1 28.98
039 D 4 28.35
039 E 5 27.66
041 D 4 26.02
041 E 5 25.79
041 C 3 26.1
041 A 1 25.71
041 B 2 25.42
044 D 4 31.84
044 E 5 32.19
044 C 3 30.72
044 A 1 29.9
044 B 2 31.22
045 B 2 26.42
045 C 3 24.93
045 A 1 26.79
045 D 4 25.54
045 E 5 27.6
046 A 1 28.63
046 B 2 29.1
046 E 5 28.46
046 C 3 28.58
046 D 4 28.08
;
run;
* the data is overlapped here ;
proc sgplot data=example;
vbox aval / category=parcat2 group=parcat2 /*discreteoffset=0.4 boxwidth=0.3 */groupdisplay=cluster clusterwidth=0.8 lineattrs=(pattern=solid THICKNESS=2) nofill
meanattrs=(symbol=plus color=black size=20);
scatter x=parcat2 y=aval / group=parcat2 groupdisplay=cluster clusterwidth=0.8 transparency=0.5;
styleattrs datasymbols=(circlefilled);
yaxis label='Mean' grid valuesformat=f8.1 offsetmax=0.1;
xaxis offsetmax=0.08 fitpolicy=rotate Label="" ;
keylegend / title="Treatment" ;
run;
data analysis_random ;
set example ;
shift=parcat2n+rannor(0)/10;
run ;
/* trying to fix the issue, but the x axis is messed up */
proc sgplot data=analysis_random;
vbox aval / category=parcat2n group=parcat2n /*discreteoffset=0.4 boxwidth=0.3 */groupdisplay=cluster clusterwidth=0.8 lineattrs=(pattern=solid THICKNESS=2) nofill
meanattrs=(symbol=plus color=black size=20);
scatter x=shift y=aval / group=parcat2n groupdisplay=cluster clusterwidth=0.8 transparency=0.5;
styleattrs datasymbols=(circlefilled);
yaxis label='Mean' grid valuesformat=f8.1 offsetmax=0.1;
xaxis offsetmax=0.08 fitpolicy=rotate Label="" ;
keylegend / title="Treatment" ;
run;
You can get a bit of what you want easily with the JITTER options.
proc sgplot data=example;
  vbox aval / category=parcat2 group=parcat2 /*discreteoffset=0.4 boxwidth=0.3 */
       groupdisplay=cluster clusterwidth=0.8
       lineattrs=(pattern=solid THICKNESS=2) nofill
       meanattrs=(symbol=plus color=black size=20);
  scatter x=parcat2 y=aval / group=parcat2 groupdisplay=cluster clusterwidth=0.8
       transparency=0.5 jitter jitterwidth=3;
  styleattrs datasymbols=(circlefilled);
  yaxis label='Mean' grid valuesformat=f8.1 offsetmax=0.1;
  xaxis offsetmax=0.08 fitpolicy=rotate Label="";
  keylegend / title="Treatment";
run;
However, that only affects values that are identical, not ones that are just "close".
Part of your issue is the use of a character value like E for the X axis. It is really hard to get E +/- 0.5, for example.
You could change your "parcat2" variable to something that is actually numeric, so that 2 +/- 0.5 would mean something. Use a custom format (1='A', 2='B', etc.) to display the letters as needed.
Then use that format to get the XAXIS to display as desired, but plot the SCATTER against an X2AXIS and suppress any appearance of that axis.
proc format;
  value parcat2n 1='A' 2='B' 3='C' 4='D' 5='E';
run;
proc sgplot data=analysis_random;
  vbox aval / category=parcat2n group=parcat2n /*discreteoffset=0.4 boxwidth=0.3 */
       groupdisplay=cluster clusterwidth=0.8
       lineattrs=(pattern=solid THICKNESS=2) nofill
       meanattrs=(symbol=plus color=black size=20);
  scatter x=shift y=aval / group=parcat2n groupdisplay=cluster clusterwidth=0.8
       transparency=0.5 x2axis;
  styleattrs datasymbols=(circlefilled);
  yaxis label='Mean' grid valuesformat=f8.1 offsetmax=0.1;
  xaxis offsetmax=0.08 fitpolicy=rotate Label="";
  x2axis display=none;
  keylegend / title="Treatment";
  format parcat2n parcat2n.;
run;
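One optional tweak to the jitter step, offered only as a sketch: the original data step uses rannor(0), which takes its seed from the clock, so the points land in different places on every run, and normal noise has no hard limit on how far a point can drift. Assuming a bounded, repeatable offset is acceptable, something like the following could be used instead. The seed value and the 0.2 half-width are arbitrary choices for illustration, not values from the original post.

```sas
/* Sketch only: bounded, repeatable horizontal jitter.              */
/* The seed (12345) and the half-width (0.2) are assumed values.    */
data analysis_random;
  set example;
  call streaminit(12345);                            /* fixed seed: same offsets on every run */
  shift = parcat2n + (rand('uniform') - 0.5) * 0.4;  /* offset stays within +/- 0.2 */
run;
```

Because the offset is capped, each point stays closer to its own category position than to a neighboring one.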
Please post code, especially for data steps, into a code or text box opened with the "running man" or </> icons that appear above the message lines.
Also, your data step has data errors because the first row of "data" is the variable names.
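As a small illustration of that last point: with the header row left in the DATALINES block, SAS tries to read PARCAT2N and AVAL as numbers, writes invalid-data notes to the log, and keeps a stray observation with missing numeric values. A minimal sketch of the corrected step simply drops that line. Only the first five data rows are shown here; the remaining rows would follow unchanged.

```sas
/* Sketch: same DATA step, with the header row removed from DATALINES. */
data example;
  input subjid $ parcat2 $ parcat2n aval;
  datalines;
002 E 5 23.94
002 A 1 26.33
002 D 4 26.31
002 B 2 23.99
002 C 3 25.91
;
run;
```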