I would like to make a probability plot in SAS just like the one above (created with MINITAB). It's important to have the 95% confidence bands and the same statistics in the inset.
Just in case, this is the data I used:
| Obs | Bimodal_Var |
| 1 | 11.18227 |
| 2 | 8.349733 |
| 3 | 14.40938 |
| 4 | 24.66227 |
| 5 | 9.279151 |
| 6 | 7.238792 |
| 7 | 15.46339 |
| 8 | 22.47347 |
| 9 | 15.132 |
| 10 | 21.10275 |
| 11 | 8.501284 |
| 12 | 15.77826 |
| 13 | 16.20075 |
| 14 | 17.67517 |
| 15 | 7.934397 |
| 16 | 18.39657 |
| 17 | 11.40699 |
| 18 | 8.399611 |
| 19 | 8.490487 |
| 20 | 22.70707 |
| 21 | 11.09486 |
| 22 | 11.00888 |
| 23 | 22.85953 |
| 24 | 6.783415 |
| 25 | 11.31967 |
| 26 | 25.19409 |
| 27 | 12.12261 |
| 28 | 10.3787 |
| 29 | 22.13491 |
| 30 | 20.40574 |
| 31 | 9.802019 |
| 32 | 15.45317 |
| 33 | 17.87584 |
| 34 | 10.8164 |
| 35 | 10.8377 |
| 36 | 11.4462 |
| 37 | 6.82155 |
| 38 | 10.80242 |
| 39 | 19.35702 |
| 40 | 17.50303 |
| 41 | 16.63369 |
| 42 | 21.50525 |
| 43 | 23.79174 |
| 44 | 10.77436 |
| 45 | 9.775864 |
| 46 | 10.62838 |
| 47 | 9.338187 |
| 48 | 8.149213 |
| 49 | 17.10799 |
| 50 | 26.5875 |
As so often happens here in the SAS Communities, @Rick_SAS has written a blog post on how to compute these confidence intervals for the percentiles: https://blogs.sas.com/content/iml/2013/05/06/compute-confidence-intervals-for-percentiles-in-sas.htm...
You would have to merge these confidence intervals with the normal probability plot in order to get the plot you want. I believe PROC SGPLOT will allow you to draw these points and lines, after you merge the data sets.
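To make the "merge and plot" idea concrete, here is a rough sketch rather than a finished solution. It assumes the data set HAVE with the variable Bimodal_Var from the question. The data set names STATS, SORTED and PLOTDATA are made up for illustration. The bands use a large-sample normal approximation for the standard error of an estimated percentile, which is not necessarily the method used in the blog post or by MINITAB, and the vertical axis shows normal quantiles rather than a percent scale.

```sas
/* Sketch only: normal probability plot with approximate 95% pointwise bands. */
/* STATS, SORTED and PLOTDATA are hypothetical names.                         */
proc means data=have noprint;
  var Bimodal_Var;
  output out=stats n=n mean=mean std=std;
run;

proc sort data=have out=sorted;
  by Bimodal_Var;
run;

data plotdata;
  if _n_ = 1 then set stats(keep=n mean std);  /* carry N, mean, std on every row */
  set sorted;
  p     = (_n_ - 0.3) / (n + 0.4);             /* median-rank plotting position (one common choice) */
  z     = quantile('NORMAL', p);               /* normal quantile for that position */
  fit   = mean + z*std;                        /* fitted percentile under normality */
  se    = std * sqrt(1/n + z**2/(2*n));        /* assumed large-sample standard error */
  lower = fit - 1.96*se;
  upper = fit + 1.96*se;
run;

proc sgplot data=plotdata noautolegend;
  series  x=fit   y=z;                                   /* fitted line */
  series  x=lower y=z / lineattrs=(pattern=dash);        /* lower band  */
  series  x=upper y=z / lineattrs=(pattern=dash);        /* upper band  */
  scatter x=Bimodal_Var y=z;                             /* observed data */
  xaxis label="Bimodal_Var";
  yaxis label="Normal quantile";
run;
```

Because the rows are sorted by Bimodal_Var, z increases down the data set, so the SERIES statements draw the line and bands in order.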
Thanks, @PaigeMiller , I'll try that.
On the other hand, I just found out that PROC RELIABILITY produces a nice normal probability plot with 95% confidence bands. You have to treat the data as uncensored. Basically,
ods graphics on;
ods select ProbabilityPlot;
proc reliability data=bimodal_data;
probplot Bimodal_Var;
run;
ods select all;
ods graphics off;
It creates this plot:
Without the NOINSET option:
That's not the inset I need. I have to figure out how to create an inset as in the original MINITAB image. That is, with the statistics (and corresponding text) Mean, StDev, N, AD, p-Value (where AD is the Anderson-Darling statistic for the normality test).
Do you want this?
data have;
input Obs Bimodal_Var;
cards;
1 11.18227
2 8.349733
3 14.40938
4 24.66227
5 9.279151
6 7.238792
7 15.46339
8 22.47347
9 15.132
10 21.10275
11 8.501284
12 15.77826
13 16.20075
14 17.67517
15 7.934397
16 18.39657
17 11.40699
18 8.399611
19 8.490487
20 22.70707
21 11.09486
22 11.00888
23 22.85953
24 6.783415
25 11.31967
26 25.19409
27 12.12261
28 10.3787
29 22.13491
30 20.40574
31 9.802019
32 15.45317
33 17.87584
34 10.8164
35 10.8377
36 11.4462
37 6.82155
38 10.80242
39 19.35702
40 17.50303
41 16.63369
42 21.50525
43 23.79174
44 10.77436
45 9.775864
46 10.62838
47 9.338187
48 8.149213
49 17.10799
50 26.5875
;
/*Get N Mean Std */
proc summary data=have ;
var Bimodal_Var;
output out=mean_std_n n=n mean=mean std=std;
run;
data _null_;
set mean_std_n;
call symputx('mean',put(mean,8.2 -l));
call symputx('std' ,put(std,8.3 -l));
call symputx('n' ,n);
run;
%put &=mean &=std;
/*Get AD statistic and P-Value*/
ods select none;
ods output TestsForNormality= TestsForNormality;
proc univariate data=have normal ;
var Bimodal_Var;
run;
ods select all;
data _null_;
set TestsForNormality(where=(Test='Anderson-Darling'));
call symputx('AD',put(Stat,8.3 -l));
call symputx('pvalue',cats(pSign,vvalue(pValue)));
run;
%put &=ad &=pvalue;
/*Get the normal probability table*/
ods graphics /reset=index noborder;
ods listing gpath="%sysfunc(pathname(work))" style=htmlblue;; *Save this plot into a path;
ods select ProbabilityPlot;
proc reliability data=have ;
probplot Bimodal_Var/NOINSET;
run;
ods select all;
%sganno
data sganno;
%SGIMAGE(IMAGE="%sysfunc(pathname(work))\ProbabilityPlot1.png",ANCHOR="topleft",BORDER="FALSE",DRAWSPACE="LAYOUTPERCENT" ,x1=-1,y1=101)
%SGTEXT(LABEL="Mean", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=10,Y1=90)
%SGTEXT(LABEL="&Mean", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=20,Y1=90)
%SGTEXT(LABEL="StDev", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=10,Y1=85)
%SGTEXT(LABEL="&std", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=20,Y1=85)
%SGTEXT(LABEL="N", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=10,Y1=80)
%SGTEXT(LABEL="&n", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=20,Y1=80)
%SGTEXT(LABEL="AD", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=10,Y1=75)
%SGTEXT(LABEL="&ad", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=20,Y1=75)
%SGTEXT(LABEL="P-Value", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=10,Y1=70)
%SGTEXT(LABEL="&pvalue", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" , LAYER= "FRONT",X1=20,Y1=70)
run;
ods graphics/reset noborder;
proc sgplot data=sashelp.class sganno=sganno;
scatter x=weight y=height;
run;
@Ksharp thanks again. Your solution works perfectly for a full-size graph. I have a follow-up question:
I want the final graph scaled down to fit in a layout region of "height=3 in width=4 in". (I found out the hard way that preimage doesn't resize.)
How would you change your code so that the final plot is resized proportionally to "ods graphics / height=3 in width=4 in", including the (proportionally resized) inset text?
OK. Here it is.
data have;
input Obs Bimodal_Var;
cards;
1 11.18227
2 8.349733
3 14.40938
4 24.66227
5 9.279151
6 7.238792
7 15.46339
8 22.47347
9 15.132
10 21.10275
11 8.501284
12 15.77826
13 16.20075
14 17.67517
15 7.934397
16 18.39657
17 11.40699
18 8.399611
19 8.490487
20 22.70707
21 11.09486
22 11.00888
23 22.85953
24 6.783415
25 11.31967
26 25.19409
27 12.12261
28 10.3787
29 22.13491
30 20.40574
31 9.802019
32 15.45317
33 17.87584
34 10.8164
35 10.8377
36 11.4462
37 6.82155
38 10.80242
39 19.35702
40 17.50303
41 16.63369
42 21.50525
43 23.79174
44 10.77436
45 9.775864
46 10.62838
47 9.338187
48 8.149213
49 17.10799
50 26.5875
;
/*Get N Mean Std */
proc summary data=have ;
var Bimodal_Var;
output out=mean_std_n n=n mean=mean std=std;
run;
data _null_;
set mean_std_n;
call symputx('mean',put(mean,8.2 -l));
call symputx('std' ,put(std,8.3 -l));
call symputx('n' ,n);
run;
%put &=mean &=std;
/*Get AD statistic and P-Value*/
ods select none;
ods output TestsForNormality= TestsForNormality;
proc univariate data=have normal ;
var Bimodal_Var;
run;
ods select all;
data _null_;
set TestsForNormality(where=(Test='Anderson-Darling'));
call symputx('AD',put(Stat,8.3 -l));
call symputx('pvalue',cats(pSign,vvalue(pValue)));
run;
%put &=ad &=pvalue;
/*Get the normal probability table*/
ods graphics /reset=index noborder height=3in width=4in;
ods listing gpath="%sysfunc(pathname(work))" style=htmlblue;; *Save this plot into a path;
ods select ProbabilityPlot;
proc reliability data=have ;
probplot Bimodal_Var/NOINSET;
run;
ods select all;
%sganno
data sganno;
%SGIMAGE(IMAGE="%sysfunc(pathname(work))\ProbabilityPlot1.png",ANCHOR="topleft",BORDER="FALSE",DRAWSPACE="LAYOUTPERCENT" ,x1=-1,y1=100,
WIDTH=103,
WIDTHUNIT="PERCENT",
HEIGHT=103,
HEIGHTUNIT="PERCENT"
)
%SGTEXT(LABEL="Mean", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8 , LAYER= "FRONT",X1=15,Y1=90)
%SGTEXT(LABEL="&Mean", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=25,Y1=90)
%SGTEXT(LABEL="StDev", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=15,Y1=85)
%SGTEXT(LABEL="&std", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=25,Y1=85)
%SGTEXT(LABEL="N", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=15,Y1=80)
%SGTEXT(LABEL="&n", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=25,Y1=80)
%SGTEXT(LABEL="AD", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=15,Y1=75)
%SGTEXT(LABEL="&ad", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=25,Y1=75)
%SGTEXT(LABEL="P-Value", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=15,Y1=70)
%SGTEXT(LABEL="&pvalue", BORDER= "FALSE",DRAWSPACE= "LAYOUTPERCENT" ,TEXTSIZE=8, LAYER= "FRONT",X1=25,Y1=70)
run;
ods graphics/reset noborder height=3in width=4in;
proc sgplot data=sashelp.class sganno=sganno noborder;
scatter x=weight y=height;
run;
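If the inset text needs to be retuned for other graph sizes, one option is to build the text rows in a loop so the size and positions live in one place. This is only a sketch: it assumes the macro variables &mean, &std, &n, &ad and &pvalue created by the code above, and TXTSIZE and INSET_TEXT are made-up names. It writes the annotation columns directly instead of calling %SGTEXT, so it produces only the ten text rows, not the image row.

```sas
/* Sketch: inset rows built in a loop instead of ten %SGTEXT calls.      */
/* TXTSIZE and INSET_TEXT are hypothetical names, not part of the reply. */
%let txtsize = 8;   /* point size for the inset text; adjust per graph size */

data inset_text;
  length function $8 drawspace $16 layer $8 border $8 label $40;
  retain function 'text' drawspace 'layoutpercent' layer 'front'
         border 'false' textsize &txtsize;
  array lab[5] $10 _temporary_ ('Mean' 'StDev' 'N' 'AD' 'P-Value');
  array val[5] $20 _temporary_ ("&mean" "&std" "&n" "&ad" "&pvalue");
  do i = 1 to 5;
    y1 = 95 - 5*i;                      /* rows at 90, 85, 80, 75, 70 percent */
    x1 = 15; label = lab[i]; output;    /* statistic name  */
    x1 = 25; label = val[i]; output;    /* statistic value */
  end;
  drop i;
run;
```

Changing TXTSIZE, or the 15/25 and 95 - 5*i positions, then adjusts all ten rows at once.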
@Ksharp PERFECT!!! Plus I learned a few tricks. Thanks!