Hello all,
I have made a plot like the one below, and I want to highlight where the middle 95% is (approximately from -4 to +4 on the x axis).
Is that a feature in PROC SGPLOT?
Yes: Use the BAND statement. For an example and discussion, see "Create a density curve with shaded tails."
That article shades the tails, whereas you want to shade the interior. For your problem, the solution will look like this:
data Have;
input x y;
datalines;
-14 0.5
-12 0.6
-10 0.6
-8 0.8
-6 1.2
-4 2.2
-2 4.5
0 17
2 5.4
4 1.9
6 1.4
8 2.1
10 1.4
12 1.3
14 1.4
;
data Want;
set Have;
low = -4; /* lower value to shade */
high = 4; /* upper value to shade */
if x>=low and x <=high then
upper = y;
else
upper = .;
drop low high;
run;
proc sgplot data=Want;
band x=x upper=upper lower=0;
series x=x y=y/ datalabel markers;
run;
As discussed in the article, one way is to rescale the curve so that it becomes a density. If your data are equally spaced in X, you can get away with just rescaling the Y heights by dividing by the sum of heights. The following statements compute the (scaled) cumulative distribution and then shade the region that excludes the lower and upper alpha/2 proportion of the heights. That means it keeps the interior 1-alpha proportion:
/* compute sum and put it into macro variable */
proc sql;
select sum(y) into :sum
from Have;
quit;
%put &=sum;
/* create the cumulative density variable */
data Cumul;
set Have;
cumul + y / &sum;
run;
/* shade where the cumulative density is at least alpha/2
and at most 1-alpha/2 */
data Want;
alpha = 0.05; /* ask for 95% region */
set Cumul;
if cumul>=alpha/2 and cumul <=1-alpha/2 then
upper = y;
else
upper = .;
run;
/* graph it */
proc sgplot data=Want;
band x=x upper=upper lower=0;
series x=x y=y/ datalabel markers;
run;
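To check which X values end up in the shaded region, you can query the Want data set for the first and last observations whose UPPER value is not missing. This is only a small sketch that assumes the Want data set created above; the column names ShadeLow and ShadeHigh are made up for illustration:

```sas
/* report the X range of the shaded band (nonmissing UPPER values) */
proc sql;
   select min(x) as ShadeLow,    /* smallest X inside the band */
          max(x) as ShadeHigh    /* largest X inside the band  */
   from Want
   where upper is not missing;
quit;
```

If the reported range is wider or narrower than you expect, look at the CUMUL values in the Want data set. The band depends only on where the cumulative proportions cross alpha/2 and 1-alpha/2.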
Yes, sounds like you have a problem. My example handles negative X values without any problem, so be sure to study it carefully. The logic for setting up the points for the BAND statement depends only on the cumulative proportions, not on X.
Hi Rick,
Hmm, I cannot understand why this code gives me a plot looking like this. It is the proportion around 0 that I need to be shaded. Maybe you can help me out?
Try alpha=0.05 instead of alpha=0.5.
I don't know. However, I suggest that you avoid overwriting the ABC data set. If you run this code a second time, it will overwrite the ABC data, which will surely corrupt the results. Don't use
data ABC;
set CUMUL;
when CUMUL was generated from ABC.
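One way to avoid the overwrite is to give each step its own output data set. The following is a sketch, not your actual program: it assumes that ABC has variables named x and y, as in my example, and the names ABC_Cumul and ABC_Want are hypothetical:

```sas
/* ABC never appears on a DATA statement, so rerunning this code leaves it unchanged */
proc sql noprint;
   select sum(y) into :sum
   from ABC;
quit;

data ABC_Cumul;                  /* hypothetical name for the cumulative step */
   set ABC;
   cumul + y / &sum;
run;

data ABC_Want;                   /* hypothetical name for the plotting data */
   alpha = 0.05;                 /* ask for 95% region */
   set ABC_Cumul;
   if cumul >= alpha/2 and cumul <= 1 - alpha/2 then
      upper = y;
   else
      upper = .;
run;
```

You would then use data=ABC_Want on the PROC SGPLOT statement.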