For an example, see http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_kde_examples...
The memory problem might be caused by using too large a value for NGRID=. If you double that number, SAS uses four times as much memory and computation. Try dialing back that number to NGRID=150 or 200, which is still a very fine grid.
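To see why doubling NGRID= quadruples the work, it may help to count grid cells. The DATA step below is only an arithmetic sketch: it assumes the same NGRID= value is used for both variables, so the bivariate grid has NGRID x NGRID points. The data set name grid_cost and its variables are made up for illustration.

```sas
/* Sketch: number of bivariate grid cells for several NGRID= values. */
/* Assumes the same NGRID= is used in both coordinate directions.    */
data grid_cost;
   do ngrid = 150, 300, 600;
      cells = ngrid**2;            /* grid points in the X-by-Y grid */
      ratio = cells / 150**2;      /* cost relative to NGRID=150     */
      output;
   end;
run;

proc print data=grid_cost noobs;
run;
```

The cell counts are 22,500, 90,000, and 360,000, so each doubling multiplies the grid size by four.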
If that doesn't fix the problem, please post your complete PROC KDE code.
> there is no method available to plot it like image.plot in R, right?
PROC KDE provides the PLOTS= option which can plot a surface plot or contour plot of the density estimate. If you need something fancier, you can also output the KDE and use SAS graphics to visualize the estimate. For example, see
How to create a surface plot in SAS
How to create a contour plot in SAS
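As a rough sketch of the "output the KDE and plot it yourself" route, the following writes the density estimate to a data set and draws a filled contour plot with the Graph Template Language. It assumes your data set is named data with variables X, Y, and FREQ (as in your code), and that the OUT= data set stores the grid coordinates in value1 and value2 and the estimate in density. The names KDEOut and ContourPlotParm are placeholders.

```sas
/* Write the bivariate density estimate to a data set */
proc kde data=data;
   bivar X (ngrid=150) Y (ngrid=150) / out=KDEOut;
   freq FREQ;
run;

/* Define a filled contour plot for gridded (x, y, z) data */
proc template;
   define statgraph ContourPlotParm;
      dynamic _X _Y _Z;
      begingraph;
         layout overlay;
            contourplotparm x=_X y=_Y z=_Z /
               contourtype=fill nhint=10 name="kde";
            continuouslegend "kde" / title="Density";
         endlayout;
      endgraph;
   end;
run;

/* Render the KDE output with the template */
proc sgrender data=KDEOut template=ContourPlotParm;
   dynamic _X="value1" _Y="value2" _Z="density";
run;
```

Run PROC CONTENTS on KDEOut first if you want to confirm the variable names in your release.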
Let me refer to the manual with regard to bivariate bandwidth selection:
For the bivariate case, Wand and Jones (1993) note that automatic bandwidth selection is both difficult and computationally expensive. Their study of various ways of specifying a bandwidth matrix also shows that using two bandwidths, one in each coordinate’s direction, is often adequate. PROC KDE enables you to adjust the two bandwidths by specifying a multiplier for the default bandwidths recommended by Bowman and Foster (1993):
$$h_X = \hat{\sigma}_X \, n^{-1/6}, \qquad h_Y = \hat{\sigma}_Y \, n^{-1/6}$$
Here $\hat{\sigma}_X$ and $\hat{\sigma}_Y$ are the sample standard deviations of X and Y, respectively. These are the optimal bandwidths for two independent normal variables that have the same variances as X and Y. They are, therefore, conservative in the sense that they tend to oversmooth the surface.
The bandwidth calculation due to Bowman and Foster is performed internally by PROC KDE, and the initial bandwidths are set accordingly.
You can specify the BWM= option to adjust the aforementioned bandwidths to provide the appropriate amount of smoothing for your application.
The final bandwidth used for computing the KDE is the initial bandwidth times BWM. If you want more smoothing than the default, set BWM > 1.0. If you want less smoothing than the default, set BWM < 1.0.
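For instance, a minimal sketch of asking for half the default smoothing in each direction might look like the following. The data set and variable names are taken from your code, and the multiplier 0.5 is only an illustrative value, not a recommendation.

```sas
/* Sketch: use half the default bandwidth in each direction.        */
/* BWM= is a multiplier of the default bandwidth, not a bandwidth.  */
proc kde data=data;
   bivar X (bwm=0.5) Y (bwm=0.5) / plots=contour;
   freq FREQ;
run;
```

Omitting BWM= is the same as BWM=1, which gives the default bandwidths.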
Your IML code
* calculate bwm plug-in;
proc iml;
use data;
read all var {X Y FREQ} into xy;
n=sum(xy[,3]);
stdx=std(xy[,1])/(n**(1/6));
call symput("stdxg", char(stdx));
stdy=std(xy[,2])/(n**(1/6));
call symput("stdyg", char(stdy));
quit;
duplicates the internal initial bandwidth calculation. When you set BWM to &stdxg and &stdyg in
proc kde data=data;
bivar (X (bwm=&stdxg ngrid=&gridsize gridl=&minX gridu=&maxX )
Y (bwm=&stdyg ngrid=&gridsize gridl=&minY gridu=&maxY )) / plots=all;
freq FREQ;
run;
the final bandwidth used to calculate the KDE is the default bandwidth multiplied by itself, (std/n**(1/6))**2, so it scales with the variance in each dimension rather than the standard deviation. Is this what you want?
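To make the effect concrete, here is a small arithmetic sketch with made-up numbers (a standard deviation of 2 and a total frequency of 1000 are assumptions, not values from your data). It mimics what happens when the default bandwidth is passed in as the multiplier.

```sas
/* Sketch with hypothetical values: BWM= set to the default bandwidth */
data _null_;
   sigma   = 2;                   /* hypothetical sample std dev       */
   n       = 1000;                /* hypothetical total frequency      */
   h0      = sigma / n**(1/6);    /* default (initial) bandwidth       */
   bwm     = h0;                  /* value that &stdxg would hold      */
   h_final = h0 * bwm;            /* final = initial bandwidth * BWM   */
   put h0= bwm= h_final=;
run;
```

With these numbers the default bandwidth is about 0.632, but the final bandwidth is 0.4, which equals sigma**2 / n**(1/3). If you only want the Bowman and Foster bandwidths, omit BWM= (or set BWM=1).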