Hi,
Could you please give me some hints on how to generate this kind of density estimate graph? Which statements and options?
Thank you so much!
For your project what would the different "baseline" bottom of the density represent?
A density plot uses the height of the curve as its response, so the baseline is assumed to be zero: a height of zero means no observations at that value of the variable on the X axis.
So a basic density plot made with the DENSITY statement alone may not produce what you want.
SGPanel can easily make multiple panels with similar density curves but would not show any of the vertical overlap of your example. Here is an example with the SASHELP.CARS data set you should have available.
proc sgpanel data=sashelp.cars;
   panelby origin / columns=1;
   density horsepower / type=kernel;
run;
Here the variable on the Panelby statement creates a separate graph for each level. The option Columns=1 means that the graphs are all stacked vertically.
Thank you very much. The baseline is zero for all three groups. If I understand you correctly, SAS can't overlap the three graphs vertically.
SAS can create graphs like this, but the SG procedures do not have a simple option that creates this plot automatically. Instead, you should consider a data visualization that uses PROC SGPANEL to display each plot in a separate row in a panel of graphs.
Here is an example that uses the previous SGPANEL code to create a data set for such a graph, followed by the DATA step that adds the variables to plot:
proc sgpanel data=sashelp.cars;
   panelby origin / columns=1;
   density horsepower / type=kernel;
   /* save the data behind the panel, including the kernel curve coordinates */
   ods output sgpanel=work.paneldata;
run;

data work.replot;
   set work.paneldata;
   /* shift each group's curve upward; LOW is the shifted baseline for the band */
   if origin='Asia' then do;
      KERNEL_HORSEPOWER____Y=KERNEL_HORSEPOWER____Y+0.008;
      low=0.008;
   end;
   else if origin='Europe' then do;
      KERNEL_HORSEPOWER____Y=KERNEL_HORSEPOWER____Y+0.004;
      low=0.004;
   end;
   else if origin='USA' then do;
      /* no shift for the front-most group */
      low=0;
   end;
run;

proc sgplot data=work.replot;
   /* filled area between the shifted baseline and the shifted curve */
   band x=KERNEL_HORSEPOWER____X upper=KERNEL_HORSEPOWER____Y lower=low / group=origin;
   /* optional outline along the top of each band */
   series x=KERNEL_HORSEPOWER____X y=KERNEL_HORSEPOWER____Y / group=origin;
run;
The vertical offsets added to the Y values in the SGPANEL output were chosen by examining the result and picking "reasonable" values that show an overlap. To add color we use a BAND plot whose upper value is the shifted Y coordinate and whose lower value is the offset itself, which serves as that group's baseline.
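If you would rather not hard-code a separate offset in each branch, one possible variation is to derive every offset from a single spacing value. This is only a sketch: the macro variable STEP and the variable RANK are names I made up for illustration, and 0.004 is just the spacing that looked reasonable for this data.

```sas
%let step = 0.004;   /* hypothetical spacing between baselines; tune by eye */

proc sgpanel data=sashelp.cars;
   panelby origin / columns=1;
   density horsepower / type=kernel;
   ods output sgpanel=work.paneldata;
run;

data work.replot;
   set work.paneldata;
   /* RANK is a made-up helper: the highest rank gets the largest offset */
   select (origin);
      when ('Asia')   rank = 2;
      when ('Europe') rank = 1;
      when ('USA')    rank = 0;
      otherwise       rank = .;
   end;
   /* leave rows with an unexpected ORIGIN value untouched */
   if not missing(rank) then do;
      low = rank * &step;
      KERNEL_HORSEPOWER____Y = KERNEL_HORSEPOWER____Y + low;
   end;
run;

proc sgplot data=work.replot;
   band x=KERNEL_HORSEPOWER____X upper=KERNEL_HORSEPOWER____Y lower=low / group=origin;
   series x=KERNEL_HORSEPOWER____X y=KERNEL_HORSEPOWER____Y / group=origin;
run;
```

Changing the value of STEP then moves all of the baselines together, which makes it quicker to experiment with how much the curves overlap.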
Warning: The first group value plotted should have the largest Y offset. Groups are drawn in the order of the group variable's values, so the "first" band ends up underneath the others. If the offsets are not assigned that way, the result will look odd. By default the fill colors are solid. You can set a transparency value with the FILLATTRS= option, but depending on what your actual data means that may not be terribly useful.
The series statement may not be needed but can be used to provide a darker line for the upper boundary of the Band plot area.
I used the default variable names from the SGPANEL output for clarity. Colors will depend upon your active style or any STYLEATTRS overrides in the SGPLOT code.
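As a rough illustration of those styling options, the plot step could look something like the following. It assumes WORK.REPLOT already exists from the DATA step earlier in this thread. The colors, the 0.4 transparency, the legend name "fill" and the axis label are all arbitrary choices of mine, and the TRANSPARENCY= suboption of FILLATTRS= may require a recent SAS 9.4 maintenance release; the TRANSPARENCY= option on the BAND statement itself is an alternative.

```sas
/* Assumes WORK.REPLOT was created by the DATA step shown earlier in the thread */
proc sgplot data=work.replot;
   /* hypothetical colors: band fills use DATACOLORS=, series lines use DATACONTRASTCOLORS= */
   styleattrs datacolors=(cx66c2a5 cxfc8d62 cx8da0cb)
              datacontrastcolors=(cx1b7837 cxb2182b cx2166ac);
   /* semi-transparent fill so the curves behind remain visible */
   band x=KERNEL_HORSEPOWER____X upper=KERNEL_HORSEPOWER____Y lower=low /
        group=origin fillattrs=(transparency=0.4) name="fill";
   series x=KERNEL_HORSEPOWER____X y=KERNEL_HORSEPOWER____Y / group=origin;
   /* show one legend (for the bands) instead of separate band and series entries */
   keylegend "fill";
   /* the Y values include the offsets, so the tick values are hidden */
   yaxis display=(noticks novalues) label="Density (offset by group)";
run;
```

Hiding the Y axis values is optional, but because each curve has been shifted the axis values no longer read as true densities for the upper groups.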
Thank you so much! You are amazing!