Solved: Re: PROC SURVEYREG - Box Plot

_maldini_ · Posted 01-16-2023 01:23 PM

Is there a way to produce a box plot using PROC SURVEYREG?

I'm using an LSMEANS statement in PROC SURVEYREG to calculate adjusted means for the various levels of the categorical variable in the CLASS statement (reg_use = never, non-regular use, regular use). I'd like to see these adjusted means in a box plot.

PROC SURVEYMEANS produces a box plot, but its means are not adjusted for other variables (the model below adjusts for age_yrs and sex).

proc surveyreg data=&dataset;
    STRATUM sdmvstra;
    CLUSTER sdmvpsu;
    WEIGHT &weight;

    class reg_use (ref="Never") sex;

    model htn = reg_use age_yrs sex / solution clparm;
    lsmeans reg_use / pdiff=all adjust=tukey;

    format
        htn      htn_fmt.
        reg_use  reg_usefmt.
        sex      sexfmt.
    ;
run;

Thanks.

Rick_SAS · Posted 01-17-2023 11:45 AM

The doc for the LSMEANS statement indicates that it can produce several kinds of plots. I think the most appropriate plot is the MEANPLOT, which shows the estimated mean and 95% CI for each category on the LSMEANS statement:

lsmeans reg_use / pdiff=all adjust=tukey plots=(meanplot);

_maldini_ · Posted 01-17-2023 01:28 PM

Thank you.

2 additional questions:

1. Is there a way to control the order of the levels/groups of the variable used in the LSMEANS statement? I set the reference category in the CLASS statement, but that level/group is last in the plot. I'd like it to be first.
2. Do you know of a resource that I could use to better understand how to use the SAS documentation? That might sound silly, but I struggle to make sense of the documentation sometimes.

Rick_SAS · Posted 01-17-2023 01:57 PM

The LSMEANS statement uses the same order as the CLASS statement, so look at the documentation of the CLASS statement.

By default, the levels of a classification variable are listed alphabetically by their formatted value (order=formatted). The most useful way to change the order is to sort or otherwise arrange the input data set in the order you want, and then use the ORDER=DATA option on the CLASS statement to specify that the order of levels is found in the data set. For your example, sort by STRATUM and then by the YEAR variable. If that doesn't fix your problem, there are other tricks/techniques we can use.
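
A minimal sketch of the sort-then-ORDER=DATA idea, not a tested solution. It reuses the variable and macro names from the original post; the output data set name `sorted` is hypothetical. It also assumes that PROC SURVEYREG accepts ORDER= on the PROC statement rather than on the CLASS statement, and it omits REF= because a REF= level may be positioned separately from the sort order.

```sas
/* Sketch only: "sorted" is a hypothetical work data set name. */
proc sort data=&dataset out=sorted;
    by year;                      /* arrange rows in the desired level order */
run;

/* Assumption: ORDER=DATA is given on the PROC statement, so class levels
   follow their order of first appearance in the sorted input data. */
proc surveyreg data=sorted order=data;
    stratum sdmvstra;
    cluster sdmvpsu;
    weight &weight;

    class year sex;               /* no REF= here; see the note above */
    model htn = year age_yrs sex / solution clparm;
    lsmeans year / plots=(meanplot);
run;
```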

Regarding how to use the doc, my approach to learning a new procedure is always to start with the "Overview" and "Getting Started" sections of the doc, then browse the syntax section. I leave the "Details" section for when I need to know the math/statistics behind the computation.

Also, know that there is a section of the SAS/STAT doc devoted to "Shared Concepts." Most procedures link to that section if they support one of the shared statements (like LSMEANS).

_maldini_ · Posted 01-17-2023 11:59 PM

< If that doesn't fix your problem, there are other tricks/techniques we can use.>

It didn't, but thanks for the suggestion.

It looks like SAS just puts whatever value I put in the REF= option in the CLASS statement last, in both the LSMEANS table and the plot.

I can just change the reference level to the last value of year (i.e., REF= "2017-2018") instead of the first (i.e., REF= "2009-2010"). That solves the problem. I don't think the reference level matters in this particular situation since I am really just looking for the adjusted means.

CLASS year (ref="2017-2018") sex ;

PaigeMiller · Posted 01-17-2023 02:05 PM

@_maldini_ wrote:

Is there a way to control the order of the levels/groups of the variable used in the LSMEANS statement? I set the reference category in the CLASS statement, but that level/group is last in the plot. I'd like it to be first.

Assuming that YEAR is a numeric variable with a format (is it?) then ORDER=INTERNAL in the PROC SURVEYREG statement produces the proper sorting.

--
Paige Miller

_maldini_ · Posted 01-17-2023 11:47 PM

Yes, it is, but that didn't change the output.

proc format;
    value yearfmt
         6 = "2009-2010"
         7 = "2011-2012"
         8 = "2013-2014"
         9 = "2015-2016"
        10 = "2017-2018";
run;

proc surveyreg data=&dataset order=internal;
    STRATUM sdmvstra;
    CLUSTER sdmvpsu;
    WEIGHT &weight;

    /* HTN Prevalence by year */
    class year (ref="2009-2010") sex;
    /* class sex; */

    model htn = year age_yrs sex / solution clparm;
    lsmeans year / pdiff=all adjust=tukey plots=(meanplot);

    format
        htn             htn_fmt.
        race_dum_white
        race_dum_hisp
        race_dum_other  dummy_fmt.
        reg_use         reg_usefmt.
        sex             sexfmt.
        year            yearfmt.
    ;
run;

FreelanceReinh · Posted 01-17-2023 05:07 PM

@_maldini_ wrote:

Is there a way to control the order of the levels/groups of the variable used in the LSMEANS statement? I set the reference category in the CLASS statement, but that level/group is last in the plot. I'd like it to be first.

Hello @_maldini_,

Using the data from Example 122.8 Comparing Domain Statistics with a formatted numeric class variable added, the only simple way I found to change the tick mark position of the reference category was to use the ASCENDING or DESCENDING option of the MEANPLOT request. But then the order of tick marks depends on the LS mean values, which you don't want (e.g., "ASCENDING" would swap '2011-2012' and '2013-2014' in your example).

Two solutions did work for my test data, but were more complicated:

1. Creating an output dataset containing the mean plot data
```
ods output MeanPlot=mpdat;
```
(this statement goes before or into the PROC SURVEYREG step) and then using PROC SGPLOT data=mpdat ... (with a SCATTER statement) to create the plot. PROC SGPLOT has an XAXIS statement where you can specify the order of tick marks with the VALUES= option. However, more options would need to be added in order to reproduce the original mean plot from PROC SURVEYREG.

2. Modifying the ODS graphics template used by PROC SURVEYREG: I copied the PROC TEMPLATE code of Stat.Graphics.MeanPlot (path in the Templates window: Sashelp.Tmplstat\Stat\SurveyReg\Graphics\MeanPlot) from the template browser into the Enhanced Editor, inserted
```
discreteopts=(tickvaluelist=(list of quoted tick values))
```
into the XAXISOPTS= option of the LAYOUT OVERLAY statement, submitted this modified PROC TEMPLATE code and then the PROC SURVEYREG step.
However, modifying ODS templates can be risky (see Re: PROC PSMATCH: Is there a way to change the x-axis label ... also for more details).
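
A rough sketch of the first approach, with one substitution that should be read as an assumption: instead of the MeanPlot data object (whose column names are not shown in this thread), it captures the LSMeans table and assumes that table has the columns Estimate, Lower, and Upper once the CL option is requested. The data set name `lsm` is hypothetical. Sorting by year puts the reference level back in chronological position, and DISCRETEORDER=DATA tells PROC SGPLOT to keep that row order on the axis.

```sas
/* Sketch only: "lsm" is a hypothetical data set name.
   Assumption: the LSMeans ODS table contains Estimate, Lower and Upper
   when CL is specified on the LSMEANS statement. */
ods output LSMeans=lsm;

proc surveyreg data=&dataset;
    stratum sdmvstra;
    cluster sdmvpsu;
    weight &weight;

    class year (ref="2009-2010") sex;
    model htn = year age_yrs sex / solution clparm;
    lsmeans year / cl;            /* CL requests confidence limits */
    format year yearfmt.;
run;

/* Reorder the rows so the reference level is no longer last. */
proc sort data=lsm;
    by year;
run;

proc sgplot data=lsm;
    /* Point estimate with error bars for the confidence limits */
    scatter x=year y=Estimate / yerrorlower=Lower yerrorupper=Upper;
    xaxis type=discrete discreteorder=data;   /* keep the sorted row order */
    yaxis label="Adjusted mean of htn";
run;
```

With the year labels used in this thread, sorting gives chronological order whether year is stored in `lsm` as a formatted character value or as the underlying number; for other labels, an explicit VALUES= list on the XAXIS statement may be needed instead.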
