Hi there,
I am really struggling with one problem. I know that the LSMEANS statement gives the adjusted means for a given effect. Now I am using PROC QUANTREG to estimate quantiles instead of means. I wonder how to calculate the adjusted quantile, for instance the predicted median, for a given effect. Does anyone have an idea? Thank you very much!
Data example
index_time | age | race | cost | weight |
-3 | ||||
-2 | ||||
-1 | ||||
0 | ||||
1 | ||||
-3 | ||||
-2 | ||||
-1 | ||||
0 | ||||
1 | ||||
2 | ||||
3 | ||||
-3 | ||||
-2 | ||||
-1 | ||||
0 | ||||
1 | ||||
2 | ||||
… |
SAS code:
proc quantreg data=test;
class index_time age race;
model cost = index_time age race / quantile=.5;
weight weight;
run;
I want the adjusted median for each level of index_time, adjusting for age and race. How can I get the adjusted median?
Kane
I believe the syntax is exactly the same as for the other regression procedures in SAS that support the ESTIMATE statement. Post some sample data and the model that you are using and someone can help.
If it is useful, I recently wrote about how to use the ESTIMATE statement to compute the difference in medians between two subgroups.
Hi Rick,
Actually, I do not want to compare the difference in medians. I want the adjusted median, analogous to what the LSMEANS statement gives in the ordinary regression procedures. However, it seems that PROC QUANTREG does not have an LSMEANS statement. Do you have any idea?
Thanks.
LSMEANS stands for least squares means, whereas quantile regression minimizes the check loss (which is piecewise linear) rather than the squared loss. In this sense, LSMEANS does not directly apply to quantile regression.
However, the goal of least squares means is to estimate the marginal means for a balanced population. Similarly, we can estimate balanced quantile effects by (1) balancing the data and (2) fitting a quantile regression model on the balanced data.
(Mimicking the example from http://dawg.utk.edu/glossary/g_least_squares_means.htm):
"Suppose you have a treatment applied to 3 trees (experimental unit), and 2 observations (samples) are collected on each. However, one observation is missing, giving values of (45, 36), (56, ), and (37, 41), where parentheses are around each tree. The raw average is simply (45+36+56+37+41)/5 = 43, and note the reduced influence of the second tree since it has fewer values. The least squares mean would be based on a model u + T + S(T), resulting in an average of the tree averages, as follows.
Least squares mean =[ (45+36)/2 + 56 + (37+41)/2 ] / 3 = 45.17 This more accurately reflects the average of the 3 trees, and is less affected by the missing value."
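The arithmetic in the quoted example can be checked with a short SAS sketch. The data set names (trees, tree_means) and variable names (tree, value, tree_mean) are made up for illustration; only the five observations come from the quote. The first PROC MEANS step gives the raw average and the last one gives the average of the per-tree averages.

```sas
/* Tree example from the quoted glossary entry:
   tree 2 has only one observation (the other is missing) */
data trees;
   input tree value @@;
   datalines;
1 45 1 36 2 56 3 37 3 41
;
run;

/* Raw average: every observation gets equal weight */
proc means data=trees mean maxdec=2;
   var value;
run;

/* Per-tree averages, one output row per tree */
proc means data=trees noprint nway;
   class tree;
   var value;
   output out=tree_means mean=tree_mean;
run;

/* Average of the tree averages: every tree gets equal weight */
proc means data=tree_means mean maxdec=2;
   var tree_mean;
run;
```

The two printed means should match the 43 and 45.17 given in the quote.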
For quantile regression, the balanced data could be (45, 36), (56, 56), and (37, 41), where the second 56 is added to the second tree only for balance. Then we can compute balanced quantiles on this balanced data. However, this data-balancing method is still an open question and can be very difficult in more complicated cases.
The current QUANTREG procedure does not provide this functionality. I hope that some researchers will publish a paper that solves this problem.
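As a rough illustration of the balancing idea for the tree example only, the sketch below enters the balanced data, with the single observation of tree 2 duplicated, and asks PROC MEANS for the median. The data set name trees_balanced and the variable names are hypothetical, and duplicating an observation is just the ad hoc device described above, not an established method.

```sas
/* Balanced version of the tree data: the lone value 56
   for tree 2 is entered twice so each tree has two rows */
data trees_balanced;
   input tree value @@;
   datalines;
1 45 1 36 2 56 2 56 3 37 3 41
;
run;

/* Median of the balanced data */
proc means data=trees_balanced median;
   var value;
run;
```

Worked by hand, the median of the five original values (36, 37, 41, 45, 56) is 41, while the balanced data (36, 37, 41, 45, 56, 56) has a median of 43, the midpoint of 41 and 45 under the default quantile definition in PROC MEANS.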
Thank you for your reply. Sorry to hear that the QUANTREG procedure does not provide any function to calculate the adjusted median.
Based on your reply, I wonder: is the data-balancing method you described for quantile regression the only way to do it? Is there any other data-balancing method? I am just curious.
Thanks,
What is the research question that you are trying to answer?
For instance, I want to know the adjusted median cost for each index_time after adjusting for all other covariates.
Another analogue of LSMEANS for quantile regression is simply to compute L'b(t), as described in
http://v8doc.sas.com/sashtml/stat/chap30/sect39.htm#glmlsm
where L is the coefficient vector defined on that doc page, b(t) is the vector of quantile regression parameter estimates, and t is the quantile level.
It can be interpreted as the predicted quantile at the regressor values given by L.
This can be implemented by using the ESTIMATE statement with the L vector in the QUANTREG procedure.
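A minimal sketch of what such ESTIMATE statements might look like for the model posted earlier in the thread. The number of levels is an assumption, since the posted data do not show them: index_time is taken to have the seven levels -3 to 3, and age and race are each taken to have two levels. Following the LS-means construction, the L vector puts 1 on the intercept, 1 on the index_time level of interest, and equal weights (1/k for k levels) on the levels of the other CLASS effects.

```sas
proc quantreg data=test;
   class index_time age race;
   model cost = index_time age race / quantile=0.5;
   weight weight;

   /* Assumed levels (hypothetical): index_time = -3..3 (7 levels),
      age and race with 2 levels each, so each gets weight 1/2.
      With k levels, use 1/k for every level of that effect. */
   estimate 'Adj. median, index_time=-3'
            intercept 1
            index_time 1 0 0 0 0 0 0
            age 0.5 0.5
            race 0.5 0.5 / cl e;

   estimate 'Adj. median, index_time=0'
            intercept 1
            index_time 0 0 0 1 0 0 0
            age 0.5 0.5
            race 0.5 0.5 / cl e;
run;
```

The coefficients must follow the level order shown in the "Class Level Information" table, so check that table before writing the L vector. The E option prints the L coefficients that were actually used, which makes a mismatch easy to spot, and CL requests confidence limits for each estimate.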