☑ This topic is solved.
Heejeong
Obsidian | Level 7

Hello,

 

I am running a simple proc glm with one continuous predictor. I wanted to calculate the estimates of Norepinephrine at 1 SD below the mean, mean, and 1 SD above the mean of the continuous predictor (centered value of stress reactivity). 

 

proc glm data=JH.Final3;
model Norepinephrine = cPAreactpost;
estimate 'Centered Stress reactivity (1-SD)' intercept 1 cPAreactpost -0.0627096;
estimate 'Centered Stress reactivity (Mean)' intercept 1 cPAreactpost 0;
estimate 'Centered Stress reactivity (1+SD)' intercept 1 cPAreactpost 0.0627096;
run;

I have 2 quick questions:

1) Could you please confirm that the estimate syntax is correctly written for me to compare the predicted values of Norepinephrine at 1-SD, mean, and 1+SD of the predictor variables? I see the following output when I run this syntax. 

Heejeong_1-1651994472660.png

 

2) If I wanted to see whether these three estimates were significantly different from each other, how would I write the syntax? I have been using both ESTIMATE and CONTRAST syntax for categorical variables but wasn't sure how I would write a CONTRAST statement when I have a continuous predictor. 

 

Thank you so much in advance for all your help!

 

All the Best,

Joanna

PaigeMiller
Diamond | Level 26

Let me throw the question back to you. If the mean is 17.626 and 1 SD is 0.0627096, then what should the value of Mean – 1SD be? Does that equal the number in the output?

 

To get confidence intervals for the mean, you would use the MEANS statement in PROC GLM. Of course, confidence intervals do not use 1SD. If you really want 1SD, you should modify the ALPHA= option in the MEANS statement.

--
Paige Miller
sbxkoenk
SAS Super FREQ

Hello @Heejeong ,

 

As stated by @PaigeMiller , you do not need ESTIMATE nor CONTRAST statements for this.

This is most easily done using the MEANS or LSMEANS statement (and/or the SLICE or LSMESTIMATE statement).

 

But for the sake of writing good ESTIMATE and CONTRAST statements, I put some references to blogs here that can serve as an inspiration for you and others :

 

Usage Note 67024: Using the ESTIMATE or CONTRAST statement or Margins macro to assess continuous variable effects in interactions and splines

 

The magical ESTIMATE (and CONTRAST) statements

By Chris Daman on SAS Learning Post April 23, 2012

https://blogs.sas.com/content/sastraining/2012/04/23/the-magical-estimate-and-contrast-statements/

 

"Easy button" for ESTIMATE statements

By Chris Daman on SAS Learning Post April 25, 2012

https://blogs.sas.com/content/sastraining/2012/04/25/easy-button-for-estimate-statements/

 

ESTIMATE Statements - the final installment

By Chris Daman on SAS Learning Post May 2, 2012

https://blogs.sas.com/content/sastraining/2012/05/02/estimate-statements-the-final-installment/

 

How to write CONTRAST and ESTIMATE statements in SAS regression procedures?

By Rick Wicklin on The DO Loop June 6, 2016

https://blogs.sas.com/content/iml/2016/06/06/write-contrast-estimate-statements-sas-regression-proce...

 

Usage Note 24447: Examples of writing CONTRAST and ESTIMATE statements

https://support.sas.com/kb/24/447.html

 

BR,

Koen

Heejeong
Obsidian | Level 7

Thank you so much for your fast response to my question @PaigeMiller!

 

And I apologize, but I couldn't quite understand what you were trying to convey with your question back to me. If I follow, I would calculate 17.626 - 0.0627096, and the answer is that the output does not equal this calculation?

 

And thank you so much for directing me to the Means Statement! My only concern is that my predictor is a "continuous variable" so when I tried to run the means statement, I received an error message ("ERROR: Only CLASS variables allowed in this effect") that a continuous variable is not allowed.

 

And thank you, @sbxkoenk, for all your helpful links on writing ESTIMATE and CONTRAST statements; I am looking forward to reviewing them! I have the same problem as above when trying to use the MEANS, LSMEANS, SLICE, or LSMESTIMATE statements: since my predictor is a continuous variable, I am unable to use these helpful statements.

 

In another analysis, I did use the Proc Rank syntax to create quartiles of my predictor and ran a proc glm with a categorical predictor variable (the same variable as above), which I think is a more straightforward analysis. However, my result is not significant when my predictor is transformed into a categorical variable so I had to stick with my method of running the main analysis with a continuous predictor first then following up with additional analyses at different levels of the predictor variable.

 

Thank you both SO much for your help and support, I really appreciate it!

 

PaigeMiller
Diamond | Level 26

Sorry, I didn't notice your predictor is a continuous variable.

 

You can use these methods to get the predicted value at any X value (including mean(X), mean(X) + 1SD, and mean(X) – 1SD).

https://blogs.sas.com/content/iml/2014/02/17/the-missing-value-trick-for-scoring-a-regression-model....
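For readers who want to see the idea in code, here is a minimal sketch of the missing value trick applied to the model in this thread. It assumes the predictor mean is zero and the SD is 0.0627 (the values quoted in the question); the data set names score, combined, and pred and the flag variable scoreRow are hypothetical names chosen for illustration.

```sas
/* Scoring rows: response is missing, predictor is set to the values of interest. */
/* Assumes mean = 0 and SD = 0.0627, as quoted in the question.                    */
data score;
  do cPAreactpost = -0.0627, 0, 0.0627;
    Norepinephrine = .;
    output;
  end;
run;

/* Append the scoring rows to the analysis data and flag them */
data combined;
  set JH.Final3 score(in=s);
  scoreRow = s;
run;

/* Rows with a missing response are not used in fitting, but they still receive */
/* predicted values and confidence limits in the OUTPUT data set                */
proc glm data=combined;
  model Norepinephrine = cPAreactpost;
  output out=pred predicted=Pred lclm=Lower uclm=Upper;
run; quit;

proc print data=pred noobs;
  where scoreRow = 1;
  var cPAreactpost Pred Lower Upper;
run;
```

The printed rows should match what ESTIMATE statements of the form `intercept 1 cPAreactpost x` report for the same x values, since both evaluate the fitted line at x.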

--
Paige Miller
sbxkoenk
SAS Super FREQ

Sorry, @Heejeong , I hadn't read your original post thoroughly enough either.

 

In fact, you are doing a simple linear regression.

You can also do that with PROC REG.

 

There are also two further ways to score new observations after fitting with PROC REG (on top of the missing value trick already suggested by @PaigeMiller): PROC SCORE and PROC PLM.
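As a rough illustration of the PROC PLM route, the sketch below stores the fitted model and scores three new predictor values. It uses PROC GLM (as in the original question) rather than PROC REG, assumes a mean of zero and an SD of 0.0627, and the names RegModel, ScoreData, and Pred are hypothetical.

```sas
/* Fit the model and save it in an item store (the name RegModel is arbitrary) */
proc glm data=JH.Final3;
  model Norepinephrine = cPAreactpost;
  store RegModel;
run; quit;

/* Hypothetical scoring data: the three predictor values of interest */
data ScoreData;
  do cPAreactpost = -0.0627, 0, 0.0627;
    output;
  end;
run;

/* Score the new observations from the stored model */
proc plm restore=RegModel noinfo;
  score data=ScoreData out=Pred predicted=Pred lclm=Lower uclm=Upper;
run;

proc print data=Pred noobs;
run;
```

One advantage of this approach is that the original data do not need to be touched or re-read once the item store exists.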

 

Also, instead of using PROC RANK to categorize / discretize your continuous predictor into equal-sized groups, there are methods that optimize the location of the boundaries between the groups such that the predictive-ness of the resulting discretized predictor is maximal.
Weights of Evidence (WOE) is such a method. You can also do it with a decision tree (allowing for one split and multiple branches).

 

Koen

Heejeong
Obsidian | Level 7

Hi @sbxkoenk,

 

Thank you for continuing to share so many helpful links with me! Also, I'm finding the Weights of Evidence (WOE) method to be extremely helpful too, so THANK YOU!

 

 

StatDave
SAS Super FREQ

Do not categorize your predictor if that is not needed. Doing so only throws away information that can be used in the model. Note that your model assumes that the effect of the predictor on the response mean is linear.

 

Similarly to what is shown in the first part of this note, this can be easily done in the Margins macro whether the predictor is centered or not.

 

    proc means data=JH.Final3;
      var cPAreactpost; 
      output out=out mean=mean std=sd;
      run;
    data mdat;
      set out;
      keep cPAreactpost;
      do cPAreactpost = mean-sd, mean, mean+sd;
        output;
      end;
      run;
    %Margins(data=JH.Final3, response=Norepinephrine, model=cPAreactpost,
             margins=cPAreactpost, margindata=mdat,
             options=diff reverse cl)

Alternatively, if the ESTIMATE statement is to be used, then as discussed in this note, you can write your hypotheses in terms of the model to determine the coefficients for the ESTIMATE or CONTRAST statement. Following that process, and assuming that the predictor mean and standard deviation are really zero and 0.0627, then that produces the coefficients as below. As detailed in the note, the coefficients for the difference are determined by subtracting the coefficient vectors for the two means to be compared.

 

 

proc glm data=JH.Final3;
model Norepinephrine=cPAreactpost;
estimate "mean" intercept 1 cPAreactpost 0;
estimate "mean-sd" intercept 1 cPAreactpost -0.0627;
estimate "mean+sd" intercept 1 cPAreactpost 0.0627;
estimate "1 sd change" cPAreactpost 0.0627;
estimate "2 sd change" cPAreactpost 0.1254;
run; quit;
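The subtraction of coefficient vectors can be written out for this one-predictor model. This is only a sketch: μ(x) denotes the model mean at predictor value x and s the predictor's standard deviation (0.0627 here), with the predictor mean taken as zero; the notation is introduced for illustration and is not taken from the note.

```latex
\begin{align*}
\mu(x) &= \beta_0 + \beta_1 x
  && \Rightarrow\ \text{coefficients } (\text{intercept},\ \texttt{cPAreactpost}) = (1,\ x) \\
\mu(0) &: (1,\ 0), \qquad \mu(-s) : (1,\ -s), \qquad \mu(+s) : (1,\ +s) \\
\mu(+s) - \mu(0) &= \beta_1 s
  && \Rightarrow\ (1,\ s) - (1,\ 0) = (0,\ s) \\
\mu(+s) - \mu(-s) &= 2\beta_1 s
  && \Rightarrow\ (1,\ s) - (1,\ -s) = (0,\ 2s)
\end{align*}
```

With s = 0.0627, the last two rows give the `cPAreactpost 0.0627` and `cPAreactpost 0.1254` coefficients used in the "1 sd change" and "2 sd change" statements. Because μ(0) − μ(−s) is also β₁s, the "1 sd change" statement covers both mean versus mean+sd and mean−sd versus mean. Both differences are multiples of the slope β₁, so their t tests give the same p-value as the test of the slope itself.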

 

 

 

Heejeong
Obsidian | Level 7

Hello @StatDave, wow, thank you so much for your helpful response and for taking the time to write out the syntax for me!

Sometimes, I will understand the concept of things but can't get the syntax to run or don't feel 100% confident that I got it right, so I am really grateful for your time and help! I'm sorry for another set of questions but I promise that these will be the last ones. I am learning so much and so thankful for this opportunity, so thank you so much again! And if you need any clarification, please let me know!

 

I had a few follow-up questions:

1) When I run the PROC MEANS syntax that you wrote for me, below is the output I see. Although the mean reads as "1.3591194E-8," I wanted to confirm with you that it's safe to use "0" as the mean value for my ESTIMATE statements. 

Heejeong_0-1652033338525.png

2) I actually had a long list of covariates that I didn't show before to simplify my model. But now that I've found an answer to my original question, I just wanted to confirm with you that  I can add my long list of covariates in both the MACRO and ESTIMATE models? Below is my syntax where I've added in my covariates to both the MACRO & ESTIMATE statements:

 proc means data=JH.Final3;
      var cPAreactpost; 
      output out=out mean=mean std=sd;
      run;
    data mdat;
      set out;
      keep cPAreactpost;
      do cPAreactpost = mean-sd, mean, mean+sd;
        output;
      end;
      run;
    %Margins(data=JH.Final3, response=Norepinephrine, model=cage ClinicSex marriedmidus work race_orig  cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism  cChronCondNumb  cAnyStressWide_sum cpa_mlm2 cPAreactpost,
             margins=cPAreactpost, margindata=mdat,
             options=diff reverse cl)

proc glm data=JH.Final3;
model Norepinephrine=cage ClinicSex marriedmidus work race_orig cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism cChronCondNumb cAnyStressWide_sum cpa_mlm2 cPAreactpost;
estimate "mean" intercept 1 cPAreactpost 0;
estimate "mean-sd" intercept 1 cPAreactpost -0.0627;
estimate "mean+sd" intercept 1 cPAreactpost 0.0627;
estimate "1 sd change" cPAreactpost 0.0627;
estimate "2 sd change" cPAreactpost 0.1254;
run; quit;

3) In the Margins syntax, there isn't a PROC GLM or a PROC REG, and I was wondering if I was supposed to run a PROC GLM first then run the Macro? Or does the below part of the MARGIN syntax replace the PROC GLM syntax?

response=Norepinephrine, model=cage ClinicSex marriedmidus work race_orig  cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism  cChronCondNumb  cAnyStressWide_sum cpa_mlm2 cPAreactpost,

4) Below are my outputs for the MACRO and ESTIMATE commands. My initial understanding is that the Predictive Margins (from the MACRO) and Estimate Parameters (from the ESTIMATE) should generate the same values. But I got a bit confused because they are different. Could you please let me know why that might be and which numbers I should be using when I want to discuss how different levels of stress reactivity are differentially related to levels of Norepinephrine? I am currently using the below syntax for graphing my results, so I'm assuming that I should be using the results from the ESTIMATE command?

proc glm data=JH.Final3;
model Norepinephrine=cage ClinicSex marriedmidus work race_orig cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism cChronCondNumb cAnyStressWide_sum cpa_mlm2 cPAreactpost;
estimate "mean" intercept 1 cPAreactpost 0;
estimate "mean-sd" intercept 1 cPAreactpost -0.0627;
estimate "mean+sd" intercept 1 cPAreactpost 0.0627;
estimate "1 sd change" cPAreactpost 0.0627;
estimate "2 sd change" cPAreactpost 0.1254;
store graph5;
run; quit;
proc plm restore=graph5 noinfo;
effectplot fit (x=cPAreactpost);
run;

MACRO

Heejeong_1-1652034098207.png

ESTIMATE

Heejeong_2-1652034108869.png

5) Lastly, I just wanted to confirm that "1 sd change" estimate is testing the difference between "1-SD estimate value" and "mean." Also that "2 sd change" estimate command is comparing "1-SD estimate value" and "1+SD estimate value." So if I wanted to compare the estimates of "mean" and "1+SD estimate value," what syntax should I be using?

 

StatDave
SAS Super FREQ
Yes, that 1E-8 value is effectively zero. And yes, you can add covariates in MODEL= in the Margins macro call as you did. As noted in the documentation of the macro, it fits the model (using PROC GENMOD) as well as estimates the margins, so you don't need to fit the model beforehand.

The values from the macro and from the ESTIMATE statements differ because the ESTIMATE statements use zero values for all of your covariates (add the E option to display the values), while the macro uses the actual observed values in each observation in its computations. That is, the macro by default does not set each covariate at a fixed value. This is one advantage of using margins. The values from the first three ESTIMATE statements are estimated means at the mean, mean-sd, and mean+sd of the predictor when all of the covariates are fixed at zero. The values from the macro don't assume the covariates are fixed.

And yes, the "1 (and 2) sd change" estimates are the changes from the mean, again with the covariates fixed at zero. The 1 sd change applies equally to a comparison of the mean to +1sd or the mean to -1sd since the predictor is assumed to have a linear effect.
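To make the distinction concrete, here is a sketch for the linear (normal, identity-link) model used in this thread. The symbols are introduced for illustration only: z_i is the covariate vector of observation i, γ its coefficients, and x the value at which the predictor is fixed. It assumes the margin is the average of the predictions over the n observations used in the fit.

```latex
% Model: y_i = \beta_0 + z_i^\top \gamma + \beta_1 x_i + \varepsilon_i
\begin{align*}
\text{ESTIMATE at } x &: \quad \hat\beta_0 + \hat\beta_1 x
   && \text{(all covariates set to 0)} \\
\text{Predictive margin at } x &: \quad
   \frac{1}{n}\sum_{i=1}^{n}\bigl(\hat\beta_0 + z_i^\top\hat\gamma + \hat\beta_1 x\bigr)
   = \hat\beta_0 + \bar z^\top\hat\gamma + \hat\beta_1 x \\
\text{Margin} - \text{ESTIMATE} &: \quad \bar z^\top\hat\gamma
   && \text{(does not depend on } x\text{)}
\end{align*}
```

Under these assumptions the two sets of values differ by a constant shift that is nonzero whenever some covariate has a nonzero mean (for example, an uncentered indicator variable). The differences between predictor levels equal β̂₁ times the change in x in both approaches, so the "1 sd change" and "2 sd change" comparisons should agree.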
Heejeong
Obsidian | Level 7

Thank you so much for yet another extremely helpful response, @StatDave 

 

I promise this is the last question. It's great to know that the MARGINS MACRO statement generates more accurate values with actual values of covariates (and not 0 values of covariates). I'm going to use the values from the MARGINS MACRO in the final manuscript but struggling to find a way to graph the predicted margins values.

 

I thoroughly read through the two documents that you shared with me and I only see a way to graph the estimates using the PROC PLM method. Would there be a way for me to graph the PREDICTED MARGINS from the MACRO MARGINS STATEMENT? 

 

Thank you so much for your help!!

 

Heejeong
Obsidian | Level 7

In case it might be helpful, below is the syntax I have been using to plot the model at SD-1, Mean, SD+1 of the continuous variable. Would this already be plotting the Predicted Margins?

 

proc glm data=JH.Final3;
model Norepinephrine = cage ClinicSex marriedmidus work race_orig cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism cChronCondNumb cAnyStressWide_sum cpa_mlm2 cPAreactpost;
store graph4;
run; quit;
proc plm restore=graph4 noinfo;
effectplot fit (x=cPAreactpost) / clm;
ods output fitplot=Logfit;
run;

proc sgplot data=Logfit noautolegend;
band upper=_uclm lower=_lclm x=_xcont1 / transparency=.3;
series y=_predicted x=_xcont1;
xaxis values=(-0.0627 0 0.0627) grid offsetmin=.05 offsetmax=.05
label="Positive Affective Responsivity";
yaxis values=(10 to 40  by 5) grid offsetmin=.05 offsetmax=.05 
label="Norepinephrine";
title "Effects of Positive Affective Responsivity on Norepinephrine";
run;

Thank you so much, @StatDave !!

StatDave
SAS Super FREQ
Yes. The Margins macro automatically creates the _MARGINS data set that contains the margin estimates and their confidence limits in variables ESTIMATE, LOWER, and UPPER. So, you can use similar PROC SGPLOT code to produce the desired plot.

Note that the Margins macro does not produce "more accurate" estimates than the ESTIMATE statement - it just computes predictive margins rather than a particular linear combination of model parameters which is basically what the ESTIMATE statement does. So, they compute different things.
Heejeong
Obsidian | Level 7

Thank you so much!

Heejeong_0-1652045454971.png

 

 

Would the below syntax be the correct way to write the code? When I run this code, I get a figure (which I've copy-pasted below), but I have a weird shape of the confidence interval between the MEAN and 1+SD of the predictor variable. Could this be something wrong with my laptop pixels or would it actually be something that's going on in the data that I would have to address? 

 proc means data=JH.Final3;
      var cPAreactpost; 
      output out=out mean=mean std=sd;
      run;
    data mdat;
      set out;
      keep cPAreactpost;
      do cPAreactpost = mean-sd, mean, mean+sd;
        output;
      end;
      run;
    %Margins(data=JH.Final3, response=B4BNOCRE, model=cage ClinicSex marriedmidus work race_orig  cedu cHHtotalIncome EverSmokeReg Exercise20mins CNSmeds cBMI cCESD cNeuroticism  cChronCondNumb  cAnyStressWide_sum cpa_mlm2 cPAreactpost,
             margins=cPAreactpost, margindata=mdat,
             options=diff reverse cl)

proc sgplot data=_margins noautolegend;
band upper=UPPER lower=LOWER x=cPAreactpost ;
series y=ESTIMATE x=cPAreactpost;
xaxis values=(-0.0627 0 0.0627) grid offsetmin=.05 offsetmax=.05
label="Positive Affective Responsivity";
yaxis values=(10 to 40  by 5) grid offsetmin=.05 offsetmax=.05 
label="Norepinephrine";
title "Effects of Positive Affective Responsivity on Norepinephrine";
run;

 

StatDave
SAS Super FREQ
Remove the VALUES= option in the XAXIS statement. The actual values are computed values that will not be exactly equal to the rough values that you entered in that option.
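A possible revised plot step, offered only as a sketch: it assumes the _MARGINS data set has the variables cPAreactpost, ESTIMATE, LOWER, and UPPER as used earlier in the thread. The sort step and the data set name margins_sorted are additions for illustration, so that the band and line are drawn in order of the predictor.

```sas
/* Sort by the predictor; margins_sorted is a hypothetical name */
proc sort data=_margins out=margins_sorted;
  by cPAreactpost;
run;

proc sgplot data=margins_sorted noautolegend;
  band upper=UPPER lower=LOWER x=cPAreactpost / transparency=.3;
  series y=ESTIMATE x=cPAreactpost / markers;
  /* No VALUES= on the XAXIS: ticks are chosen from the computed values */
  xaxis grid offsetmin=.05 offsetmax=.05
        label="Positive Affective Responsivity";
  yaxis values=(10 to 40 by 5) grid offsetmin=.05 offsetmax=.05
        label="Norepinephrine";
  title "Effects of Positive Affective Responsivity on Norepinephrine";
run;
```

The MARKERS option makes the three margin points visible, which may help when checking that the band is anchored at mean-sd, mean, and mean+sd.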
