uzma03505621
Obsidian | Level 7

Hi everyone, 

I am using proc reg for the analysis of my study data.

Dependent variable = MCS score

Independent variables = cat2 cat3 age income

where cat2 and cat3 are both categorical variables. My reference group is category 1. I have created dummy variables.

My code is as follows:

ods graphics on;
proc reg data=dummyfinal plots(maxpoints=none);
model mcs42=cat2 cat3;
output out=new P=YHAT RSTUDENT=RESID L95M=LOW U95M=HIGH;
run;
quit;
ods graphics off;

 

After getting the predicted value (YHAT) of the dependent variable, I have to obtain the mean MCS scores across the 3 categories (cat1, cat2, cat3) along with the confidence intervals, and do multiple comparison tests (e.g., Tukey-Kramer).

Can anyone please help me with the SAS code I should run to obtain the following results?

My results should look like this: 

Means and SE
                          MCS
                          Mean (SE)          p value
cat1 (reference group)    40.45 (0.94) ∗∗∗   <0.001
cat2                      43.76 (0.71) ∗∗∗   <0.001
cat3                      46.96 (0.78) ∗∗∗   <0.001
   
PaigeMiller
Diamond | Level 26

Some things really aren't clear. You say age and income are independent variables, but they are not in your model. You also mention a reference category of category 1, even though you haven't put category 1 into the model; I assume cat1, cat2, and cat3 are three levels of a single variable.

 

So anyway, here is how to handle a categorical variable with 3 levels, which I have named CAT.

 

To get means in this case, you can use PROC GLM, and you don't have to create the dummy variables yourself.

 

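proc glm data=dummyfinal;
   class cat(ref='1');   /* CAT is the 3-level variable; level 1 is the reference */
   model mcs42=cat;
   means cat/t;          /* group means with pairwise t tests */
run;
quit;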
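
If the goal is regression-adjusted means with Tukey-Kramer comparisons, one possible extension of the code above is the LSMEANS statement. This is only a sketch: it assumes the three-level variable is named CAT (as above) and that AGE and INCOME are numeric covariates in DUMMYFINAL.

```sas
proc glm data=dummyfinal;
   class cat(ref='1');
   model mcs42 = cat age income / solution;
   /* Adjusted (least-squares) means evaluated at the average age and
      income, with standard errors, confidence limits, and
      Tukey-Kramer adjusted pairwise differences */
   lsmeans cat / stderr cl pdiff adjust=tukey;
run;
quit;
```

Dropping AGE and INCOME from the MODEL statement should give the unadjusted version of the same table.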

 

--
Paige Miller
uzma03505621
Obsidian | Level 7
I'm sorry for the typing error. The correct code is:

ods graphics on;
proc reg data=dummyfinal plots(maxpoints=none);
model mcs42=cat2 cat3 age income;
output out=new P=YHAT RSTUDENT=RESID L95M=LOW U95M=HIGH;
run;
quit;
ods graphics off;

My original variable is called category, with three values (1, 2, 3). I created dummy variables as follows:
if category=2 then cat2=1; else cat2=0;
if category=3 then cat3=1; else cat3=0;
So when I run the PROC REG program, category=1 will be used as the reference by default? (Please correct me if I am wrong.)

Thanks for the PROC GLM code; I have used it before. I have been told to use PROC REG only, which is why I created dummy variables. I need help with code to compare the new regression-adjusted MCS means (with confidence intervals) between these 3 categories.

I appreciate your time and consideration.

Thank you.
PaigeMiller
Diamond | Level 26

I don't know how to get the means that you are asking for using PROC REG only. As you can see, it's very easy to get the means from PROC GLM.
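
One possible way to stay within PROC REG, offered as a sketch rather than a tested recipe: append one scoring row per category with the response set to missing and the covariates held at their means. PROC REG does not use rows with a missing dependent variable when fitting, but the OUTPUT statement still computes predictions for them, so P=, STDP=, L95M= and U95M= give an adjusted mean, its standard error, and its confidence limits for each category. The data set names COVMEANS, SCORE, WITHSCORE and PRED and the flag SCOREROW are made up for illustration; CATEGORY, CAT2, CAT3, AGE and INCOME come from the posts in this thread.

```sas
/* Average age and income, used to hold the covariates at a common value */
proc means data=dummyfinal noprint;
   var age income;
   output out=covmeans mean=age income;
run;

/* One scoring row per category; mcs42 is missing, so these rows
   do not influence the fit but still receive predictions */
data score;
   set covmeans(keep=age income);
   do category = 1 to 3;
      cat2 = (category = 2);
      cat3 = (category = 3);
      mcs42 = .;
      output;
   end;
run;

/* Stack the real data and the scoring rows, flagging the latter */
data withscore;
   set dummyfinal score(in=s);
   scorerow = s;
run;

proc reg data=withscore plots=none;
   model mcs42 = cat2 cat3 age income;
   output out=pred p=adjmean stdp=se l95m=low u95m=high;
run;
quit;

/* Show only the three scoring rows */
proc print data=pred noobs;
   where scorerow = 1;
   var category adjmean se low high;
run;
```

Two caveats. PROC MEANS averages each covariate over its own non-missing values, which can differ slightly from the averages over the observations actually used in the fit if some rows have missing data. And PROC REG has no Tukey-Kramer adjustment, so this gives the means and their confidence limits but not adjusted multiple comparisons.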

--
Paige Miller
uzma03505621
Obsidian | Level 7

I used dummy coding:

 

data dummyfinal;
set finalfile;
if category=2 then cat2=1; else cat2=0;
if category=3 then cat3=1; else cat3=0;

run;

 

Then I did proc reg (unadjusted model without income and age) as follows:

/*unadjusted model*/
ods graphics on;
proc reg data=dummyfinal plots(maxpoints=none);
model mcs42=cat2 cat3;
output out=new P=YHAT RSTUDENT=RESID L95M=LOW U95M=HIGH;
run;
quit;
ods graphics off;

 

My results look like this:

Parameter Estimates
Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept    1             41.48793          0.26104    158.93     <.0001
cat2                     1             -4.81016          0.43795    -10.98     <.0001
cat3                     1             11.55753          0.28816     40.11     <.0001
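
With this dummy coding and no other covariates, the intercept is the mean of category 1 and each coefficient is the difference between that category's mean and the category 1 mean. The unadjusted group means can therefore be read off the estimates above: 41.48793 for category 1, 41.48793 - 4.81016 = 36.67777 for category 2, and 41.48793 + 11.55753 = 53.04546 for category 3. A small sketch of the same arithmetic (the data set name GROUPMEANS and the variable names are illustrative only):

```sas
data groupmeans;
   b0 = 41.48793;   /* intercept: mean for category 1 */
   b2 = -4.81016;   /* cat2 coefficient: category 2 minus category 1 */
   b3 = 11.55753;   /* cat3 coefficient: category 3 minus category 1 */
   mean_cat1 = b0;
   mean_cat2 = b0 + b2;
   mean_cat3 = b0 + b3;
   put mean_cat1= mean_cat2= mean_cat3=;   /* written to the log */
run;
```

Note that the standard errors printed for cat2 and cat3 belong to the differences, not to the group means; only the intercept's standard error is the standard error of a group mean (category 1).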

Now YHAT is my predicted dependent variable. I want to test whether the means of the predicted dependent variable differ significantly across the three levels of my category (independent) variable.

Should I run an ANOVA plus post hoc tests on this predicted YHAT, or is there another method to get output like the table below?

Unadjusted means and SE
                         PCS                          MCS
                         Mean (SE)          p value   Mean (SE)          p value
Category 1 (reference)   37.27 (0.96) ∗∗∗   <0.001    40.45 (0.94) ∗∗∗   <0.001
Category 2               37.02 (0.97) ∗∗∗   <0.001    43.76 (0.71) ∗∗∗   <0.001
Category 3               38.38 (1.04)       0.016     46.96 (0.78) ∗∗∗   <0.001
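
On the question of running an ANOVA on YHAT: in the unadjusted model every observation in a category gets the same predicted value (the category mean), so YHAT has no within-group variation and an ANOVA on it would not be a valid test. The comparisons are already in the regression itself. The cat2 and cat3 rows of the Parameter Estimates table test category 2 and category 3 against category 1, and the remaining pair can be tested with a TEST statement. A sketch, with the label CAT3_VS_CAT2 chosen just for illustration; these p values are not Tukey-Kramer adjusted:

```sas
proc reg data=dummyfinal plots=none;
   model mcs42 = cat2 cat3;
   /* Category 3 versus category 2: are the two coefficients equal? */
   Cat3_vs_Cat2: test cat3 - cat2 = 0;
run;
quit;
```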
   

 

PaigeMiller
Diamond | Level 26

I think the real problem here is that whoever told you PROC REG has to be used gave you bad advice, when PROC GLM makes this simple.

 

Nevertheless, I still don't know how to do this with PROC REG. Specifically, I'm not sure how you would get the CORRECT standard errors, so I cannot advise further if PROC REG has to be used.

--
Paige Miller
