TomHsiung
Pyrite | Level 9
proc reg data=work.d200;
   model average_daily_dose_during_the_in = age__y_ BSA gender_code af
         hypertension chf VAR94 AKI_for_T_test T_test___indication
         AF_and_Warfarin_History var33 var34
         / selection=stepwise SLE=0.05 SLS=0.20 vif clb cli clm;
run;

The CLM and CLI options output the confidence and prediction intervals after the regression, but the output is given for each individual observation. I want the overall confidence and prediction intervals for each group of observations. How can I get them?

 

Screen Shot 2018-01-09 at 11.44.27 PM.png

 

Tom

Reeza
Super User

I want to know the overall confidence and prediction intervals based on each group of observations. How?

 

 

What does that mean? What's a "group" to you? There's no clear logic in the code that defines what a group would be.

 

If you're looking for estimates at specific data points you can score your data using any of the techniques illustrated here:

https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html

TomHsiung
Pyrite | Level 9

Hi, Reeza

 

Sorry for the delay. My intention is to get the 95% CI and PI for pre-defined groups. In short, the response variable y is the average daily dose (mg), and the predictors include continuous quantitative variables such as age, body surface area, and serum albumin concentration, plus dummy (qualitative) variables such as whether congestive heart failure is present, whether a specific genotype is present, whether the hospitalization lasted longer than 14 days, etc.

Each qualitative variable divides the patients into two subgroups (condition present or absent). I want to know the 95% CI and PI for both subgroups. That is, when I look at the 95% CI and PI for the two groups defined by a specific dummy variable, the other predictors should be adjusted for (as linear regression inherently does).

I want to get the 95% CI and PI for both subgroups.

 

Tom

Reeza
Super User

Try the approach here:

https://blogs.sas.com/content/iml/2016/06/22/sas-effectplot-statement.html

 

It still sounds like you're scoring the data: once for each group, with the other variables set to their averages, I assume.

 

The post here will answer your question on how to score data.

https://blogs.sas.com/content/iml/2014/02/19/scoring-a-regression-model-in-sas.html
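A minimal sketch of the scoring idea described in this reply, under stated assumptions: it uses a reduced model with only three of the predictors from the original PROC REG call (no stepwise selection), takes chf as the grouping dummy, and holds the continuous covariates at their sample means. The data set names covmeans, score, combined, and pred and the flag scoreRow are hypothetical. The scoring rows have a missing response, so they do not influence the fit, but PROC REG still computes predicted values and limits for them.

```sas
/* Sketch only: reduced model; covmeans, score, combined, pred and
   scoreRow are hypothetical names, not from the original thread. */

/* 1. Sample means of the continuous covariates */
proc means data=work.d200 noprint;
   var age__y_ BSA;
   output out=covmeans mean=age__y_ BSA;
run;

/* 2. One scoring row per subgroup (chf=0 and chf=1), response missing */
data score;
   set covmeans(keep=age__y_ BSA);
   average_daily_dose_during_the_in = .;
   do chf = 0 to 1;
      output;
   end;
run;

/* 3. Append the scoring rows to the analysis data and flag them */
data combined;
   set work.d200 score(in=isScore);
   scoreRow = isScore;
run;

/* 4. Fit the model; keep the 95% limits for the scoring rows only */
proc reg data=combined plots=none;
   model average_daily_dose_during_the_in = age__y_ BSA chf;
   output out=pred(where=(scoreRow=1))
          p=yhat
          lclm=ci_lower uclm=ci_upper   /* CI for the mean response   */
          lcl=pi_lower  ucl=pi_upper;   /* PI for a single new patient */
run;
quit;

proc print data=pred noobs;
   var chf age__y_ BSA yhat ci_lower ci_upper pi_lower pi_upper;
run;
```

The confidence limits describe the adjusted mean dose in each subgroup. The prediction limits describe one new patient with those covariate values, not the subgroup as a whole.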

TomHsiung
Pyrite | Level 9

It's a little more difficult to understand.

 

Tom

Reeza
Super User

@TomHsiung wrote:

It's a little more difficult to understand.

 

Tom


And I don't know what that means...

TomHsiung
Pyrite | Level 9

I still have a lot of work to do. Stata has a command called "margins", but it seems that margins does not output confidence intervals of means by subgroup.

 

Tom

TomHsiung
Pyrite | Level 9

From references, I got the formulas to compute the 95% confidence and 95% prediction intervals, respectively. To compute the 95% CI and PI, "yhat" and the standard error of the estimate must be known first. Luckily, yhat and the standard error of the estimate can be estimated from the sample data set, and the distributions of the residuals (y - yhat) and of yhat are normal. As a consequence, the 95% CI and PI can be computed.

My issue is that my linear regression model has multiple predictors: several continuous quantitative variables mixed with several dummy (qualitative) variables. Each dummy variable is a factor that divides my observations into two groups (presence or absence of something). I just want to know the 95% CI and PI for these two groups. SAS University Edition outputs a 95% CI and PI for each observation, which is not what I want.

 

Tom

 

PS:

 

95% CI after (simple) linear regression:

 

Screen-Shot-2017-10-16-at-8.46.44-PM

 

And for 95% PI the formula is:

 

Screen-Shot-2017-10-16-at-9.47.24-PM
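The two screenshots are not reproduced in this copy of the thread. For reference, the standard textbook expressions for simple linear regression, which are presumably what they showed, are given here. The notation is an assumption: n observations, a new predictor value x_0, and the residual standard error s = sqrt(SSE / (n - 2)).

```latex
% 95% CI for the mean response at x_0 (alpha = 0.05)
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\,n-2}\; s\,
  \sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}}

% 95% PI for a single new observation at x_0
\hat{y}_0 \;\pm\; t_{1-\alpha/2,\,n-2}\; s\,
  \sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}}
```

The only difference is the extra 1 under the square root, which adds the variance of an individual observation around the regression line. That is why the PI is always wider than the CI.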

TomHsiung
Pyrite | Level 9

Or is there any method to estimate the 95% CI and PI for pre-defined values of the independent variables? For instance, I set the age to 60 years, the body surface area to 1.73 m², and the serum albumin level to 30 g/L.

 

Tom

Reeza
Super User

Yes, that's what scoring does. There are examples of several ways to do this in the blog post I initially linked to.

 


@TomHsiung wrote:

Or is there any method to estimate the 95% CI and PI for pre-defined values of the independent variables? For instance, I set the age to 60 years, the body surface area to 1.73 m², and the serum albumin level to 30 g/L.

 

Tom


 

TomHsiung
Pyrite | Level 9

Would you please provide an example? The blog post you mentioned is a bit difficult to understand, and its syntax is not straightforward. Much appreciated!

 

Tom

PaigeMiller
Diamond | Level 26

@TomHsiung wrote:

Would you please provide an example? The blog post you mentioned is a bit difficult to understand, and its syntax is not straightforward. Much appreciated!

 

Tom


What is difficult to understand? Please be specific. In your example, you are using PROC REG. In the blog post, there is an example using PROC REG.

 

Also, another approach is given in my reply here: https://communities.sas.com/t5/SAS-Statistical-Procedures/Export-PROC-PRINQUAL-data/m-p/429561#M2256...

--
Paige Miller
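For readers who want a concrete starting point, here is a hedged sketch of scoring at pre-defined predictor values (age 60 years, BSA 1.73 m²) with a stored model and PROC PLM. It is not the code from the linked posts. It swaps PROC GLM in for PROC REG so that the STORE statement can be used, fits a reduced model with three of the original predictors, and leaves out serum albumin because its variable name is not shown in the thread. The names doseModel, newPatients, and scored are hypothetical.

```sas
/* Sketch only: PROC GLM replaces PROC REG here, and the model is reduced.
   doseModel, newPatients and scored are hypothetical names. */

/* 1. Fit the model and save it in an item store */
proc glm data=work.d200 plots=none;
   model average_daily_dose_during_the_in = age__y_ BSA chf;
   store work.doseModel;
run;
quit;

/* 2. Rows to score: fixed age and BSA, once without and once with CHF */
data newPatients;
   age__y_ = 60;
   BSA     = 1.73;
   do chf = 0 to 1;
      output;
   end;
run;

/* 3. Score the new rows; the default ALPHA=0.05 gives 95% limits */
proc plm restore=work.doseModel;
   score data=newPatients out=scored
         predicted=yhat
         lclm=ci_lower uclm=ci_upper   /* CI for the mean response   */
         lcl=pi_lower  ucl=pi_upper;   /* PI for a single new patient */
run;

proc print data=scored noobs;
run;
```

Storing the model separates fitting from scoring, so new combinations of predictor values can be scored later without refitting and without appending rows to the analysis data.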
