TomHsiung
Lapis Lazuli | Level 10
PROC REG DATA=WORK.D201;
MODEL Average_daily_dose_during_the_in = Age__y_ Gender_code BSA AF Hypertension CHF Hypoalbuminemia_code AKI_for_T_test Potential_amiodarone_DDI T_test___indication VAR33 VAR34 AKI_2C9 AKI_VKORC1 / selection=stepwise SLE=0.05 SLS=0.20 vif clb clm cli;
STORE WORK.DOSEMODEL / LABEL='Linear Regression';
RUN;

PROC PLM RESTORE=WORK.DOSEMODEL ALPHA=0.05;
SCORE DATA=WORK.PREDICTED out=WORK.NEWDOSE
predicted lclm uclm;
RUN;

PROC PRINT DATA=WORK.NEWDOSE;
VAR Age__y_ Gender_code BSA AF Hypertension CHF Hypoalbuminemia_code AKI_for_T_test Potential_amiodarone_DDI T_test___indication VAR33 VAR34 AKI_2C9 AKI_VKORC1 predicted lclm uclm;
RUN;

 

I want to compute a confidence interval for the dependent variable at new observations, based on a fitted linear regression model, but so far without success.

 

[Screenshot: Screen Shot 2021-03-07 at 10.24.20 PM.jpg]

Accepted Solutions
STAT_Kathleen
SAS Employee

With an item store created by PROC REG, the PLM procedure generates only the predicted values, not the intervals. The item store from PROC REG stores only the parameter estimates, so only a subset of the PLM options is available. If you have a continuous response, use the GLM procedure to estimate the model and then the PLM procedure to obtain predicted values and intervals for the new data set. For example:

 

/* SAS CODE FOLLOWS */

/* Data used to fit the model */
data fitness;
   input age weight oxygen runtime restpulse runpulse maxpulse;
   datalines;
   44 89.47  44.609 11.37 62 178 182
   40 75.07  45.313 10.07 62 185 185
   44 85.84  54.297  8.65 45 156 168
   42 68.15  59.571  8.17 40 166 172
   38 89.02  49.874  9.22 55 178 180
   47 77.45  44.811 11.63 58 176 176
   40 75.98  45.681 11.95 70 176 180
   43 81.19  49.091 10.85 64 162 170
   44 81.42  39.442 13.08 63 174 176
   38 81.87  60.055  8.63 48 170 186
   44 73.03  50.541 10.13 45 168 168
   45 87.66  37.388 14.03 56 186 192
   45 66.45  44.754 11.12 51 176 176
   47 79.15  47.273 10.60 47 162 164
   54 83.12  51.855 10.33 50 166 170
   49 81.42  49.156  8.95 44 180 185
   51 69.63  40.836 10.95 57 168 172
   51 77.91  46.672 10.00 48 162 168
   48 91.63  46.774 10.25 48 162 164
   49 73.37  50.388 10.08 67 168 168
   57 73.37  39.407 12.63 58 174 176
   54 79.38  46.080 11.17 62 156 165
   52 76.32  45.441  9.63 48 164 166
   50 70.87  54.625  8.92 48 146 155
   ;
run;

/* New observations to score */
data new;
   input age weight oxygen runtime restpulse runpulse maxpulse;
   datalines;
   51 67.25  45.118 11.08 48 172 172
   54 91.63  39.203 12.88 44 168 172
   51 73.71  45.790 10.47 59 186 188
   57 59.08  50.545  9.93 49 148 155
   49 76.32  48.673  9.40 56 186 188
   48 61.24  47.920 11.50 52 170 176
   52 82.78  47.467 10.50 53 170 172
   ;
run;

/* Fit the model in PROC GLM and save it to an item store */
proc glm data=fitness;
   model Oxygen=Age Weight RunTime RunPulse RestPulse MaxPulse;
   store glmres;
run;
quit;

/* Score the new data. LCL/UCL are prediction limits for an individual
   response; LCLM/UCLM are confidence limits for the mean response. */
proc plm restore=glmres;
   score data=new out=newout predicted=predoxy lcl=lcl ucl=ucl lclm=lclm uclm=uclm;
run;
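
Applied to the data sets from the question, the same pattern might look like the sketch below. The data set names WORK.D201, WORK.PREDICTED, and WORK.NEWDOSE come from the original post; the item store name WORK.DOSEGLM and the output variable names are made up for illustration. The thread does not say which predictors stepwise selection retained, so the three predictors in the MODEL statement are placeholders only.

```sas
/* Sketch only: refit the final model in PROC GLM so that the item
   store supports interval estimates in PROC PLM.
   The predictor list is a placeholder; substitute the variables
   that stepwise selection actually retained. */
proc glm data=WORK.D201;
   model Average_daily_dose_during_the_in = Age__y_ BSA AKI_VKORC1;
   store WORK.DOSEGLM / label='Linear regression refit in GLM';  /* hypothetical item store name */
run;
quit;

/* Score the new observations and request both kinds of limits. */
proc plm restore=WORK.DOSEGLM alpha=0.05;
   score data=WORK.PREDICTED out=WORK.NEWDOSE
         predicted=pred
         lclm=lclm uclm=uclm   /* confidence limits for the mean response */
         lcl=lcl   ucl=ucl;    /* prediction limits for an individual value */
run;
```

Note that limits computed this way treat the refitted model as if it had been chosen in advance; they do not account for the earlier selection step.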


5 REPLIES
PaigeMiller
Diamond | Level 26

Are there confidence intervals shown from your PROC REG? Or are they missing in PROC REG as well?

--
Paige Miller
TomHsiung
Lapis Lazuli | Level 10

@PaigeMiller wrote:

Are there confidence intervals shown from your PROC REG? Or are they missing in PROC REG as well?


Yes, PROC REG did output confidence intervals for the regression coefficients.
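
Separately from the coefficient limits, PROC REG can also report limits for individual observations through its OUTPUT statement. One common way to get them for new data, without PROC PLM, is to append the new rows to the fitting data with the response set to missing: rows with a missing response are left out of the estimation but still receive predicted values and limits. The sketch below assumes the data set names from the question; WORK.COMBINED, WORK.REGOUT, the flag is_new, and the placeholder predictor list are not from the original post.

```sas
/* Stack the new observations under the fitting data. Their response
   is set to missing, so PROC REG ignores them when estimating the
   model but still computes predictions and limits for them. */
data WORK.COMBINED;
   set WORK.D201 (in=in_fit)
       WORK.PREDICTED;
   is_new = not in_fit;   /* hypothetical flag: 1 for rows from WORK.PREDICTED */
   if is_new then call missing(Average_daily_dose_during_the_in);
run;

proc reg data=WORK.COMBINED;
   /* placeholder predictors; use the ones in your final model */
   model Average_daily_dose_during_the_in = Age__y_ BSA AKI_VKORC1;
   output out=WORK.REGOUT
          p=pred
          lclm=lclm uclm=uclm   /* limits for the mean response */
          lcl=lcl   ucl=ucl;    /* limits for an individual prediction */
run;
quit;

/* Show only the new observations. */
proc print data=WORK.REGOUT;
   where is_new;
   var pred lclm uclm lcl ucl;
run;
```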

ballardw
Super User

I'm going out on a limb with some observations.

First, your model has an independent variable named Gender_code. Typically "gender" in biology has 2 categories, but regardless, it is almost certainly not a continuous variable. Proc Reg is the basic regression proc and expects the result of an OLS equation like y = mx + b to make sense numerically. If "x" is categorical and takes only two values, then the "m" likely doesn't make much sense. SAS provides a number of regression procedures that allow the use of Class variables, which have categories instead of continuous values. The more "categories" are involved, the less sense a numeric slope makes.

 

Your example data shows a suspicious number of 0/1 values, as if almost all of those variables are categorical.

 

Perhaps you really should be looking at Proc GLM or another procedure entirely.
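
As a rough sketch of that suggestion, coded variables can be listed on a CLASS statement in PROC GLM, which then builds the indicator columns itself. Which of the variables in the question are truly categorical is an assumption here, based only on their names, and the predictor list is a placeholder.

```sas
/* Sketch: treat coded variables as categorical via CLASS.
   The choice of CLASS variables below is an assumption based on
   the variable names, not something stated in the question. */
proc glm data=WORK.D201;
   class Gender_code Hypoalbuminemia_code;
   model Average_daily_dose_during_the_in =
         Age__y_ BSA Gender_code Hypoalbuminemia_code
         / solution clparm;   /* print estimates with confidence limits */
run;
quit;
```

For a variable with only two levels, the fitted values are the same whether it is entered as a 0/1 numeric column or as a CLASS variable; the distinction matters mostly once a code has three or more levels.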

TomHsiung
Lapis Lazuli | Level 10
Hi,

Thanks. I will make further trials this weekend.
