sasgiaz
Quartz | Level 8

Hello again. I have an unusual parameter estimate for one of my variables.

Here's the output from SAS.

ask com.JPG

Does anybody know why the parameter estimate for variable DL3 is very high and so different from those for DL1 and DL2?

The interpretation of the model feels wrong when the value is that high, because the value for DL3 should be close to the values for DL1 and DL2.

Thank you.

 

1 ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

Since we don't know what DL1, DL2, and DL3 mean, and we don't have your data, it's pretty much impossible for us (but not for you) to figure out why this is happening.

 

You can do some diagnostic checking yourself:

 

  • Make sure there are no extreme outliers in DL1 DL2 DL3
  • Actually PLOT the data (DL1 versus Y, etc.) and look at it
  • Check the influence diagnostics from PROC REG
  • Check the Variance Inflation Factors (option VIF) from PROC REG
  • Make sure there is no high correlation between DL1 DL2 DL3
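
The checks listed above could be run roughly as follows. This is only a minimal sketch: it assumes the data set and variable names from the PROC REG call posted in the replies (armaxvlp, Bill, Waktu, DL1 DL2 DL3), so adjust them to your own data.

```sas
/* Range of each variable: compare min/max to the median to spot extreme values */
proc means data=armaxvlp n nmiss min p50 max;
  var Bill Waktu DL1 DL2 DL3;
run;

/* Plot the response against a predictor (repeat for DL2 and DL3) */
proc sgplot data=armaxvlp;
  scatter x=DL1 y=Bill;
run;

/* Pairwise correlations among the predictors */
proc corr data=armaxvlp;
  var Waktu DL1 DL2 DL3;
run;

/* Regression with variance inflation factors and influence diagnostics */
proc reg data=armaxvlp;
  model Bill = Waktu DL1 DL2 DL3 / vif influence;
run;
quit;
```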
--
Paige Miller


17 REPLIES
sasgiaz
Quartz | Level 8

Here's the syntax I use for the analysis.

proc reg data=armaxvlp;
  model Bill = Waktu DL1 DL2 DL3 / influence;
run;
sasgiaz
Quartz | Level 8
All the DL variables are dummy variables. They contain mostly zeros and some dummy proportions. They are dummy variables for holiday variation. The VIF is fine.
Ksharp
Super User

I noticed you only have 36 observations; that is not enough for a regression model.

Also check the Std of DL3; it is way too high. This indicates that you have the wrong model.

sasgiaz
Quartz | Level 8
What do you think I should do?
Ksharp
Super User

1) Collect more data.

2) Try other nonlinear effects, like

model y = dl1 | dl2 | dl3;

or

try the EFFECT statement to get a spline / cubic nonlinear effect.
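
A hedged sketch of what these two suggestions might look like in practice. Note that PROC REG accepts neither bar notation nor the EFFECT statement, so this sketch swaps in PROC GLM and PROC GLMSELECT. The data set and variable names come from the PROC REG call earlier in the thread; applying the spline to Waktu is an assumption made for illustration only.

```sas
/* Bar notation: main effects of the three dummies plus all their interactions */
proc glm data=armaxvlp;
  model Bill = Waktu DL1|DL2|DL3;
run;
quit;

/* Cubic spline effect for Waktu (assumed choice); selection=none fits the full model */
proc glmselect data=armaxvlp;
  effect spl = spline(Waktu / degree=3);
  model Bill = spl DL1 DL2 DL3 / selection=none;
run;
```

With only 36 observations and dummies that are mostly zero, some interaction terms may turn out to be non-estimable.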

PaigeMiller
Diamond | Level 26

Before I would advise any of these, I would certainly check for outliers and the other problems I mentioned in my earlier message. If any of those problems are present, fitting a more complicated model will result in similar problems with the new model.

--
Paige Miller
sasgiaz
Quartz | Level 8

 

Here's the plot for each DL. If outliers need to be checked, I think it's expected that each DL shows extreme values, because most of the values are zero, since these are dummy variables.

dl.JPG

 

PaigeMiller
Diamond | Level 26

The DL variables are dummy variables? I did not know that.

But to detect outliers you have to plot Y versus DL1, Y versus DL2, etc.

And then there were the other recommendations I stated earlier; you need to check all of these.

--
Paige Miller
sasgiaz
Quartz | Level 8

Here are the plots of Y (Bill) vs DL1, DL2, and DL3.

How can I tell whether there are outliers?

Bill vs DL1.JPG

Bill vs DL2.JPG

Bill vs DL3.JPG

Here's the output for the multicollinearity check.

The dummy variables don't have high correlation.

vif checked.JPG

Lastly, I don't know which part of the output is the influence diagnostics.
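
For reference, the INFLUENCE option on the MODEL statement adds a per-observation table to the PROC REG output, with columns such as RStudent, Hat Diag H, Cov Ratio, DFFITS, and DFBETAS. One way to make that table easier to read is to save the diagnostics and list only the observations that stand out. The sketch below assumes the model from earlier in the thread. The cutoffs (|RStudent| > 2, leverage > 2p/n, Cook's D > 4/n) are common rules of thumb, not fixed rules, and the data set names diag and flagged are hypothetical.

```sas
proc reg data=armaxvlp;
  model Bill = Waktu DL1 DL2 DL3 / influence r;
  /* Save per-observation diagnostics to a data set (name is hypothetical) */
  output out=diag rstudent=rstud h=lev cookd=cooksd dffits=dff;
run;
quit;

data flagged;
  set diag;
  n = 36;  /* number of observations mentioned in the thread */
  p = 5;   /* intercept + 4 predictors */
  /* Keep only observations exceeding a rule-of-thumb cutoff */
  if abs(rstud) > 2 or lev > 2*p/n or cooksd > 4/n;
run;

proc print data=flagged;
  var Bill Waktu DL1 DL2 DL3 rstud lev cooksd dff;
run;
```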

 

PaigeMiller
Diamond | Level 26

At this point, I suggest we take a major step backwards and re-examine the original problem. The whole idea of fitting a regression to this data makes little sense once you look at the plots.

 

Do these data look like they fit a straight line? Do these data look like they fit any line, straight or curved?

 

I would say, no they do not, and any effort to fit a linear regression here will result in poor fits and meaningless slopes.

--
Paige Miller
sasgiaz
Quartz | Level 8

So you suggest applying another model instead of regression? What model do you think will fit?

PaigeMiller
Diamond | Level 26

Again, you need to look at the data and see what potential models might fit. I don't see any patterns in this data. Do you?

 

When you have a huge percent of your data where x=0 and a very small percent of your data is where x>0, I don't think you can use a statistical model because the data is extremely unbalanced and there are no apparent patterns that can be modeled. I am not aware of any statistical modelling that will fit this data. In fact, it's hard to get any meaningful statistics in this extremely unbalanced case. 
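
To put a number on how unbalanced each dummy is, something like the following could be used. It is a sketch that assumes the armaxvlp data set and the DL1 DL2 DL3 names from the PROC REG call earlier in the thread.

```sas
/* Count the nonzero values of each dummy; a comparison evaluates to 1 (true) or 0 (false). */
/* Note: a missing value also counts as "not equal to 0" here.                               */
proc sql;
  select count(*)      as n_obs,
         sum(DL1 ne 0) as n_nonzero_DL1,
         sum(DL2 ne 0) as n_nonzero_DL2,
         sum(DL3 ne 0) as n_nonzero_DL3
  from armaxvlp;
quit;
```

If a dummy is nonzero for only a handful of the 36 observations, its coefficient is determined almost entirely by those few points.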

 

 

--
Paige Miller
PaigeMiller
Diamond | Level 26

Adding:

 

Maybe a linear model with your y-variable and WAKTU as the x-variable makes sense, I don't know, I haven't seen the data. But modelling with DL1 DL2 DL3 isn't going to fit well, because there is no pattern.
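
A minimal sketch of that simpler model, assuming the data set and variable names from the PROC REG call earlier in the thread. Whether a straight line in WAKTU is reasonable can only be judged from the plot.

```sas
/* Scatter plot of Bill against Waktu with a fitted regression line */
proc sgplot data=armaxvlp;
  reg x=Waktu y=Bill;
run;

/* Simple linear regression on Waktu only */
proc reg data=armaxvlp;
  model Bill = Waktu;
run;
quit;
```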

--
Paige Miller


Discussion stats
  • 17 replies
  • 2777 views
  • 0 likes
  • 3 in conversation