Posted 03-11-2019 02:35 AM
(1976 views)

Hello again. I have an unusual parameter estimate for one of my variables.

Here's the output from SAS.

Does anybody know why the parameter estimate for variable DL3 is so high and so different from those for DL1 and DL2?

The interpretation of the model feels wrong when the value is that high, because the estimate for DL3 should be close to the estimates for DL1 and DL2.

Thank you.

1 ACCEPTED SOLUTION

Since we don't know what DL1, DL2 and DL3 mean, and we don't have your data, it's pretty much impossible for us (but not for you) to figure out why this is happening.

You can do some diagnostic checking yourself:

- Make sure there are no extreme outliers in DL1 DL2 DL3
- Actually PLOT the data (DL1 versus Y, etc.) and look at it
- Check the influence diagnostics from PROC REG
- Check the Variance Inflation Factors (option VIF) from PROC REG
- Make sure there is no high correlation between DL1 DL2 DL3

--

Paige Miller
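The checks listed above could be run roughly as follows. This is only a sketch: it assumes the data set `armaxvlp` and the variables `Bill`, `Waktu` and `DL1`-`DL3` that appear in the PROC REG call posted in this thread, so substitute your own names where they differ.

```sas
/* Assumed names: data set armaxvlp, response Bill, predictors Waktu DL1-DL3 */

/* 1. Look for extreme values in the predictors */
proc means data=armaxvlp n nmiss min p1 median p99 max;
   var DL1 DL2 DL3;
run;

/* 2. Plot the response against a predictor (repeat with x=DL2 and x=DL3) */
proc sgplot data=armaxvlp;
   scatter x=DL1 y=Bill;
run;

/* 3 and 4. Influence diagnostics and variance inflation factors */
proc reg data=armaxvlp;
   model Bill = Waktu DL1 DL2 DL3 / influence vif;
run;
quit;

/* 5. Pairwise correlations among the predictors */
proc corr data=armaxvlp;
   var DL1 DL2 DL3;
run;
```

PROC CORR only shows pairwise correlations; the VIF values from PROC REG are the better guide when one predictor is close to a combination of several others.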

17 REPLIES


Here's the syntax I use for the analysis.

    proc reg data=armaxvlp;
       model Bill = Waktu DL1 DL2 DL3 / influence;
    run;


All the DL variables are dummy variables. They contain mostly zeros and some dummy proportions; they are dummy variables for holiday variation. The VIF is fine.


I noticed you have only 36 observations, which is not enough for a regression model.

Also check the standard error of DL3; it is way too high. This indicates that the model is wrong.


What do you think I should do?


1) Collect more data.

2) Try other nonlinear effects, like

model y = dl1 | dl2 | dl3;

or

try the EFFECT statement to get a spline or cubic nonlinear effect.
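Note that PROC REG itself accepts neither the bar operator nor the EFFECT statement, so trying these suggestions means switching procedures. A minimal sketch, assuming PROC GLM for the crossed effects and PROC GLMSELECT for the spline, with the data set and variable names taken from the PROC REG call posted in this thread. Applying the spline to `Waktu` is an assumption, since the reply does not say which variable it has in mind, and `spl` is just an illustrative effect name.

```sas
/* Crossed effects: DL1 | DL2 | DL3 expands to the main effects plus all interactions */
proc glm data=armaxvlp;
   model Bill = DL1 | DL2 | DL3;
run;
quit;

/* Cubic spline effect on Waktu (assumed); SELECTION=NONE fits the full model as specified */
proc glmselect data=armaxvlp;
   effect spl = spline(Waktu / degree=3);
   model Bill = spl DL1 DL2 DL3 / selection=none;
run;
```

With only 36 observations and dummies that are mostly zero, some of the interaction terms may well turn out to be non-estimable.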


Before I would advise any of these, I would certainly check for outliers and the other problems I mentioned in my earlier message. If any of those problems are present, fitting a more complicated model will result in similar problems with the new model.

--

Paige Miller


The DL variables are dummy variables? I did not know that.

But to detect outliers you have to plot Y versus DL1, Y versus DL2, etc.

I also made other recommendations earlier; you need to check all of them.

--

Paige Miller


Here's the plot of Y (Bill) vs DL1, DL2, and DL3.

How can I tell whether there are outliers?

Here's the output for the multicollinearity check.

The dummy variables don't have high correlation.

Lastly, I don't know which part of the output is the influence diagnostics.
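For reference, the influence diagnostics are what the INFLUENCE option on the MODEL statement produces: a per-observation table with columns such as RStudent, Hat Diag H, DFFITS and DFBETAS. One way to make them easier to scan is to save them to a data set and list only the rows that stand out. A sketch under stated assumptions: the output variable names (`rstud`, `lev`, `cd`, `dff`) and the data set names `diag` and `flagged` are arbitrary, and the cutoffs are common rules of thumb rather than hard rules, computed here for the 36 observations mentioned in this thread and 5 parameters (intercept plus four predictors).

```sas
proc reg data=armaxvlp;
   model Bill = Waktu DL1 DL2 DL3 / influence;
   /* Save per-observation diagnostics; the names on the right are arbitrary */
   output out=diag rstudent=rstud h=lev cookd=cd dffits=dff;
run;
quit;

/* Keep rows exceeding rule-of-thumb cutoffs: |RStudent| > 2, leverage > 2p/n, Cook's D > 4/n */
data flagged;
   set diag;
   if abs(rstud) > 2 or lev > 2*5/36 or cd > 4/36;
run;

proc print data=flagged;
   var Bill Waktu DL1 DL2 DL3 rstud lev cd dff;
run;
```

Rows that appear in `flagged` are candidates for a closer look, not automatic deletions.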


At this point, I suggest we take a major step backwards and re-examine the original problem. The whole idea of fitting a regression to this data makes little sense once you look at the plots.

Do these data look like they fit a straight line? Do these data look like they fit any line, straight or curved?

I would say, no they do not, and any effort to fit a linear regression here will result in poor fits and meaningless slopes.

--

Paige Miller


So you suggest applying another model instead of regression? What model do you think will fit?


Again, you need to look at the data and see what potential models might fit. I don't see any patterns in this data. Do you?

When a huge percentage of your data has x=0 and only a very small percentage has x>0, I don't think you can use a statistical model, because the data are extremely unbalanced and there are no apparent patterns that can be modeled. I am not aware of any statistical model that will fit this data. In fact, it's hard to get any meaningful statistics in this extremely unbalanced case.

--

Paige Miller
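One quick way to quantify how unbalanced the predictors are is to count the non-zero values of each dummy. A sketch assuming the `armaxvlp` data set and `DL1`-`DL3` variables from the PROC REG call posted in this thread; the `dlcheck` data set and the `_nz` flag names are made up for illustration.

```sas
/* Flag non-zero values of each dummy (note: missing values would also be flagged as non-zero) */
data dlcheck;
   set armaxvlp;
   DL1_nz = (DL1 ne 0);
   DL2_nz = (DL2 ne 0);
   DL3_nz = (DL3 ne 0);
run;

/* Tabulate how many observations actually carry information about each dummy */
proc freq data=dlcheck;
   tables DL1_nz DL2_nz DL3_nz;
run;
```

If DL3 turns out to be non-zero in only one or two rows, its coefficient is determined almost entirely by those rows, which would be consistent with a large estimate and a large standard error.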


Adding:

Maybe a linear model with your y-variable and WAKTU as the x-variable makes sense, I don't know, I haven't seen the data. But modelling with DL1 DL2 DL3 isn't going to fit well, because there is no pattern.

--

Paige Miller
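The simpler model mentioned above could be checked along these lines. Again a sketch only, assuming the `armaxvlp` data set with `Bill` as the response and `Waktu` as the single predictor, as in the PROC REG call posted in this thread (Waktu is Indonesian for "time", so this is presumably a time trend).

```sas
/* Look at the data first; the REG statement overlays the least-squares line */
proc sgplot data=armaxvlp;
   scatter x=Waktu y=Bill;
   reg x=Waktu y=Bill;
run;

/* Fit the one-predictor model only if the plot suggests a roughly linear trend */
proc reg data=armaxvlp;
   model Bill = Waktu;
run;
quit;
```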
