buski
Calcite | Level 5

I used polynomial distributed lag (PDL) models to analyze insect population data.

In a PDL model, records with missing data are ignored, so the set of dependent-variable observations actually used can differ slightly from one model to another.

For example:

MODEL (1): Y = A + B + C

MODEL (2): Y = D + E + F

If there is no missing data, I can use AIC, RMSE, or Total R-Square to compare model performance.

However, in model (1), variable A has some missing values, so model (1) uses fewer observations of Y than model (2).

In this situation, is it OK to use RMSE to compare model performance?

Thanks in advance...

2 REPLIES
Reeza
Super User

Why not limit to only cases that are in both models for consistency?

You can also fit the model on all of its observations, then refit it without the observations it would lose under this restriction, and compare the estimates and RMSE to see the effect.
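As a rough illustration of that suggestion, here is a minimal Python sketch (not SAS, and not a PDL fit). It assumes a toy dataset invented for this example, with one hypothetical predictor per model (`a` for model (1), `d` for model (2)) and missing values stored as `None`. The helper names `complete_rows`, `fit_line`, and `rmse` are made up here. The RMSE divides by n rather than by the error degrees of freedom, so it will not match a SAS-reported value exactly.

```python
from math import sqrt

def complete_rows(rows, columns):
    """Indices of rows where every listed column is present (not None)."""
    return [i for i, r in enumerate(rows)
            if all(r[c] is not None for c in columns)]

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (single predictor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def rmse(rows, idx, xcol, ycol="y"):
    """Fit on the rows in idx and return the in-sample RMSE (divisor n)."""
    x = [rows[i][xcol] for i in idx]
    y = [rows[i][ycol] for i in idx]
    b0, b1 = fit_line(x, y)
    sq_err = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return sqrt(sq_err / len(idx))

# Hypothetical toy data: "a" has two missing values, "d" has none.
rows = [
    {"y": 2.0, "a": 1.0,  "d": 0.5},
    {"y": 3.1, "a": None, "d": 1.0},
    {"y": 3.9, "a": 2.0,  "d": 1.4},
    {"y": 6.2, "a": 3.0,  "d": 2.6},
    {"y": 7.8, "a": None, "d": 3.1},
    {"y": 8.1, "a": 4.0,  "d": 3.3},
]

idx1 = complete_rows(rows, ["y", "a"])    # rows usable by model (1)
idx2 = complete_rows(rows, ["y", "d"])    # rows usable by model (2)
common = sorted(set(idx1) & set(idx2))    # rows usable by both

print("n model 1:", len(idx1), "n model 2:", len(idx2), "n common:", len(common))
print("RMSE model 1 (common rows):", rmse(rows, common, "a"))
print("RMSE model 2 (common rows):", rmse(rows, common, "d"))
print("RMSE model 2 (all its rows):", rmse(rows, idx2, "d"))
```

The first two RMSE values are computed on identical observations, so they can be compared directly. The gap between the last two shows how much dropping the extra rows changes model (2).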

buski
Calcite | Level 5

If I limit to only cases with no missing data, I will lose many observations in MODEL (2).

That's why I am wondering which statistic is appropriate for comparing the two models if I don't delete any observations from MODEL (2).

I have data from multiple years, so I will do cross-validation year by year.
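A year-by-year cross-validation of this kind could look roughly like the Python sketch below (again not SAS, and not a PDL fit). It assumes a single hypothetical predictor and invented `(year, x, y)` records with no missing values. `fit_line` and `leave_one_year_out` are names made up for this example. Each year is held out in turn, the line is fitted on the remaining years, and the RMSE is computed on the held-out year only. RMSE is a per-observation average, so its scale does not grow with the number of rows the way AIC does, but the scores are most comparable when both models are evaluated on the same held-out observations.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (single predictor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def leave_one_year_out(records):
    """records: list of (year, x, y) tuples with no missing values.
    Returns {year: out-of-sample RMSE when that year is held out}."""
    years = sorted({yr for yr, _, _ in records})
    scores = {}
    for held_out in years:
        train = [(x, y) for yr, x, y in records if yr != held_out]
        test = [(x, y) for yr, x, y in records if yr == held_out]
        b0, b1 = fit_line([x for x, _ in train], [y for _, y in train])
        sq_err = [(y - (b0 + b1 * x)) ** 2 for x, y in test]
        scores[held_out] = sqrt(sum(sq_err) / len(test))
    return scores

# Hypothetical records, invented for illustration only.
records = [
    (2019, 1.0, 2.1), (2019, 2.0, 4.2),
    (2020, 1.5, 2.9), (2020, 3.0, 6.1),
    (2021, 2.5, 5.2), (2021, 4.0, 7.8),
]

scores = leave_one_year_out(records)
for year, score in scores.items():
    print(year, round(score, 3))
```

Running the same loop for each candidate model and comparing the per-year scores (or their average) gives an out-of-sample comparison that does not rely on both models having been fitted to the same number of rows.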
