buski
Calcite | Level 5

I used polynomial distributed lag (PDL) models to analyze an insect population.

In a PDL model, records with missing data are dropped, so the set of observations of the dependent variable differs slightly from model to model.

For example:

MODEL (1) Y = A + B + C

MODEL (2) Y = D + E + F

If there is no missing data, I can use AIC, RMSE, or Total R-Square to compare model performance.

However, in model (1), variable A has some missing data, so model (1) is fit on fewer observations of Y than model (2).

In this situation, is RMSE appropriate for comparing model performance?

Thanks in advance...

Reeza
Super User

Why not limit to only cases that are in both models for consistency?

You can also fit the model on all of its observations and again without the observations it would lose under this restriction, then compare the estimates and RMSE to see the effect.
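As a rough illustration of that comparison (in Python rather than SAS, with made-up numbers), the sketch below computes RMSE on each model's own sample and on the rows common to both. The function name `rmse`, the data, and the use of `None` to mark rows a model dropped are all assumptions for the example. Note that this RMSE divides by the number of rows, whereas SAS procedures typically report Root MSE using the error degrees of freedom.

```python
import math

def rmse(observed, predicted, rows):
    """Root mean squared error over the given row indices (divides by n)."""
    sq = [(observed[i] - predicted[i]) ** 2 for i in rows]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical data: None marks rows a model dropped because a predictor was missing.
y      = [10.0, 12.0, 9.0, 15.0, 11.0]
pred_1 = [11.0, None, 8.0, None, 11.0]   # model (1): A is missing on rows 1 and 3
pred_2 = [10.0, 14.0, 9.0, 13.0, 12.0]   # model (2): no missing predictors

# Rows each model actually used, and the rows both models share.
rows_1 = [i for i, p in enumerate(pred_1) if p is not None]
rows_2 = [i for i, p in enumerate(pred_2) if p is not None]
rows_2_set = set(rows_2)
common = [i for i in rows_1 if i in rows_2_set]

# RMSE on each model's own sample (not directly comparable).
rmse_1_own = rmse(y, pred_1, rows_1)
rmse_2_own = rmse(y, pred_2, rows_2)

# RMSE on the common sample (same observations of Y for both models).
rmse_1_common = rmse(y, pred_1, common)
rmse_2_common = rmse(y, pred_2, common)

print("own samples:   ", rmse_1_own, rmse_2_own)
print("common sample: ", rmse_1_common, rmse_2_common)
```

In this invented example model (1) has the lower RMSE when each model is scored on its own sample, but model (2) has the lower RMSE once both are scored on the same rows, which is why the two samples should match before comparing.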

buski
Calcite | Level 5

If I limit to only cases with no missing data, I will lose many observations in MODEL (2).

That's why I am wondering which estimate is appropriate for comparing the two models if I don't delete any observations from MODEL (2).

I have data from multiple years, so I will do cross-validation year by year.
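A minimal sketch of what year-by-year cross-validation could look like, again in Python with invented data. The helper `leave_one_year_out`, the record layout `(year, x, y)`, and the stand-in model that simply predicts the training mean are all hypothetical; a real run would plug in the PDL fit instead.

```python
import math

def leave_one_year_out(records, fit, predict):
    """Yield (year, rmse), holding out each year once.

    records: list of (year, x, y) tuples; fit and predict are caller-supplied.
    """
    years = sorted({r[0] for r in records})
    for held_out in years:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        model = fit(train)
        sq = [(y - predict(model, x)) ** 2 for _, x, y in test]
        yield held_out, math.sqrt(sum(sq) / len(sq))

# Stand-in model for illustration only: predicts the training mean of y.
def fit_mean(train):
    return sum(y for _, _, y in train) / len(train)

def predict_mean(model, x):
    return model  # ignores x; a real model would use it

# Hypothetical records: (year, predictor, response)
data = [(2019, 1.0, 4.0), (2019, 2.0, 6.0),
        (2020, 1.5, 5.0), (2020, 2.5, 7.0),
        (2021, 1.0, 3.0), (2021, 3.0, 9.0)]

cv_results = dict(leave_one_year_out(data, fit_mean, predict_mean))
for year, err in cv_results.items():
    print(year, err)
```

If both models are scored on the same held-out rows in each year, the out-of-sample RMSE values are computed on identical observations and can be compared directly.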
