Xamius32
Calcite | Level 5

I am building 4 different logistic models based on 4 different datasets, and then scoring 1 validation dataset with all 4 models to compare the scores.

 

But for some reason, the scored datasets end up with different numbers of observations, even though I know there are no missing values in the training or validation data. Is there a reason this would occur?

 

miner.PNG

Reeza
Super User

When you say there is no missing data, do you mean that all categories that appear in the scored data are also covered in the training data?

 

 

Xamius32
Calcite | Level 5
Well, I guess the scored data set will have more categories than the training data. Is that a problem?
Reeza
Super User

If a category isn't in the training data, then yes, it would be. It's equivalent to a missing value/category.

 

If the model is designed for sex=F or sex=M and sex=Unknown appears, the model doesn't have a method to score that observation, and you'll end up with missing values.
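
To make that concrete, here is a minimal sketch in plain Python (not SAS code, and not how the Score node is implemented). The coefficients, the `score` helper, and the three-row validation list are all made up for illustration. It just shows that a level with no fitted effect has nothing to be scored with, and how you could list such levels before scoring.

```python
import math

# Hypothetical coefficients for a logistic model trained on data
# where sex only takes the values F and M (F is the reference level).
INTERCEPT = -0.5
SEX_EFFECT = {"F": 0.0, "M": 0.8}

def score(row):
    """Return the predicted probability, or None if the level was not seen in training."""
    effect = SEX_EFFECT.get(row["sex"])
    if effect is None:
        # No coefficient exists for this level, so the score is missing.
        return None
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + effect)))

# Made-up validation rows; the last one has a level the model never saw.
validation = [{"sex": "F"}, {"sex": "M"}, {"sex": "Unknown"}]
scores = [score(row) for row in validation]

# Levels present in the data to be scored but absent from training.
unseen = sorted({row["sex"] for row in validation} - set(SEX_EFFECT))
print("Unseen levels:", unseen)
print(sum(s is not None for s in scores), "of", len(scores), "rows received a score")
```

Running the same kind of level comparison on each class input before scoring tells you quickly whether unseen categories can explain missing scores.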

Xamius32
Calcite | Level 5

Well, I am not sure that is the problem. Some of the scored datasets match the number of observations in the training data, and some match the number of observations in the validation data. I can't figure it out.

Xamius32
Calcite | Level 5

So, I see that my Score nodes are getting different input data. One has the regression training data and one has the validation data; I'm just not sure how that happened.
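
For anyone hitting the same thing, a rough way to spot this is to compare the observation count of each scored table against the counts of the candidate inputs. The sketch below is plain Python with made-up table names and counts (`match_inputs`, `score_model_1`, 7000, 3000 and so on are all hypothetical); in practice you would plug in the counts reported for your own tables.

```python
def match_inputs(scored_counts, input_counts):
    """Map each scored table to the input table(s) that have the same row count."""
    matches = {}
    for scored_name, n_obs in scored_counts.items():
        candidates = [name for name, n in input_counts.items() if n == n_obs]
        matches[scored_name] = candidates or ["no match"]
    return matches

# Hypothetical observation counts for the two candidate inputs.
input_counts = {"train": 7000, "validation": 3000}

# Hypothetical observation counts of the four scored outputs.
scored_counts = {
    "score_model_1": 3000,
    "score_model_2": 7000,
    "score_model_3": 3000,
    "score_model_4": 7000,
}

matches = match_inputs(scored_counts, input_counts)
for scored_name, sources in matches.items():
    print(scored_name, "->", ", ".join(sources))
```

If a scored table lines up with the training count rather than the validation count, that scoring step was most likely fed the wrong input. Equal counts are only a hint, since two inputs can happen to have the same number of rows.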
