Xamius32
Calcite | Level 5

I am building 4 different logistic models based on 4 different datasets, and then scoring 1 validation dataset with all 4 models to compare the scores.

 

But for some reason, the scored data sets end up with different numbers of observations, even though I know there are no missing values in the training or validation data. Is there a reason this would occur?

 

[Attached screenshot: miner.PNG]

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

If a category isn't in the training data, then yes, that would be a problem. It's equivalent to a missing value/category.

 

If the model is built for sex=F or sex=M and sex=Unknown appears, the model has no way to score that observation, and you'll end up with missing values.


5 REPLIES
Reeza
Super User

When you say there are no missing values, do you mean that all categories that appear in the scored data are also covered in the training data?

 

 

Xamius32
Calcite | Level 5
Well, I guess the scored data set will have more categories than the training data. Is that a problem?
Reeza
Super User

If a category isn't in the training data, then yes, that would be a problem. It's equivalent to a missing value/category.

 

If the model is built for sex=F or sex=M and sex=Unknown appears, the model has no way to score that observation, and you'll end up with missing values.
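
One way to check for this before scoring is to compare the category levels in the two data sets. The following is only a minimal sketch in plain Python, not SAS or Enterprise Miner code. The function name `unseen_levels` and the toy rows are made up for illustration, and it assumes each data set has been loaded as a list of dictionaries keyed by variable name.

```python
def unseen_levels(train_rows, score_rows, categorical_vars):
    """Return {variable: sorted levels found in score_rows but not in train_rows}."""
    unseen = {}
    for var in categorical_vars:
        train_levels = {row[var] for row in train_rows}
        score_levels = {row[var] for row in score_rows}
        extra = score_levels - train_levels
        if extra:
            unseen[var] = sorted(extra)
    return unseen


# Hypothetical toy data: the model was trained only on F and M.
train = [{"sex": "F"}, {"sex": "M"}, {"sex": "F"}]
score = [{"sex": "M"}, {"sex": "Unknown"}, {"sex": "F"}]

print(unseen_levels(train, score, ["sex"]))  # {'sex': ['Unknown']}
```

Any variable that shows up in the result has levels the model has never seen, so those rows are the ones likely to come back with missing scores.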

Xamius32
Calcite | Level 5

Well, I am not sure that is the problem. Some of the scored data sets match the number of observations in the training data, and some match the number of observations in the validation data. I can't figure it out.

Xamius32
Calcite | Level 5

So, I see that my Score nodes have different input data. One has the regression training data and one has the validation data. I'm just not sure how that happened.
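
The row-count comparison that exposed this can be written down as a quick check. This is a rough Python sketch rather than anything Enterprise Miner provides. The function name `match_inputs_by_count`, the data set names, and all of the counts are hypothetical placeholders, not figures from this project.

```python
def match_inputs_by_count(scored_counts, input_counts):
    """For each scored output, list the candidate inputs with the same row count."""
    return {
        scored_name: [name for name, m in input_counts.items() if m == n]
        for scored_name, n in scored_counts.items()
    }


# Hypothetical row counts, for illustration only.
input_counts = {"train": 7000, "validate": 3000}
scored_counts = {"score_model_1": 3000, "score_model_2": 7000}

matches = match_inputs_by_count(scored_counts, input_counts)
print(matches)
# {'score_model_1': ['validate'], 'score_model_2': ['train']}
```

A scored output that matches the training count instead of the validation count points to a Score node that is connected to the wrong input. Matching counts are only a hint, since two different data sets can have the same number of rows.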
