omerzeybek
Obsidian | Level 7

Hi,

I have estimated my propensity model in SAS Miner. After getting lifts of 1.71 and 1.72 on the training and validation data, I extracted the optimized SAS score code from Miner.

Afterwards I used that score code to score my test data, which was prepared from more recent data than the training and validation sets.

However, I received the same probability for every individual in my dataset.

I suppose there is a problem with my method. Even if my model had lost its power to separate the dataset, I think it is weird that it calculates the same probability for each case.

I would be delighted if you could give some advice on this issue.

Regards.. 

8 REPLIES
jwexler
SAS Employee

Hi, it appears you asked a related question in a later thread, let us know if there's anything else we can do.

Thanks,

Jonathan

stat_sas
Ammonite | Level 13

I am not sure about the scoring code. Please check the following two things:

1. Whether the variables used in the model are highly correlated.

2. Whether the variables have a lot of missing values.
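As a rough illustration of these two checks, here is a minimal sketch in plain Python rather than SAS. The column names and values are made up for the example, and the helpers `missing_rate` and `pearson` are hypothetical names, not part of any SAS or Miner output.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_rate(values):
    """Fraction of entries in a column that are missing."""
    return sum(1 for v in values if is_missing(v)) / len(values)


def pearson(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not is_missing(x) and not is_missing(y)]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only (assumed, not taken from the thread).
data = {
    "income": [10.0, 20.0, 30.0, 40.0, 50.0],
    "spend": [1.0, 2.1, 2.9, 4.2, 5.0],
    "age": [25.0, None, 41.0, None, 33.0],
}

rates = {name: missing_rate(col) for name, col in data.items()}
r_income_spend = pearson(data["income"], data["spend"])

print(rates)
print(round(r_income_spend, 3))
```

A pair with a correlation close to 1 or -1, or a column with a high missing rate, would be a candidate for closer inspection.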

omerzeybek
Obsidian | Level 7

Hi Stat,

1. I have already checked this. The highly correlated series are listed among the selected variables, but they haven't entered the model. By the way, I thought that when deriving the test data I should use all of the variables selected for my model. Am I right?

2. I have already filled all missing values using an array.

Thank You Very Much

stat_sas
Ammonite | Level 13

Hi,

A few more things based on your feedback.

How many variables are you using as predictors in your final model, and did you use any data reduction technique to select them?

Have you checked the correlation among the final predictors?

Are there any predictors in the final model that are not well populated and that you imputed?

Thanks,

omerzeybek
Obsidian | Level 7

No, I haven't used any reduction technique, and I have about 98 predictors for modelling.

Yes, I have checked the correlations. There are highly correlated variables, but I filtered them out before estimating my regression.

No, I have imputed every predictor using an array.

Thank You Very Much

stat_sas
Ammonite | Level 13

Here are some suggestions

In predictive modeling, the selection of predictors is crucial. If there is multicollinearity among the predictors, you will get unstable estimates of the regression parameters. If a number of predictors require imputation, the imputed values can make them linearly related, and as a result the model will produce predicted values that are almost the same.

As a first step, check whether there are many predictors with bad data (only 10-20% filled) that you had to impute, and if so throw them away before proceeding further. Then take the rest of the predictors and reduce the dimensionality of the data by finding the predictors that really contribute to the model. Build the model with the reduced number of predictors and it will give you better results on the test data.
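The first step above could be sketched roughly as follows, in plain Python rather than SAS. The 20% cutoff is taken from the upper end of the "10-20% filled" range mentioned above and is only an assumption. The column names and the helpers `fill_rate` and `drop_sparse` are hypothetical. The fill rate has to be measured on the raw data, before any imputation.

```python
def fill_rate(values):
    """Fraction of entries in a column that are populated (not None)."""
    return sum(1 for v in values if v is not None) / len(values)


def drop_sparse(columns, min_fill=0.2):
    """Split columns into those kept and those dropped for being too sparse.

    A column is dropped when its fill rate is at or below min_fill.
    """
    kept = {}
    dropped = []
    for name, values in columns.items():
        if fill_rate(values) > min_fill:
            kept[name] = values
        else:
            dropped.append(name)
    return kept, dropped


# Illustrative raw (pre-imputation) data, assumed for the example.
raw = {
    "tenure": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "last_call": [None, None, 3.5, None, None, None, None, None, None, None],
}

kept, dropped = drop_sparse(raw, min_fill=0.2)
print("kept:", list(kept))
print("dropped:", dropped)
```

Only the columns in `kept` would then go forward to variable selection and model building.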

omerzeybek
Obsidian | Level 7

Hi,

Thank you very much for the advice on the art side of predictive analytics.

As you said, I checked the correlation matrix of my variables, but among the variables that entered the model I couldn't find any correlation larger than 0.70.

I also reduced the number of variables entering the model using a Chi-Square / R-Square node.

But I am still estimating the same probabilities for every individual in my dataset. However, while inspecting my results again and again, I realized that the D_TARGET column, which is created in healthy score code, doesn't exist in my case.

Could that be a hint for my problem?

Thank You Very Much

stat_sas
Ammonite | Level 13

Not a problem. I would suggest exploring proc varclus and proc princomp to reduce the number of variables. I don't understand why the D_target column does not exist. Is that variable created from some existing variable? Or was it not imputed?
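One more thing that might be worth ruling out, although it is only a guess at the cause: if the test data lacks some of the inputs the score code expects, or those inputs have the same value in every row, identical probabilities for every record would not be surprising. A minimal sketch of such a check in plain Python, where `expected_inputs`, `test_rows` and `diagnose_inputs` are hypothetical names and the values are made up:

```python
def diagnose_inputs(expected_inputs, rows):
    """Return (missing, constant) input names for a list of row dicts.

    Names are compared in upper case because SAS variable names are
    case-insensitive, while Python dictionary keys are not.
    """
    expected = [name.upper() for name in expected_inputs]
    normalised = [{key.upper(): value for key, value in row.items()}
                  for row in rows]
    present = set()
    for row in normalised:
        present.update(row)
    # Inputs the model expects that never appear in the data.
    missing = [name for name in expected if name not in present]
    # Inputs that appear but take a single value across all rows.
    constant = [
        name for name in expected
        if name in present and len({row.get(name) for row in normalised}) == 1
    ]
    return missing, constant


# Illustrative values only (assumed, not taken from the thread).
expected_inputs = ["AGE", "INCOME", "TENURE"]
test_rows = [
    {"age": 30, "income": 0.0},
    {"age": 45, "income": 0.0},
    {"age": 52, "income": 0.0},
]

missing, constant = diagnose_inputs(expected_inputs, test_rows)
print("missing:", missing)
print("constant:", constant)
```

If either list is non-empty for the real test data, the way that data was prepared would be the first place to look.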

