predicted scores in SAS EG

05-11-2014 11:58 PM

Hi,

I estimated my propensity model in SAS Miner, and after getting lifts of 1.71 and 1.72 on the training and validation data, I extracted the optimized SAS score code from Miner.

Afterwards I used that score code to score my test data, which was prepared from more recent data than the training and validation sets.

However, I received the same probability for every individual in my dataset.

I suppose there is a problem with my method. Even if my model had lost its power to separate the dataset, I think it is weird that it calculates the same probability for each case.

I would be delighted if you could give some advice on this issue.

Regards..

Posted in reply to omerzeybek

05-13-2014 10:24 AM

Hi, it appears you asked a related question in a later thread. Let us know if there's anything else we can do.

Thanks,

Jonathan

Posted in reply to omerzeybek

05-13-2014 10:40 AM

I am not sure about the scoring code. Please check the following two things:

1. Whether the variables used in the model are highly correlated.

2. Whether the variables have a lot of missing values.

Posted in reply to stat_sas

05-14-2014 07:00 AM

Hi Stat,

1. I have already checked this. The highly correlated series are listed among the selected variables, but they did not enter the model. By the way, I thought that while deriving the test data I should use all the variables selected for my model. Am I right?

2. I have already filled all missing values with an array function.

Thank You Very Much

Posted in reply to omerzeybek

05-14-2014 09:44 AM

Hi,

A few more things based on your feedback.

How many variables are you using as predictors in your final model, and did you use any data reduction technique to select them?

Have you checked the correlation among the final predictors?

Are there any predictors in the final model that are not well populated and that you imputed?

Thanks,

Posted in reply to stat_sas

05-14-2014 12:04 PM

No, I haven't used any reduction technique, and I have about 98 predictors for modelling.

Yes, I have checked the correlations. There are highly correlated variables, but I filtered them out before estimating my regression.

No. I have imputed every predictor with an array function.

Thank You Very Much

Posted in reply to omerzeybek

05-14-2014 01:28 PM

Here are some suggestions

In predictive modeling, the selection of predictors is crucial. If there is multicollinearity among the predictors, you will get unstable estimates of the regression parameters. If many predictors require imputation, the imputed values can make them nearly linearly related, and as a result the model will produce predicted values that are almost the same.
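
A rough way to screen for the multicollinearity described above is to compute pairwise correlations and flag the strong ones. The sketch below is in Python rather than SAS, purely as an illustration; the column names, the toy values, and the 0.70 cutoff are assumptions, not taken from the actual model.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column; treat as 0
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.70):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical toy predictors: "income" and "spend" move together, "age" does not.
predictors = {
    "income": [10, 20, 30, 40, 50],
    "spend": [12, 19, 33, 41, 48],
    "age": [30, 25, 41, 22, 35],
}
pairs = correlated_pairs(predictors)
print(pairs)
```

Pairwise correlation only catches two-variable relationships; a predictor that is a near-linear combination of several others would need a different check.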

As a first step, check whether there are many predictors with bad data (only 10-20% filled) that you have to impute. If so, throw them away before proceeding further. Then take the rest of the predictors and reduce the dimensionality of the data by finding the predictors that really contribute to the model. Build the model with the reduced number of predictors, and it should give you better results on the test data.
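
The first step above (dropping sparsely populated predictors before any imputation) can be sketched as follows. This is a Python illustration rather than SAS; the predictor names and values are made up, `None` stands in for a SAS missing value, and the 20% cutoff is an assumption based on the "10-20% filled" range mentioned above.

```python
def fill_rate(values):
    """Share of non-missing entries in a column (None marks a missing value)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def split_by_fill_rate(columns, min_fill=0.20):
    """Split predictors into those to keep and those to drop, with their fill rates."""
    keep, drop = {}, {}
    for name, values in columns.items():
        rate = fill_rate(values)
        # Predictors filled at or below the cutoff are dropped.
        (keep if rate > min_fill else drop)[name] = rate
    return keep, drop


# Hypothetical predictors with ten observations each.
predictors = {
    "tenure": [3, 5, 1, 8, 2, 7, 4, 6, 9, 10],
    "balance": [120.0, None, 88.5, 40.0, 15.2, 300.1, 9.9, 61.0, 75.3, 210.4],
    "last_campaign": [None, None, 1, None, None, None, None, 0, None, None],
}
keep, drop = split_by_fill_rate(predictors)
print("keep:", keep)
print("drop:", drop)
```

The fill rates must be measured on the raw data, before any imputation, otherwise every predictor looks fully populated.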

Posted in reply to stat_sas

05-15-2014 10:05 PM

Hi,

Thank you very much for the advice on the art side of predictive analytics.

As you said, I checked the correlation matrix of my variables, but among the variables that entered the model I could not find any correlation larger than 0.70.

I also reduced the number of variables entering the model by using a Chi-Square / R-Square node.

But I am still estimating the same probability for every individual in my data set. However, when I was inspecting my results again and again, I realized that the D_TARGET column created by a healthy score code does not exist in my case.

Could it be a hint for my problem?

Thank You Very Much
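
One way to confirm the symptom described in this post is to count the distinct scored probabilities and list any expected output columns that are absent. The sketch below is in Python rather than SAS, purely as an illustration; the row layout and every column name other than D_TARGET (for example `P_TARGET1` and `ID`) are assumptions, not taken from the actual score code.

```python
def diagnose_scores(rows, prob_col, expected_cols):
    """Summarise a scored table given as a list of dicts (one dict per row)."""
    present = set()
    for row in rows:
        present.update(row.keys())
    # Round so that floating-point noise does not inflate the distinct count.
    distinct = {
        round(row[prob_col], 6) for row in rows if row.get(prob_col) is not None
    }
    return {
        "distinct_probabilities": len(distinct),
        "missing_columns": sorted(set(expected_cols) - present),
    }


# Hypothetical scored output reproducing the problem: one probability for everyone.
scored = [
    {"ID": 1, "P_TARGET1": 0.0312},
    {"ID": 2, "P_TARGET1": 0.0312},
    {"ID": 3, "P_TARGET1": 0.0312},
]
report = diagnose_scores(scored, "P_TARGET1", ["P_TARGET1", "D_TARGET"])
print(report)
```

A report like this only confirms the symptom; it does not say why the score code produced a constant value.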

Posted in reply to omerzeybek

05-20-2014 03:28 PM

Not a problem. I would suggest exploring proc varclus and proc princomp to reduce the number of variables. I don't understand why the D_target column does not exist. Is that variable created from some existing variable, or was it not imputed?
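
To give a feel for what a principal component analysis does with correlated predictors, here is a minimal Python sketch (not SAS, and not a substitute for proc princomp). It extracts only the first principal component of the correlation matrix using power iteration; the toy columns and all function names are hypothetical.

```python
from math import sqrt


def standardize(column):
    """Centre a column and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    """Correlation matrix of a list of equal-length numeric columns."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector.

    Assumes the start vector is not orthogonal to the dominant eigenvector,
    which holds for the toy data below but is not guaranteed in general.
    """
    vec = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    eigenvalue = sum(p * v for p, v in zip(product, vec))
    return eigenvalue, vec


# Hypothetical predictors: the first two are strongly correlated.
columns = [
    [10, 20, 30, 40, 50],
    [12, 19, 33, 41, 48],
    [30, 25, 41, 22, 35],
]
corr = correlation_matrix(columns)
eigenvalue, loadings = first_component(corr)
share = eigenvalue / len(columns)  # proportion of total variance explained
print(round(eigenvalue, 3), [round(w, 3) for w in loadings], round(share, 3))
```

When a few components explain most of the variance, many of the original predictors carry overlapping information, which is the situation a reduction step is meant to address.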