
Miner - Scoring New Data - Scored variable not found

New Contributor
Posts: 3

Hi,

 

Apologies, I'm REALLY new to SAS, so forgive me if this is a basic error. I can't figure out where I've gone wrong, and neither can anyone else in my organisation.

 

I have created a stepwise logistic regression model to predict someone's likelihood of getting into debt. The model has some grouped and imputed columns in the final selection. The data was also partitioned into training, validation, and test partitions.

 

I have successfully scored the test partition data without issue, and now I want to score a brand new set of data - it is with this I have the issue. 

 

The new data has been imported, and the roles and data levels (nominal, interval...) have all been changed to the appropriate type so that they match the data set the model was originally built on. The role of the new data is set to "Score" and the subsequent Score node appears to have run without issue (it has a green tick!). I then have a Code node attached to this to sort and rank (decile) the scored column; however, it says the scored variable P_column_nameY (my output is Y/N) doesn't exist.

 

Just to double check I didn't get the variable name wrong I then attached a code node to the score node to "print" the data...   "proc print data= tablename;  run;"  

 

The output of the "print" doesn't include any scored columns at all, so no wonder the sort and rank wasn't working. Does anyone know why, and where I've gone wrong?

 

I have attached a screen grab of my Miner project - the bit within the yellow box is the process from start to finish. Anything outside of this is either a different variation of the model that I haven't taken forward or the (working) scoring of the modelled test data.

 

miner screengrab.JPG

 

Thanks in advance,

 

GW.

 

 

 


All Replies
SAS Super FREQ
Posts: 292

Re: Miner - Scoring New Data - Scored variable not found

Your process sounds and looks correct. Can you select the Score node, click the "..." for Exported Data in the properties panel, and browse your SCORE data from there to see if it has the correct columns?

New Contributor
Posts: 3

Re: Miner - Scoring New Data - Scored variable not found

Wendy,

 

That's great, thank you - I really appreciate the reply and help. I have "browsed" the data (I learnt something new!) and there are indeed scored columns in there. However, the column names aren't in the format I was expecting, which is possibly where it is going wrong in the Code node when I come to sort. I want to rank (and decile) the predicted outcome = 'Y', which presumably is the column "predicted: DEBT_90_DAYS_ANY=Y".

 

miner data screen grab.JPG

 

My code for the sort is referencing "P_DEBT_90_DAYS_ANYY"  as I thought the format was p_mycolumnnameY (or 1/0 if using 1 or 0 instead of Y/N).  However it says the variable doesn't exist, so I'm guessing I have the naming convention wrong or I've missed a step somewhere.

 

proc sort
data= mydata.my_original_table
out= mydata.my_original_table;
by descending P_DEBT_90_DAYS_ANYY;
run;

 

The data table referenced in the sort code is the same table name as the table that was imported with the new data I wanted to score - presumably this is the correct way?

 

Thanks in advance for any further help,

 

GW

Solution
a week ago
SAS Super FREQ
Posts: 292

Re: Miner - Scoring New Data - Scored variable not found

Instead of "Browse" in the Exported Data window, choose "Properties...", click the Variables tab, and tick the "Label" checkbox. You should then see both the variable names and the labels for your new columns. What you are seeing as "Predicted: ..." is the label; the variable name is what you need to use in your code. I think the name you have above is correct, but you can double-check it here.
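The same name-versus-label check can also be done in code. The following is a minimal sketch, assuming it runs inside a SAS Code node connected after the Score node, so that Enterprise Miner resolves &em_import_score; the data set name work.score_vars is a made-up name for illustration.

```sas
/* Write the variable names and labels of the incoming score data   */
/* to a small table instead of printing the full CONTENTS report.   */
/* work.score_vars is a hypothetical name chosen for this sketch.   */
proc contents data=&em_import_score
              out=work.score_vars(keep=name label)
              noprint;
run;

/* List only variables whose names begin with P_, which is where    */
/* the posterior probability columns are expected to appear.        */
proc print data=work.score_vars noobs;
   where upcase(name) =: 'P_';
run;
```

If the expected P_ variable does not show up in this listing, the Code node is probably not receiving the scored data set.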

 

Then, in your SAS Code node, try using &em_import_score as your data set name. This is a macro variable that references the incoming score data set, so:

proc sort data=&em_import_score  out=/*use a new name here like: work.something*/;

...
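For reference, a fuller version of the sort and decile step might look like the sketch below. It is an illustration rather than the exact code used in this thread: it assumes the scored variable is named P_DEBT_90_DAYS_ANYY (confirm this on the Variables tab), and work.scored_sorted, work.scored_deciles and score_decile are made-up names.

```sas
/* Runs in a SAS Code node placed after the Score node.              */
/* Assumption: P_DEBT_90_DAYS_ANYY is the predicted probability of Y. */

/* Sort a copy of the incoming score data, highest probability first. */
proc sort data=&em_import_score out=work.scored_sorted;
   by descending P_DEBT_90_DAYS_ANYY;
run;

/* Split the scores into ten groups. GROUPS=10 assigns values 0 to 9; */
/* with DESCENDING, group 0 should contain the highest probabilities. */
proc rank data=work.scored_sorted out=work.scored_deciles
          groups=10 descending;
   var P_DEBT_90_DAYS_ANYY;
   ranks score_decile;
run;
```

Writing to a new data set with out= leaves the data set behind &em_import_score untouched, which is why the reply suggests using a new name there.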

 

New Contributor
Posts: 3

Re: Miner - Scoring New Data - Scored variable not found

Wendy,

 

Thank you... it's fixed! It wouldn't let me do it exactly how you described (not sure why - unless I wrote it incorrectly), so I created an additional Code node in between using:

 

/* Copy the scored data coming into the Code node to a permanent table. */
data mydata.mytable;
   set &em_import_score;
run;

 

I then carried on as normal using the proc sort data code, and it appears to have worked, so I'm very grateful to you.

 

Thank you.
