Scoring Code in Enterprise Miner

03-12-2013 06:21 PM

Hello,

I have to predict a binary variable (1/0) using predictive modelling. ANN, logistic regression and decision trees are used. In the scoring code there are two variables created that I don't know how to interpret. These are I_TARGET_B and U_TARGET_B. What do these variables express? As far as I have seen, these variables take the same values (the one as character and the other as numeric). The scoring code for these variables is as follows:

*** Writing the I_TARGET_B AND U_TARGET_B ;

*** *************************;

_MAXP_ = P_TARGET_B1 ;

I_TARGET_B = "1 " ;

U_TARGET_B= 1;

IF( _MAXP_ LT P_TARGET_B0 ) THEN DO;

_MAXP_ = P_TARGET_B0 ;

I_TARGET_B = "0 " ;

U_TARGET_B = 0;

END;

********************************;

*** End Scoring Code for Neural;

********************************;

What I understand is that _MAXP_ is first set equal to the probability of the primary event. Then the value of I_TARGET_B is set to 1 (character) and the value of U_TARGET_B is set to 1 (numeric). So these variables have the same values; the one is character, the other numeric. Then, if the probability of the primary event is lower than the probability of the secondary event, I_TARGET_B and U_TARGET_B take the value 0 (the one character, the other numeric).

What is the meaning of the variables (that practically are the same)?

Thanks in advance,

Andreas


03-13-2013 11:35 AM

The meaning of the two variables is:

I_ -- normalized category that the case is classified into

U_ -- unnormalized category that the case is classified into

From a practical perspective I haven't come across cases where they differ in meaning. In your case the numeric vs. character difference reflects the fact that your target is numeric (even though you have probably defined it as binary in the metadata), so U_ keeps the numeric type, as your scoring code shows. If you were to use a GOOD/BAD binary target, U_ would also be character. I_ is always character.
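To make the logic of the scoring code concrete, here is a rough sketch of the same decision rule, written in Python rather than SAS so it can be run anywhere. The function names `normalize` and `classify` are made up for illustration, and the normalization rule (left-justified, uppercased, at most 32 characters) is my understanding of what "normalized" means here rather than something shown in the posted code.

```python
def normalize(level):
    """Assumed normalization: value as text, trimmed, uppercased, max 32 chars."""
    return str(level).strip().upper()[:32]


def classify(posteriors):
    """Pick the target level with the highest posterior probability.

    posteriors: dict mapping each raw target level to its probability,
    listed in the order the scoring code checks them (primary event first).
    Returns (I_, U_): the normalized and the unnormalized predicted level.
    """
    levels = list(posteriors)
    best = levels[0]                      # like _MAXP_ = P_TARGET_B1
    for level in levels[1:]:
        # like IF (_MAXP_ LT P_TARGET_B0): switch only on a strictly larger value
        if posteriors[best] < posteriors[level]:
            best = level
    return normalize(best), best


# Numeric 1/0 target: I_ is the character "1", U_ is the number 1.
print(classify({1: 0.7, 0: 0.3}))
# Character target with mixed case: I_ is uppercased, U_ keeps the raw value.
print(classify({"Good": 0.4, "bad": 0.6}))
```

With a numeric 1/0 target the two outputs carry the same information in different types, which matches what the posted scoring code does. They only look different when the raw target values are not already uppercase, trimmed text.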

G