dlakeland
Fluorite | Level 6

Hi,

 

I have built a model in EM, and the results indicate that a decision tree is the best model to use in this analysis. I now have a file of new records that I need to score. Is there a way to have the scores from the Score results written into the SAS data source itself? The tree is deep and the file has about 300K rows.

 

Thanks,

 

D.

7 REPLIES 7
Reeza
Super User

I'm not following what you need, can you please explain it further?

dlakeland
Fluorite | Level 6

I want my results from the decision tree paired with the 290K records in my list of employees.

 

Decision Tree > Model Comparison > NEW DATA SET THAT I WANT RESULTS FOR > Score.

 

A sample of the optimized score code is below. I want the probabilities matched up with my 290K records.

 

   IF _BRANCH_ EQ    2 THEN DO;
              _NODE_  =                   41;
              _LEAF_  =                   10;
              P_Termed1  =     0.14912280701754;
              P_Termed0  =     0.85087719298245;
              Q_Termed1  =     0.14912280701754;
              Q_Termed0  =     0.85087719298245;
              V_Termed1  =     0.15816326530612;
              V_Termed0  =     0.84183673469387;
              I_Termed  = '0' ;
              U_Termed  =                    0;
              END;
            ELSE DO;
              _ARBFMT_6 = PUT( COC_Level_2 , $6.);
               %DMNORMIP( _ARBFMT_6);
              IF _ARBFMT_6 IN ('JD613H' ,'JS3658' ,'JS0093' ,'RD9444' ,
              'WB6026' ,'JC3486' ,'JG9574' ,'DM952G' ) THEN DO;
                IF  NOT MISSING(NCS_Years ) AND
                                   4.5 <= NCS_Years  THEN DO;
                  _NODE_  =                   47;
                  _LEAF_  =                    8;
                  P_Termed1  =     0.58906525573192;
                  P_Termed0  =     0.41093474426807;
                  Q_Termed1  =     0.58906525573192;
                  Q_Termed0  =     0.41093474426807;
                  V_Termed1  =     0.63983050847457;
                  V_Termed0  =     0.36016949152542;
                  I_Termed  = '1' ;
                  U_Termed  =                    1;
                  END;
                ELSE DO;
                  _NODE_  =                   46;
                  _LEAF_  =                    7;
                  P_Termed1  =     0.42704761904761;
                  P_Termed0  =     0.57295238095238;
                  Q_Termed1  =     0.42704761904761;
                  Q_Termed0  =     0.57295238095238;
                  V_Termed1  =     0.41358555460017;
                  V_Termed0  =     0.58641444539982;
                  I_Termed  = '0' ;
                  U_Termed  =                    0;
                  END;
                END;

 

 

 

Reeza
Super User

I still don't understand. When you score a dataset the rest of the data stays in the data set. So it would be something like:

 

data scored_new_data;
set file_have;

*scoring code;


run;
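
A slightly fuller sketch of the same idea, in case it helps. It assumes the optimized score code from the Score node has been saved to a file, and that the file also defines the helper macros the score code calls (such as %DMNORMIP). The library name `mylib`, the data set names, and both paths are hypothetical placeholders, not names from this thread.

```sas
/* Hypothetical names: mylib.employees is the table of records to score, */
/* and score_code.sas is the score code saved from the Score node.       */
libname mylib "/path/to/data";

data mylib.employees_scored;
    set mylib.employees;                  /* every original column is kept      */
    %include "/path/to/score_code.sas";   /* adds P_Termed1, P_Termed0, I_Termed */
run;
```

Because the score code runs inside the DATA step, each output row carries both the original columns and the new prediction columns, so no separate merge should be needed.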

 

WendyCzika
SAS Employee

I think you are on the right track. You just need to create a data source for the data that you want to score and assign the Data Role of Score to that data set. Then connect both that Input Data node and the Decision Tree node (or the Model Comparison node, assuming the decision tree was the selected model) to the Score node and run it. That will score your new data.

dlakeland
Fluorite | Level 6
Thank you. As a novice to EM, I am not familiar with the terminology that you are using.
I have attached my diagram, and yes, the Decision Tree was the best model. The input file (D160519DONOR2) is what I want to run the model against to get the results. In the input file I deleted the values in the target column (but kept the column header), so do I need to set the target in the input file? And how do I see the results for the input file?
Thanks for your help!
David


WendyCzika
SAS Employee

Your attachment didn't come through, but here is an example flow in case it helps. The exported score data from the Score node will have the predictions for your new data (the data that doesn't have a target), and you can use the Save Data node at the end of the flow to save that data somewhere outside of the project if you want. The exported train data from the Decision Tree or Model Comparison node will have the predictions on your training data (and the exported validation data will have them for the validation data, etc.) if that is what you want. This exported data can be viewed from the nodes or saved out with the Save Data node as well.

 

TreeScore.PNG
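
Once the scored data has been saved out, a quick check along these lines may help confirm that the predictions landed next to the original records. This is only a sketch: `mylib.employees_scored` and the path are hypothetical stand-ins for wherever the Save Data node wrote the table, while `I_Termed` and `P_Termed1` are the prediction columns shown in the score code earlier in the thread.

```sas
/* Hypothetical location of the table written by the Save Data node */
libname mylib "/path/to/saved/data";

/* Count of records by predicted class */
proc freq data=mylib.employees_scored;
    tables I_Termed;
run;

/* Rank records so the highest predicted probability of Termed=1 comes first */
proc sort data=mylib.employees_scored out=work.ranked;
    by descending P_Termed1;
run;

/* Look at the top 20 records with their original columns and scores */
proc print data=work.ranked(obs=20);
run;
```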

dlakeland
Fluorite | Level 6

Thank you - This is exactly what I needed!
