skumar46
Fluorite | Level 6

Hi,

I have attached the dataset and the SAS script, which I am running in SAS OnDemand. I am having difficulty running the code below and would appreciate any help.

 

I have tried the following variations, but none of them work:

 

/* Step 4: Random Forest (More Robust Alternative) */
 
PROC FOREST DATA=credit_data_transformed;
    TARGET Default_Flag / LEVEL=BINARY;
    INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
          Interest_Rate Debt_to_Income_Ratio Delinquency_History
          Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL;
    NTREES=100; /* Number of trees */
    SEED=12345;
    OOBPREDICT;
    OUTPUT OUT=rf_preds PREDICTED=RF_Pred_Prob;
RUN;
 
 
 
 
 
PROC HPFOREST DATA=credit_data_transformed OUTMODEL=rf_model;
    TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
    INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
          Interest_Rate Debt_to_Income_Ratio Delinquency_History
          Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL; /* Specify input variables */
 
    /* Specify number of trees and random seed */
    NTREES=100;
    SEED=12345;
 
    /* Optional: for scoring output */
    SCORE OUT=rf_preds;
RUN;
 
 
PROC HPFOREST DATA=credit_data_transformed OUTMODEL=rf_model;
    TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
    INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
          Interest_Rate Debt_to_Income_Ratio Delinquency_History
          Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL; /* Specify input variables */
   
    /* Number of trees, random seed, and model training */
    NTREES=100; /* Specify the number of trees */
    SEED=12345; /* Set the random seed */
    TRAINFRAC=0.7; /* Use 70% of the data for training */
   
    /* Store the predictions */
    SCORE OUT=rf_preds;
RUN;
 
PROC SETINIT;
RUN;
 
8 REPLIES
eduardo_silva
SAS Employee
Hi, can you share the log of your program here?
skumar46
Fluorite | Level 6
I tried PROC FOREST and PROC HPFOREST in three different ways. The logs for all three are below; you can also check the attached SAS script.
1)
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 PROC FOREST DATA=credit_data_transformed;
70 TARGET Default_Flag / LEVEL=BINARY;
______
22
76
ERROR 22-322: Syntax error, expecting one of the following: INTERVAL, NOMINAL.
ERROR 76-322: Syntax error, statement will be ignored.
71 INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
72 Interest_Rate Debt_to_Income_Ratio Delinquency_History
73 Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL;
74 NTREES=100; /* Number of trees */
______
180
75 SEED=12345;
____
180
76 OOBPREDICT;
__________
180
77 OUTPUT OUT=rf_preds PREDICTED=RF_Pred_Prob;
_________
22
76
ERROR 180-322: Statement is not valid or it is used out of proper order.
ERROR 22-322: Syntax error, expecting one of the following: ;, (, COPYVAR, COPYVARS, MODELID, ROLE.
ERROR 76-322: Syntax error, statement will be ignored.
78 RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FOREST used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 591.90k
OS Memory 26020.00k
Timestamp 03/27/2025 02:36:33 AM
Step Count 158 Switch Count 0
Page Faults 0
Page Reclaims 49
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0

79
80 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

2)


1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 PROC HPFOREST DATA=credit_data_transformed OUTMODEL=rf_model;
________
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, BALANCE, CATBINS, CRITERION, DATA, EXHAUSTIVE,
GRIDCLASSSIZE, GRIDCOPY, GRIDNODESIZE, IMPORTANCE, INBAGFRACTION, INBAGN, INTERVALBINS, LEAFFRACTION, LEAFSIZE,
MAXDEPTH, MAXTREES, MINCATSIZE, MINUSEINSEARCH, MISSING, NODESIZE, PRESELECT, PROLE, PRUNEFRACTION, PRUNETHRESHOLD,
SCOREPROLE, SECONDMISSING, SEED, SKIP_SEQ_ROWS, SPLITSIZE, VARS_TO_TRY.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
70 TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
WARNING: Ignoring second data set reference.
71 INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
72 Interest_Rate Debt_to_Income_Ratio Delinquency_History
73 Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL; /* Specify input variables */
74
75 /* Specify number of trees and random seed */
76 NTREES=100;
______
180
77 SEED=12345;
____
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
78
79 /* Optional: for scoring output */
80 SCORE OUT=rf_preds;
81 RUN;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.RF_PREDS may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.RF_PREDS was not replaced because this step was stopped.
NOTE: PROCEDURE HPFOREST used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 659.37k
OS Memory 26020.00k
Timestamp 03/27/2025 02:36:56 AM
Step Count 164 Switch Count 0
Page Faults 0
Page Reclaims 48
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16

3)

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 PROC HPFOREST DATA=credit_data_transformed OUTMODEL=rf_model;
________
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, BALANCE, CATBINS, CRITERION, DATA, EXHAUSTIVE,
GRIDCLASSSIZE, GRIDCOPY, GRIDNODESIZE, IMPORTANCE, INBAGFRACTION, INBAGN, INTERVALBINS, LEAFFRACTION, LEAFSIZE,
MAXDEPTH, MAXTREES, MINCATSIZE, MINUSEINSEARCH, MISSING, NODESIZE, PRESELECT, PROLE, PRUNEFRACTION, PRUNETHRESHOLD,
SCOREPROLE, SECONDMISSING, SEED, SKIP_SEQ_ROWS, SPLITSIZE, VARS_TO_TRY.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
70 TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
WARNING: Ignoring second data set reference.
71 INPUT Log_Income Log_Loan_Amount Interest_Loan_Term
72 Interest_Rate Debt_to_Income_Ratio Delinquency_History
73 Credit_Score_High Credit_Score_Medium Age_Young Age_Middle / LEVEL=INTERVAL; /* Specify input variables */
74
75 /* Number of trees, random seed, and model training */
76 NTREES=100; /* Specify the number of trees */
______
180
77 SEED=12345; /* Set the random seed */
____
180
78 TRAINFRAC=0.7; /* Use 70% of the data for training */
_________
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
79
80 /* Store the predictions */
81 SCORE OUT=rf_preds;
82 RUN;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.RF_PREDS may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.RF_PREDS was not replaced because this step was stopped.
NOTE: PROCEDURE HPFOREST used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 656.75k
OS Memory 26020.00k
Timestamp 03/27/2025 02:37:33 AM
Step Count 170 Switch Count 0
Page Faults 0
Page Reclaims 48
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8

83
84
85 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
95

JackieJ_SAS
SAS Employee

Hi, check out the documentation for the FOREST and HPFOREST procedures and make sure each of your statements uses valid syntax. For example, you have this statement with the FOREST procedure:

TARGET Default_Flag / LEVEL=BINARY;

The only valid choices for the LEVEL= option here are LEVEL=NOMINAL and LEVEL=INTERVAL. The error notes in the log can be hard to interpret, but this one is telling you that LEVEL=BINARY isn't valid syntax:

ERROR 22-322: Syntax error, expecting one of the following: INTERVAL, NOMINAL. 

 

skumar46
Fluorite | Level 6

Hi,

 

I tried both NOMINAL and INTERVAL but am still facing the same issue. Please help.

eduardo_silva
SAS Employee

Hi,

You are getting errors because your syntax is incorrect.

You may be assuming that Proc Forest and Proc HPForest use the same syntax, but they don't.

For the first procedure, the following program should work:

 

PROC FOREST DATA=casuser.credit_modeling_sampling NTREES=100 SEED=12345;
TARGET Default_Flag / LEVEL=NOMINAL;
INPUT Income Loan_Amount
Interest_Rate Debt_to_Income_Ratio Delinquency_History
/ LEVEL=INTERVAL;

OUTPUT OUT=CASUSER.rf_preds;
RUN;

 

You must use LEVEL=NOMINAL in the target statement when fitting a binary target in Proc Forest.

I also removed the OOBPREDICT statement because it doesn't exist, and I don't know what your intention was by using it.

The NTREES and SEED options should be placed in the Proc Forest statement.

There is no PREDICTED option in the output statement, so I removed it, but rf_preds already contains the predicted probabilities if that is what you are looking for.
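
The casuser libref in the program above refers to a CAS library, so the program assumes an active CAS session with that libref assigned. A minimal sketch of that setup follows, assuming a SAS Viya environment where CAS is available (this thread does not confirm that for SAS OnDemand). The session name mysess is hypothetical, and the WORK table name is taken from the original post.

```sas
/* Start a CAS session; the session name mysess is hypothetical */
cas mysess;

/* Bind the libref casuser to the personal caslib CASUSER */
libname casuser cas caslib=casuser;

/* Copy the WORK table from the original post into CAS */
data casuser.credit_data_transformed;
    set work.credit_data_transformed;
run;
```

With the table loaded this way, DATA=casuser.credit_data_transformed would be the name to use in the Proc Forest statement.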

In the second procedure, you are using Proc HPForest, so the syntax is different. This program should work for the sample data you provided:

 

PROC HPFOREST DATA=casuser.credit_modeling_sampling MAXTREES=100 SEED=12345;
TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
INPUT Income Loan_Amount
Interest_Rate Debt_to_Income_Ratio Delinquency_History
/ LEVEL=INTERVAL; /* Specify input variables */

/* Optional: for scoring output */
SCORE OUT=rf_preds;
RUN;

 

You should use MAXTREES instead of NTREES in the Proc HPForest statement.

Unlike Proc Forest, Proc HPForest uses LEVEL=BINARY.

There is no OUTMODEL option in Proc HPForest. You could use the SAVE statement as a replacement, but it is not exactly the same thing.

The third procedure is very similar to the second one, but you should put the TRAINFRACTION option in the Proc HPForest statement rather than in a statement of its own:

 

PROC HPFOREST DATA=casuser.credit_modeling_sampling MAXTREES=100 SEED=12345 trainfraction=0.6;

 

I hope this explanation helps you achieve your goal.

I'd also recommend you take a look at the Forest and HPForest documentation. It contains relevant information regarding the syntax.

skumar46
Fluorite | Level 6
I tried both programs you gave below, but they still do not run. I would appreciate it if you could download my dataset and SAS script and check whether you are able to run the code.
Code (1)
RUN;

Error:

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 PROC FOREST DATA=casuser.credit_modeling_sampling NTREES=100 SEED=12345;
ERROR: Libref CASUSER is not assigned.
70 TARGET Default_Flag / LEVEL=NOMINAL;
ERROR: No data set open to look up variables.
71 INPUT Income Loan_Amount
ERROR: No data set open to look up variables.
72 Interest_Rate Debt_to_Income_Ratio Delinquency_History
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
73 / LEVEL=INTERVAL;
ERROR: No data set open to look up variables.
74 OUTPUT OUT=CASUSER.rf_preds;
ERROR: Libref CASUSER is not assigned.
75 RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FOREST used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 193.18k
OS Memory 24224.00k
Timestamp 03/28/2025 07:03:51 PM
Step Count 152 Switch Count 0
Page Faults 0
Page Reclaims 15
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8

76
77 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
87
Code (2)

PROC HPFOREST DATA=casuser.credit_modeling_sampling MAXTREES=100 SEED=12345;
TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
INPUT Income Loan_Amount
Interest_Rate Debt_to_Income_Ratio Delinquency_History
/ LEVEL=INTERVAL; /* Specify input variables */

/* Optional: for scoring output */
SCORE OUT=rf_preds;
RUN;
Error:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 PROC HPFOREST DATA=casuser.credit_modeling_sampling MAXTREES=100 SEED=12345;
ERROR: Libref CASUSER is not assigned.
70 TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
ERROR: No data set open to look up variables.
71 INPUT Income Loan_Amount
ERROR: No data set open to look up variables.
72 Interest_Rate Debt_to_Income_Ratio Delinquency_History
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
73 / LEVEL=INTERVAL; /* Specify input variables */
ERROR: No data set open to look up variables.
74
75 /* Optional: for scoring output */
76 SCORE OUT=rf_preds;
77 RUN;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.RF_PREDS may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.RF_PREDS was not replaced because this step was stopped.
NOTE: PROCEDURE HPFOREST used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 348.40k
OS Memory 24224.00k
Timestamp 03/28/2025 07:04:09 PM
Step Count 158 Switch Count 0
Page Faults 0
Page Reclaims 14
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8

78
79 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
89
eduardo_silva
SAS Employee
You are getting these new errors because you didn't change the library and data set names in the DATA= and OUT= options.

Replace the names here

DATA=casuser.credit_modeling_sampling

and here
OUT=CASUSER.rf_preds;

with your own library and data set.
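
As an illustration of that substitution, here is a sketch of the Proc HPForest program from earlier in the thread, pointed at the WORK table and input variables from the original post. It assumes that credit_data_transformed exists in WORK and that HPFOREST is licensed in your environment; it has not been run against the attached data.

```sas
/* Sketch: the suggested HPFOREST call with the original post's
   table and variable names instead of the casuser sample table */
proc hpforest data=work.credit_data_transformed maxtrees=100 seed=12345;
    target Default_Flag / level=binary;
    input Log_Income Log_Loan_Amount Interest_Loan_Term
          Interest_Rate Debt_to_Income_Ratio Delinquency_History
          Credit_Score_High Credit_Score_Medium Age_Young Age_Middle
          / level=interval;
    /* Scored observations are written to WORK.rf_preds */
    score out=work.rf_preds;
run;
```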

skumar46
Fluorite | Level 6
Hi, I am still getting errors with the following code. I think we are wasting each other's time going back and forth. I would appreciate it if you could run this on your end with my SAS script before responding and share the working version; that would be easier for me.

PROC FOREST DATA=casuser.credit_modeling_sampling NTREES=100 SEED=12345;
TARGET Default_Flag / LEVEL=NOMINAL;
INPUT Income Loan_Amount
Interest_Rate Debt_to_Income_Ratio Delinquency_History
/ LEVEL=INTERVAL;
OUTPUT OUT=CASUSER.rf_preds;
RUN;

PROC HPFOREST DATA=casuser.credit_modeling_sampling MAXTREES=100 SEED=12345;
TARGET Default_Flag / LEVEL=BINARY; /* Specify the binary target */
INPUT Income Loan_Amount
Interest_Rate Debt_to_Income_Ratio Delinquency_History
/ LEVEL=INTERVAL; /* Specify input variables */

/* Optional: for scoring output */
SCORE OUT=CASUSER.rf_preds;
RUN;

I am a new user.
