Hi,
I followed the demo and example code to run a simple Random Forest model using the Open Source Code node in Model Studio.
The code runs as a Preprocessing node but not as a Supervised Learning node. My guess is that I failed to define some variable required to render the Assessment, but I cannot figure out which one. Could you take a look? Thanks!
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

X = dm_traindf.loc[:, dm_input]
y = dm_traindf[dm_dec_target]

params = {'n_estimators': 100}
dm_model = RandomForestClassifier(**params)
dm_model.fit(X, y)

fullX = dm_inputdf.loc[:, dm_input]
dm_scoreddf = pd.DataFrame(dm_model.predict_proba(fullX), columns=['P__va_d_ped_death_bin0', 'P__va_d_ped_death_bin1'])
More information: the target is named "_va_d_ped_death_bin" and its level is interval, but it only takes the values 0 and 1.
Update: I just figured it out and was able to run the Python model after this fix:
dm_scoreddf = pd.DataFrame(dm_model.predict(fullX), columns=['P__va_d_ped_death_bin'])
What happened is that I copied the example code without checking it. Because "_va_d_ped_death_bin" is an interval variable, the script did not recognize its levels, so the two per-level probability columns from the example did not apply. Once I changed how dm_scoreddf is created, it worked.
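For anyone who wants to see the difference outside Model Studio, here is a minimal sketch of the two ways of building dm_scoreddf. It assumes synthetic stand-ins for dm_inputdf, dm_traindf, dm_input and dm_dec_target (the node normally supplies these; the feature names x1 and x2 and the data are made up for illustration). It only shows the shapes scikit-learn returns, not how Model Studio validates them.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-ins for the variables the node provides.
rng = np.random.default_rng(0)
dm_input = ["x1", "x2"]                      # assumed input names
dm_dec_target = "_va_d_ped_death_bin"        # target name from the post
dm_inputdf = pd.DataFrame(rng.normal(size=(200, 2)), columns=dm_input)
dm_inputdf[dm_dec_target] = (dm_inputdf["x1"] + dm_inputdf["x2"] > 0).astype(int)
dm_traindf = dm_inputdf.iloc[:150]           # assumed training partition

X = dm_traindf.loc[:, dm_input]
y = dm_traindf[dm_dec_target]
dm_model = RandomForestClassifier(n_estimators=100, random_state=0)
dm_model.fit(X, y)

fullX = dm_inputdf.loc[:, dm_input]

# predict_proba returns one column per class level (two columns here),
# which is what the copied example code assumed.
proba = dm_model.predict_proba(fullX)

# predict returns a single value per row, which fits the single
# P_<target> column that worked for the interval target in this post.
pred = dm_model.predict(fullX)
dm_scoreddf = pd.DataFrame(pred, columns=["P_" + dm_dec_target])

print(proba.shape, pred.shape, list(dm_scoreddf.columns))
```

Building the column name from dm_dec_target instead of typing it by hand avoids a mismatch if the target is renamed.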