Planktonisgood
Calcite | Level 5

Hi all,

I'm working on a predictive modeling pipeline in SAS Viya using Open Source Code nodes (Python).
The pipeline works well during training, but when I add a Scoring node, I get this error:

ERROR: Scoring cannot be completed because new variables have been created by one or more Open Source node(s).

I understand this is likely because new variables (such as the prediction column) are created dynamically in the Python code, but I haven't found a clear way to properly register or declare them for scoring.

I've had several long chats with ChatGPT with no luck so far, so I'm now in desperate need of help.
I'm attaching the code below for context. Any help, advice, or working example would be sincerely appreciated!

Thanks in advance 🙏


Training Code:
import pandas as pd
import numpy as np
import lightgbm as lgb

target_col = 'Stress_level'
target_months = [202308, 202309, 202409, 202410]

# dm_inputdf is the input table supplied by the Open Source Code node
record = dm_inputdf.copy()

dm_interval_input = ["DATA_YM", "blood_pressure", "sleep_duration", "work_hours", "age"]

rec_intv = record[dm_interval_input].astype(np.float32)

rec_all = rec_intv.reset_index(drop=True)

record["target_group"] = np.where(record["DATA_YM"].isin(target_months), "tr1", "tr2")
tr1 = rec_all[record["target_group"] == "tr1"]
tr2 = rec_all[record["target_group"] == "tr2"]
y_tr1 = record.loc[record["target_group"] == "tr1", target_col].astype(np.float32)
y_tr2 = record.loc[record["target_group"] == "tr2", target_col].astype(np.float32)

model_tr1 = lgb.LGBMRegressor(
    n_estimators=3000,
    learning_rate=0.05,
    max_depth=12,
    num_leaves=13,
    force_row_wise=True,
)
model_tr1.fit(tr1, y_tr1)

model_tr2 = lgb.LGBMRegressor(
    n_estimators=3000,
    learning_rate=0.05,
    max_depth=12,
    num_leaves=13,
    force_row_wise=True,
)
model_tr2.fit(tr2, y_tr2)

pred_tr1 = model_tr1.predict(tr1)
pred_tr2 = model_tr2.predict(tr2)

record.loc[record["target_group"] == "tr1", "P_Stress_level"] = np.clip(pred_tr1, 0, None)
record.loc[record["target_group"] == "tr2", "P_Stress_level"] = np.clip(pred_tr2, 0, None)
 
dm_scoreddf = record.copy()
dm_scoreddf["P_Stress_level"] = dm_scoreddf["P_Stress_level"].astype(np.float64)
dm_scoreddf = dm_scoreddf[[
    "Stress_level", "blood_pressure", "sleep_duration", "work_hours", "age",
    "P_Stress_level", "DATA_YM"
]]
dm_scoreddf["P_Stress_level"].attrs.update({
    "role": "PREDICTION",
    "level": "INTERVAL",
    "description": "LightGBM prediction"
})
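One gap worth noting between the two nodes: `model_tr1`, `model_tr2`, and `target_months` only exist in the training node's session, so the scoring code has to get them from somewhere. A minimal sketch of bundling and restoring them with pickle (the file path is a hypothetical placeholder, and a small stub class stands in for the fitted LGBMRegressor objects; real fitted models pickle the same way):

```python
import os
import pickle
import tempfile

# Stub stand-in for a fitted LGBMRegressor (illustration only).
class StubModel:
    def __init__(self, offset):
        self.offset = offset
    def predict(self, rows):
        return [self.offset for _ in rows]

# Training side: bundle everything the scoring code needs at run time.
bundle = {
    "model_tr1": StubModel(1.0),
    "model_tr2": StubModel(2.0),
    "target_months": [202308, 202309, 202409, 202410],
}
path = os.path.join(tempfile.gettempdir(), "stress_models.pkl")
with open(path, "wb") as f:
    pickle.dump(bundle, f)

# Scoring side: restore the bundle before score_method is called.
with open(path, "rb") as f:
    restored = pickle.load(f)
model_tr1 = restored["model_tr1"]
target_months = restored["target_months"]
```

The same pattern extends to any fitted imputer or encoder the scoring code relies on.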



Scoring Code:
 
import pandas as pd
import numpy as np

def score_method(blood_pressure, sleep_duration, work_hours, age, DATA_YM):
    "Output: P_Stress_level"

    record = pd.DataFrame(
        [[blood_pressure, sleep_duration, work_hours, age, DATA_YM]],
        columns=['blood_pressure', 'sleep_duration', 'work_hours', 'age', 'DATA_YM']
    )

    # dm_class_input, imputer, model_tr1, model_tr2 and target_months must
    # already exist in the scoring environment; they are not defined here
    dm_interval_input = [col for col in record.columns if col not in dm_class_input]

    rec_intv = record[dm_interval_input]

    # imputer.transform already returns a single 2-D array, ready to score
    rec = imputer.transform(rec_intv)

    rec_pred = model_tr1.predict(rec) if int(DATA_YM) in target_months else model_tr2.predict(rec)

    return float(np.clip(rec_pred[0], 0, None))
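The month-based model routing and clipping inside score_method can be sanity-checked on its own, outside the pipeline. A standalone sketch, with stub models replacing the fitted LightGBM objects (the stub values and the sample row are made up for illustration):

```python
import numpy as np

class StubModel:
    """Stand-in for a fitted LGBMRegressor; always predicts one value."""
    def __init__(self, value):
        self.value = value
    def predict(self, rows):
        return np.full(len(rows), self.value, dtype=np.float64)

model_tr1 = StubModel(3.5)   # model used for the target months
model_tr2 = StubModel(-1.0)  # model used for every other month
target_months = [202308, 202309, 202409, 202410]

def route_and_clip(rec, data_ym):
    """Mirrors the routing + clipping logic in score_method."""
    model = model_tr1 if int(data_ym) in target_months else model_tr2
    return float(np.clip(model.predict(rec)[0], 0, None))

rec = np.array([[120.0, 7.0, 40.0, 35.0]], dtype=np.float32)
print(route_and_clip(rec, 202309))  # target month -> model_tr1 -> 3.5
print(route_and_clip(rec, 202401))  # other month -> model_tr2, clipped to 0.0
```

Checking this logic in isolation helps separate plain Python bugs from the node's variable-registration error.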
