lren
Calcite | Level 5

The SAS documentation (at https://go.documentation.sas.com/?cdcId=vdmmlcdc&cdcVersion=8.3&docsetId=vdmmlref&docsetTarget=n0gn2... )
has plenty of material on building a model with Python or R, and it mentions repeatedly that you can move the open source node to the "Supervised learning" category.
It has almost no documentation on using the node for anything else. Is it not possible to, say, preprocess data with Python? If I want a Python node that simply transforms or removes certain rows of my data, how do I accomplish this?
The output variables and CSV files described for model training require that the number of input and output rows be the same.
No other avenue of output is mentioned in the documentation.

Currently, I have some test data in a test pipeline. I have a single line of code that changes all the values in the input dataframe to "1".
The output of the open source node is just the input data, unchanged. I've logged the input dataframe to the console to verify that the change occurred, but the node's output dataframe is still the original, unchanged input.

How do I transform and output data using the open source node?

1 REPLY
RadhikhaMyneni
SAS Employee

Hi Iren,

In Model Studio, every node (preprocessing or modeling) is backed by underlying score code, which is what passes information through the pipeline. In other words, score code is necessary for the data transformations in one node to be passed along to the subsequent node. This design was chosen for several reasons: it avoids creating multiple copies of the data, which can become unmanageable as the data grows, and it allows the flow to be deployed into production when the project is done.

 

Because equivalent score code is not possible when working with Python or R, the Open Source Code node cannot support preprocessing data in the way you describe. The primary goal of the Open Source Code node is to let users train open source models in Python or R and compare them with the other modeling nodes in the pipeline. This works even without score code because the burden of providing the actual predictions, in the dm_scoreddf data frame, is placed on the user; model assessment, and thus comparison, is then carried out from those predictions.
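
To make that contract concrete, here is a minimal sketch in plain pandas that can be run outside Model Studio. Only dm_scoreddf is a name taken from this thread. The stand-in dm_inputdf frame, the BAD target, the age and income columns, and the P_BAD1 / P_BAD0 prediction column names are all assumptions made up for illustration, so check the node documentation for the names your project actually uses. The "model" is deliberately trivial (it predicts the overall event rate for every row); the point is only that the scored frame has one row per input row.

```python
import pandas as pd

# Stand-in for the data frame the node would hand to Python.
# The frame name and all columns here are hypothetical examples.
dm_inputdf = pd.DataFrame({
    "age": [25, 40, 31, 58],
    "income": [30.0, 72.5, 48.0, 90.0],
    "BAD": [0, 1, 0, 1],
})

target = "BAD"  # hypothetical binary target

# Row filtering produces a local frame only; per the reply above, a change
# like this is not carried to downstream nodes because no score code exists.
preprocessed = dm_inputdf[dm_inputdf["age"] > 30]

# Deliberately trivial "model": predict the overall event rate for every row.
event_rate = dm_inputdf[target].mean()

# The user is responsible for filling dm_scoreddf with predictions,
# one row for each input row (prediction column names are assumed here).
dm_scoreddf = pd.DataFrame(
    {"P_BAD1": event_rate, "P_BAD0": 1.0 - event_rate},
    index=dm_inputdf.index,
)

# The row-count requirement mentioned in the question.
assert len(dm_scoreddf) == len(dm_inputdf)
```

In this sketch the filtered frame has fewer rows than the input, which is exactly why it could not serve as the scored output: the predictions have to line up row for row with the data the node received.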

 

Though there is no easy answer for what you want to do, you can use the SAS Code node for any custom data preprocessing (assuming none of the existing nodes in Model Studio fulfills your needs), or you can do the preprocessing in SAS Data Studio, which is built for data manipulation.

 

Hope this helps,

Radhikha
