Solved: Python as opensource code

sarafrass · Posted 01-20-2022 02:44 PM

Good afternoon everyone!

I'm building a pipeline in SAS Viya for Learners and I would like to plot (if possible) a correlation matrix by computing it in Python (I attached the pipeline).

The code I've written so far is this one:

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

bank_df = pd.read_csv(dm_inputdf + "node_data.csv")
corr_df = bank_df.corr(method = "pearson")
plt.figure(figsize=(10,8))
sns.heatmap(corr_df, annot=True)
plt.show()

Unfortunately I cannot execute it because it keeps giving me an error.

What I would like is to print a correlation matrix of the entire data file, but I can't work out which kind of path I should use.

Thank you so much,

Sara

HarrySnart · Posted 01-21-2022 05:45 AM

Hi @sarafrass , for a Model Studio pipeline you should save the plot to a png file which then gets rendered in the results. This is because the Open Code Node is running a subprocess - so the results aren't in the CAS session directly. Likewise if you want to render a table you need to dump the .csv file as part of your Python script. There are some examples in the getting started GitHub projects: https://github.com/sassoftware/sas-viya-dmml-pipelines/tree/master/open_source_code_node/simple_fore...

Have you validated that the data is coming in as well? From memory you don't need the 'node_data.csv' part if you're using the dm_inputdf variable. dm_inputdf is a convenience that automatically downloads the table into CSV and loads it into Pandas, saving you from dealing with the .csv file import directly.
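A minimal sketch of the two points above: use dm_inputdf directly as a DataFrame, and write the plot and table to files instead of calling plt.show(). Several things here are assumptions not confirmed in this thread: that the node exposes its working directory as dm_nodedir, that output files need an "rpt_" prefix to appear in the results, and the helper name save_correlation_report, which is made up for illustration. The fallback values only exist so the snippet also runs outside Model Studio. The heatmap is drawn with plain matplotlib; sns.heatmap(corr_df, annot=True, ax=ax) could be swapped in if seaborn is available.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to file; a subprocess has no display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def save_correlation_report(df, out_dir):
    """Write a Pearson correlation heatmap (PNG) and table (CSV) to out_dir."""
    # Only numeric columns can be correlated
    corr_df = df.select_dtypes(include="number").corr(method="pearson")

    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(corr_df.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(corr_df.columns)))
    ax.set_xticklabels(corr_df.columns, rotation=90)
    ax.set_yticks(range(len(corr_df.index)))
    ax.set_yticklabels(corr_df.index)
    fig.colorbar(im, ax=ax)
    fig.tight_layout()

    # Assumption: the "rpt_" prefix is what makes files show up in the results
    png_path = os.path.join(out_dir, "rpt_correlation.png")
    csv_path = os.path.join(out_dir, "rpt_correlation.csv")
    fig.savefig(png_path)
    plt.close(fig)
    corr_df.to_csv(csv_path)
    return png_path, csv_path


# Inside the node these names are expected to exist already (assumption);
# the fallbacks below are stand-ins for running the sketch elsewhere.
if "dm_inputdf" not in globals():
    rng = np.random.default_rng(0)
    dm_inputdf = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
if "dm_nodedir" not in globals():
    dm_nodedir = tempfile.mkdtemp()

png_path, csv_path = save_correlation_report(dm_inputdf, dm_nodedir)
```

Note that there is no pd.read_csv call: the input table is taken from dm_inputdf as it is.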

Thanks

Harry
