sarafrass
Fluorite | Level 6

Good afternoon everyone!

 

I'm building a pipeline in SAS Viya for Learners and I would like to plot (if possible) a correlation matrix computed in Python (I attached the pipeline).

The code I've written so far is this:

 

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

 

bank_df = pd.read_csv(dm_inputdf + "node_data.csv")

corr_df = bank_df.corr(method = "pearson")

plt.figure(figsize=(10,8))

sns.heatmap(corr_df, annot=True)

plt.show()

Unfortunately, I cannot execute it because it keeps giving me an error.

Actually, I would like to print a correlation matrix of the entire data file, but I can't work out which kind of path I should use.

 

Thank you so much,

Sara

1 ACCEPTED SOLUTION

HarrySnart
SAS Employee

Hi @sarafrass, for a Model Studio pipeline you should save the plot to a png file, which then gets rendered in the results. This is because the Open Source Code node runs your script as a subprocess, so plt.show() has nowhere to display and the results aren't in the CAS session directly. Likewise, if you want to render a table, you need to write it out as a .csv file as part of your Python script. There are some examples in the getting started GitHub projects: https://github.com/sassoftware/sas-viya-dmml-pipelines/tree/master/open_source_code_node/simple_fore...
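To make the "save it instead of showing it" idea concrete, here is a minimal sketch rather than a definitive recipe. It assumes the node exposes its working directory as a variable called dm_nodedir and that files whose names start with rpt_ are the ones picked up for the results; both are assumptions worth checking against the Model Studio documentation for your release. The helper name save_corr_report and the demo columns are made up for illustration. It draws the heatmap with plain matplotlib; the sns.heatmap call from the question could be swapped in for the imshow lines.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # a subprocess has no display, so render off-screen
import matplotlib.pyplot as plt
import pandas as pd


def save_corr_report(df, out_dir):
    """Write a correlation heatmap (png) and the matrix itself (csv) to out_dir.

    Hypothetical helper for illustration; not part of any SAS API.
    """
    # Keep numeric columns only, since Pearson correlation needs numbers.
    corr_df = df.select_dtypes(include="number").corr(method="pearson")

    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(corr_df.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(corr_df.columns)))
    ax.set_xticklabels(corr_df.columns, rotation=90)
    ax.set_yticks(range(len(corr_df.index)))
    ax.set_yticklabels(corr_df.index)
    fig.colorbar(im, ax=ax)

    # The "rpt_" prefix is an assumption about how the node finds result files.
    png_path = os.path.join(out_dir, "rpt_corr_heatmap.png")
    csv_path = os.path.join(out_dir, "rpt_corr_matrix.csv")
    fig.savefig(png_path, bbox_inches="tight")  # replaces plt.show()
    plt.close(fig)
    corr_df.to_csv(csv_path)
    return png_path, csv_path


# Made-up demo data so the sketch also runs outside Model Studio.
demo_df = pd.DataFrame({
    "age": [25, 40, 31, 58],
    "balance": [1200.0, 300.5, 870.0, 4100.0],
    "job": ["admin", "tech", "admin", "retired"],
})

# Inside the node, dm_nodedir is assumed to exist; otherwise use a temp folder.
out_dir = globals().get("dm_nodedir") or tempfile.mkdtemp()
png_path, csv_path = save_corr_report(demo_df, out_dir)
print(png_path, csv_path)
```

Inside the node you would pass your own data frame and the node directory instead of the demo values.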

 

Have you validated that the data is coming in as well? From memory, you don't need the 'node_data.csv' part if you're using the dm_inputdf variable. dm_inputdf is a convenience variable: the node automatically downloads the table to CSV and loads it into a pandas DataFrame, which saves you from dealing with the .csv file import directly.
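As a rough sketch of that check, assuming dm_inputdf really is an already loaded pandas DataFrame inside the node (I'm going from memory here too), you could use it directly and print a few facts about it before computing the correlation. The stand-in data and its column names are made up so the snippet runs outside Model Studio as well.

```python
import pandas as pd

# Stand-in for running outside Model Studio; inside the node, dm_inputdf is
# assumed to be defined already, so this branch would be skipped.
if "dm_inputdf" not in globals():
    dm_inputdf = pd.DataFrame({
        "age": [25, 40, 31, 58],
        "balance": [1200.0, 300.5, 870.0, 4100.0],
        "job": ["admin", "tech", "admin", "retired"],
    })

bank_df = dm_inputdf  # already a DataFrame, so no pd.read_csv(...) and no path

# Quick validation that the data arrived as expected.
print(bank_df.shape)
print(bank_df.dtypes)

# Restrict to numeric columns before correlating: from pandas 2.0 onwards,
# DataFrame.corr() no longer drops non-numeric columns by default.
numeric_df = bank_df.select_dtypes(include="number")
corr_df = numeric_df.corr(method="pearson")
print(corr_df)
```

If the shape or dtypes printed in the node log don't look right, that is worth fixing before worrying about the plot.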

 

Thanks

Harry
