sarafrass
Fluorite | Level 6

Good afternoon everyone!

 

I'm building a pipeline in SAS Viya for Learners and I would like to plot (if possible) a correlation matrix computed in Python (I attached the pipeline).

The code I've written so far is this one:

 

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

 

bank_df = pd.read_csv(dm_inputdf + "node_data.csv")

corr_df = bank_df.corr(method = "pearson")

plt.figure(figsize=(10,8))

sns.heatmap(corr_df, annot=True)

plt.show()

Unfortunately I cannot execute it because it keeps giving me an error.

Actually, I would like to print a correlation matrix of the entire data file, but I can't work out which kind of path I should use.

 

Thank you so much,

Sara

1 ACCEPTED SOLUTION

HarrySnart
SAS Employee

Hi @sarafrass, for a Model Studio pipeline you should save the plot to a PNG file, which then gets rendered in the results. This is because the Open Source Code node runs your script in a subprocess, so the results aren't in the CAS session directly. Likewise, if you want to render a table, you need to write it out as a .csv file as part of your Python script. There are some examples in the getting-started GitHub projects: https://github.com/sassoftware/sas-viya-dmml-pipelines/tree/master/open_source_code_node/simple_fore...

 

Have you also validated that the data is coming in? From memory, you don't need the 'node_data.csv' part if you're using the dm_inputdf variable. dm_inputdf is a convenience: the table is automatically downloaded to CSV and loaded into pandas for you, which saves you from handling the .csv import directly.
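As a rough sketch of the above (not a tested Model Studio recipe): the script below takes the data frame as-is, writes the correlation matrix to a .csv file and saves the heatmap to a PNG instead of calling plt.show(). Two things here are assumptions that are not confirmed in this thread: that the node exposes a dm_nodedir variable pointing at its working directory, and that files need an rpt_ prefix to show up in the results. The helper name save_correlation_report and the fallback sample data are made up for illustration, so the script also runs outside the node.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend: a subprocess has no display to show plots on
import matplotlib.pyplot as plt
import pandas as pd


def save_correlation_report(df, out_dir):
    """Save the Pearson correlation matrix of df's numeric columns as a CSV and a PNG heatmap."""
    corr = df.select_dtypes(include="number").corr(method="pearson")

    # The "rpt_" prefix is an assumption about how the node picks up result files.
    csv_path = os.path.join(out_dir, "rpt_correlation.csv")
    png_path = os.path.join(out_dir, "rpt_correlation.png")
    corr.to_csv(csv_path)

    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    fig.colorbar(im, ax=ax)
    fig.tight_layout()
    fig.savefig(png_path)  # save to file rather than plt.show()
    plt.close(fig)
    return corr, csv_path, png_path


# Inside the node, dm_inputdf (and, by assumption, dm_nodedir) should already exist.
# The fallbacks below only make this sketch runnable elsewhere.
df = globals().get("dm_inputdf")
if df is None:
    df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]})
out_dir = globals().get("dm_nodedir") or tempfile.mkdtemp()

corr, csv_path, png_path = save_correlation_report(df, out_dir)
```

If seaborn is available in the node's Python environment, sns.heatmap(corr, annot=True, ax=ax) can replace the imshow and tick-label lines; plain matplotlib is used here only to keep the dependencies small.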

 

Thanks

Harry

