edwardrao
Calcite | Level 5

Hi, I have been using Python to process data all the time. Now I'm trying to study SAS.

I'm struggling with the code below and can't figure out how to do the same thing in SAS.

Could anyone help me with this?

# -*- coding: utf-8 -*-
"""
Created on Wed Nov 28 19:18:27 2018

@author: Luke
"""

import pandas as pd
import numpy as np
import statsmodels.api as sm


# Raw string so the backslashes in the Windows path are not read as escapes
dataset = pd.read_csv(r"D:\Python\Data3.csv", index_col=0, encoding='gbk')
result = {'Name':'LinearRgression Result'}
dic = {}
# j is the length of the rolling estimation window
for j in range(30,7*360,30):
    list1 = []  # forecast errors of the regression
    list2 = []  # forecast errors of the window mean
    for i in range(2880-j+1):
        data = dataset.iloc[i:i+j-1,0:8]
        X = data[['t7']]  # column name must match est.params.t7 below
        y = data['t8']
        X = sm.add_constant(X)
        est = sm.OLS(y,X).fit()
        # Predict t8 for row i+j from its t7 value (column 6)
        predReturn = est.params.t7 * dataset.iloc[i+j,6] + est.params.const
        diff1 = dataset.iloc[i+j,7]-predReturn
        list1.append(diff1)
        diff2 = dataset.iloc[i+j,7]-data.iloc[0:,7].mean()
        list2.append(diff2)
    sum1 = 0
    sum2 = 0
    for k in list1:
        sum1 = sum1 + k ** 2
    for k in list2:
        sum2 = sum2 + k ** 2
    # 1 - SSE(regression) / SSE(window mean) for this window length
    dic['%d'%j]=1-sum1/sum2
    print(j/360/7*100,'%')  # progress indicator


for i in dic.keys():
    print(dic[i])

5 REPLIES
ballardw
Super User

First warning: I know nothing about Python so …

 

dataset = pd.read_csv("D:\Python\Data3.csv", index_col=0,encoding='gbk')

Looks like you want to read an external file. If the data is really clean, you could possibly use Proc Import in SAS to read it, though I have no clue what encoding='gbk' would correspond to in Proc Import. Generally I use a Data step to read external CSV files, because that lets me control variable types and similar details that a guessing procedure might get wrong.

 

result = {'Name':'LinearRgression Result'}

leads me to believe that you want to do a linear regression on the data you read. SAS has multiple procedures that do regressions, with Proc REG and Proc GLM being the basic starting points; others may suit specific needs and limitations of your data. The regression procedures let you indicate which data set to use, and the Model statement lets you indicate dependent and independent variables (and interactions): Model y = a b c ; would have y as the dependent variable and use a, b and c as the independent variables.
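For readers coming from the Python side, here is a rough sketch of what a statement like Model y = a b c ; asks for: an ordinary least-squares fit of y on an intercept plus a, b and c. It uses plain NumPy rather than Proc REG or statsmodels, the function name fit_linear_model is made up for illustration, and the data is synthetic.

```python
import numpy as np

def fit_linear_model(y, *predictors):
    """Least-squares fit of y on an intercept plus the given predictor arrays.

    Returns the coefficients as [intercept, coef_1, coef_2, ...].
    """
    # Design matrix: a column of ones for the intercept, then one column per predictor
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, noise-free data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
a, b, c = rng.normal(size=(3, 200))
y = 1.0 + 2.0 * a - 3.0 * b + 0.5 * c

coef = fit_linear_model(y, a, b, c)
print(np.round(coef, 3))
```

Because the synthetic y has no noise, the fitted coefficients should match the ones used to build it; with real data, Proc REG would additionally report standard errors, fit statistics and diagnostics that this sketch leaves out.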

 

What comes after that depends on what you actually want the code to do. I would likely start with Proc Reg, mainly because it has fewer options to get lost in. Look in the online documentation for examples with data, code and output.

Reeza
Super User

You can call python from SAS. 

 

From what I know of Python, it looks like you're doing a regression and then some calculations to get residuals and standardize them?

Is that correct? You're likely to get a better response if you explain what you're doing rather than providing the code without comment. And you should comment your code, regardless of what language you use. 
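As a reading aid, here is one interpretation of what the posted loop computes, stated as an assumption rather than the author's confirmed intent: for each window length, it fits t8 on t7 over a rolling window, forecasts a later row, and compares the squared forecast errors with those of the window mean, giving 1 - SSE(model)/SSE(mean), an out-of-sample R². The sketch below uses NumPy only, synthetic data, and a made-up function name. It simplifies the indexing to "fit on `window` rows, predict the next row", whereas the posted code fits on j-1 rows and predicts row i+j.

```python
import numpy as np

def rolling_oos_r2(x, y, window):
    """Out-of-sample R^2 of one-step-ahead forecasts from a rolling simple regression.

    For each start position, fit y = intercept + slope * x on `window` rows,
    predict the next row, and compare against forecasting with the window mean of y.
    """
    sse_model = 0.0
    sse_mean = 0.0
    for start in range(len(x) - window):
        xs = x[start:start + window]
        ys = y[start:start + window]
        slope, intercept = np.polyfit(xs, ys, 1)  # degree-1 least-squares fit
        target = y[start + window]
        pred = intercept + slope * x[start + window]
        sse_model += (target - pred) ** 2
        sse_mean += (target - ys.mean()) ** 2
    return 1.0 - sse_model / sse_mean

# Synthetic data (hypothetical): y depends linearly on x plus a little noise
rng = np.random.default_rng(0)
x = rng.normal(size=400)
y = 0.5 * x + rng.normal(scale=0.1, size=400)

r2 = rolling_oos_r2(x, y, window=30)
print(round(r2, 3))
```

A value near 1 means the regression forecasts much better than the window mean; a value at or below 0 means it does no better. In SAS, the same idea would amount to running Proc REG over successive windows and then summing squared forecast errors in a Data step.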

 

Here's a full tutorial on doing regression with SAS. 

 

https://stats.idre.ucla.edu/sas/webbooks/reg/chapter1/regressionwith-saschapter-1-simple-and-multipl...

 



novinosrin
Tourmaline | Level 20

Courtesy of @rbetancourt, I got this:

 

https://nbviewer.jupyter.org/github/RandyBetancourt/PythonForSASUsers/tree/master/

 

Let's see if he can shed some light. 

 

 

rbetancourt
Obsidian | Level 7

Hi @edwardrao & @novinosrin,

 

I am more familiar with SAS for data management and know little about statistical methods.  @Reeza's link to the UCLA site for regression analysis techniques is a good place to start.  I'd also recommend you take a look at SASpy, which exposes Python APIs to Foundation SAS.  It allows you to run a Python session in a Jupyter notebook and connect a Foundation SAS sub-process to it.  

 

The two features you may want to focus on are the reg() method and teach_me_SAS.  The reg() method, as the name implies, allows you to call PROC REG through a Python method call.  See cell 40 in this notebook.  It is specific to time-series analysis; however, it gives you an idea of how to call the reg() method.  The teach_me_SAS method is also useful.  It returns the SAS program syntax it would otherwise execute, so you can learn by example.
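A minimal sketch of how those two features might be combined, assuming SASPy is installed and a SAS connection is already configured. The function name, the table name data3 and the t8 = t7 model (taken from the column names in the original post) are illustrative assumptions, and nothing here runs until the function is called against a live SAS session.

```python
def show_proc_reg_via_saspy(df, teach_only=True):
    """Upload a pandas DataFrame to SAS and run (or just display) PROC REG for t8 = t7.

    Hypothetical helper: needs SASPy and a working SAS connection to actually run.
    """
    import saspy  # imported lazily so defining the function does not require SASPy

    sas = saspy.SASsession()                 # uses the connection from your SASPy config
    sas_data = sas.df2sd(df, table="data3")  # copy the DataFrame into a SAS table (name is made up)

    sas.teach_me_SAS(teach_only)             # True: show the generated SAS code instead of running it
    stat = sas.sasstat()                     # entry point for the SAS/STAT procedures
    result = stat.reg(data=sas_data, model="t8 = t7")
    sas.teach_me_SAS(False)                  # switch back to normal execution
    return result
```

With teach_only=True the call is meant to display the PROC REG code SASPy generates, which is a convenient way to see the SAS syntax for a model you already know how to express in Python.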

 

I'm attaching a draft of chapter 9, on SASpy, from my upcoming book, Python for SAS Users.  It illustrates basic capabilities and provides an example of how to install and configure SASpy. 

novinosrin
Tourmaline | Level 20

Thank you, Sir @rbetancourt, for chiming in. I much appreciate your time. Best regards!
