I'm using Pandas read_sas method to read a SAS data set into Python. The problem is that a SAS data set stores datetime values as seconds since 1/1/1960 (I think that's right). I want to use Pandas' datetime module, but it expects a datetime format, not an integer. Is there any way around this?
Here's my code:
from tkinter.filedialog import askopenfilename
import numpy as np
import pandas as pd
import datetime as dt
infileName = askopenfilename() main1 = pd.read_sas(infileName, index='SUBID', format='sas7bdat') main2 = main1.loc[: , ['APS', 'Age', 'Diabetes', 'hosdead', 'icudead', 'mv', 'hospid', 'iculos', 'ICU_TYPE', 'visit', 'dx', 'dx1', 'IADMDTTM', 'IDISDTTM', 'HADMDTTM', 'HDISDTTM']] _ = main2.dropna(thresh=12, inplace=True)
The statement below is where I run into problems.
ilos = dt.timedelta(dt.datetime(main2.loc[:, 'IDISDTTM']) - dt.datetime(main2.loc[:,'IADMDTTM']))
I would use something like that, do not forget that it will be UTC time. If you need a local time, you have to do TZ conversion too.
#!/usr/bin/python
import datetime
sas_time = 1201972927
time = datetime.datetime.fromtimestamp(sas_time + 315619200).isoformat()
print(time)
I'm trying to broadcast the function across all date-time values. When I issue this command (IADMDTTM is ICU admission time):
main2['iadmdttm'] = dt.datetime.fromtimestamp(main2['IADMDTTM'] + 315619200).isoformat()
I get this error:
TypeError: cannot convert the series to <class 'int'>
So I'm not sure what to do. 😞
Are you sure that main2['IADMDTTM'] is a timestamp? Run these lines:
print(isinstance(main2['IADMDTTM'], int))
print(main2['IADMDTTM'])
Show me the output of the program above, please.
Here you go. This is an object and the object type isn't an int. Give me a few minutes to check Pandas read_sas method. I've never used it before.
Actually, you do not need to convert anything. Check this program. I've read BUY dataset from SASHELP:
import pandas as pd
buy = pd.read_sas("/home/alex/buy.sas7bdat")
/* Just print */
print(buy)
/* If you want to iterate over each row, you can do this */
for index, row in buy.iterrows():
print(row['DATE'])
print(row['AMOUNT'])
Results:
python pandas_read_sas.py
DATE AMOUNT
0 1996-01-01 -110000.0
1 1997-01-01 -1000.0
2 1998-01-01 -1000.0
3 1999-01-01 -51000.0
4 2000-01-01 -2000.0
5 2001-01-01 -2000.0
6 2002-01-01 -2000.0
7 2003-01-01 -2000.0
8 2004-01-01 -2000.0
9 2005-01-01 -2000.0
10 2006-01-01 48000.0
1996-01-01 00:00:00
-110000.0
1997-01-01 00:00:00
-1000.0
1998-01-01 00:00:00
-1000.0
1999-01-01 00:00:00
-51000.0
2000-01-01 00:00:00
-2000.0
2001-01-01 00:00:00
-2000.0
2002-01-01 00:00:00
-2000.0
2003-01-01 00:00:00
-2000.0
2004-01-01 00:00:00
-2000.0
2005-01-01 00:00:00
-2000.0
2006-01-01 00:00:00
48000.0
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.