I'm using Pandas read_sas method to read a SAS data set into Python. The problem is that a SAS data set stores datetime values as seconds since 1/1/1960 (I think that's right). I want to use Pandas' datetime module, but it expects a datetime format, not an integer. Is there any way around this?
Here's my code:
from tkinter.filedialog import askopenfilename
import numpy as np
import pandas as pd
import datetime as dt
infileName = askopenfilename() main1 = pd.read_sas(infileName, index='SUBID', format='sas7bdat') main2 = main1.loc[: , ['APS', 'Age', 'Diabetes', 'hosdead', 'icudead', 'mv', 'hospid', 'iculos', 'ICU_TYPE', 'visit', 'dx', 'dx1', 'IADMDTTM', 'IDISDTTM', 'HADMDTTM', 'HDISDTTM']] _ = main2.dropna(thresh=12, inplace=True)
The statement below is where I run into problems.
ilos = dt.timedelta(dt.datetime(main2.loc[:, 'IDISDTTM']) - dt.datetime(main2.loc[:,'IADMDTTM']))
I would use something like that, do not forget that it will be UTC time. If you need a local time, you have to do TZ conversion too.
#!/usr/bin/python
import datetime
sas_time = 1201972927
time = datetime.datetime.fromtimestamp(sas_time + 315619200).isoformat()
print(time)
I'm trying to broadcast the function across all date-time values. When I issue this command (IADMDTTM is ICU admission time):
main2['iadmdttm'] = dt.datetime.fromtimestamp(main2['IADMDTTM'] + 315619200).isoformat()
I get this error:
TypeError: cannot convert the series to <class 'int'>
So I'm not sure what to do. 😞
Are you sure that main2['IADMDTTM'] is a timestamp? Run these lines:
print(isinstance(main2['IADMDTTM'], int))
print(main2['IADMDTTM'])
Show me the output of the program above, please.
Here you go. This is an object and the object type isn't an int. Give me a few minutes to check Pandas read_sas method. I've never used it before.
Actually, you do not need to convert anything. Check this program. I've read BUY dataset from SASHELP:
import pandas as pd
buy = pd.read_sas("/home/alex/buy.sas7bdat")
/* Just print */
print(buy)
/* If you want to iterate over each row, you can do this */
for index, row in buy.iterrows():
print(row['DATE'])
print(row['AMOUNT'])
Results:
python pandas_read_sas.py
DATE AMOUNT
0 1996-01-01 -110000.0
1 1997-01-01 -1000.0
2 1998-01-01 -1000.0
3 1999-01-01 -51000.0
4 2000-01-01 -2000.0
5 2001-01-01 -2000.0
6 2002-01-01 -2000.0
7 2003-01-01 -2000.0
8 2004-01-01 -2000.0
9 2005-01-01 -2000.0
10 2006-01-01 48000.0
1996-01-01 00:00:00
-110000.0
1997-01-01 00:00:00
-1000.0
1998-01-01 00:00:00
-1000.0
1999-01-01 00:00:00
-51000.0
2000-01-01 00:00:00
-2000.0
2001-01-01 00:00:00
-2000.0
2002-01-01 00:00:00
-2000.0
2003-01-01 00:00:00
-2000.0
2004-01-01 00:00:00
-2000.0
2005-01-01 00:00:00
-2000.0
2006-01-01 00:00:00
48000.0
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.