for SAS local provider for OLE DB cannot resolve non standard user defined format, you mean the original values before formatting would be returned? and I would need to format it again after the sas dataset is converted to txt?
and do you mean whenever a data set has variable involved in a 'PROC FORMAT' procedure, the data set will have non standard user defined format that SAS local provider cannot resolve and will return original value?
@HeatherNewton wrote:
for SAS local provider for OLE DB cannot resolve non standard user defined format, you mean the original values before formatting would be returned? and I would need to format it again after the sas dataset is converted to txt?
and do you mean whenever a data set has variable involved in a 'PROC FORMAT' procedure, the data set will have non standard user defined format that SAS local provider cannot resolve and will return original value?
A FORMAT is instructions for how to display the values (when printed for example). If the format cannot be found then value will just be printed using the default display. A user defined format is one you created by using PROC FORMAT.
Most user defined formats are simple one-to-one recoding of the data. So when you work with the data in the new system you need to work with the coded values (the same as you would in your SAS code). If you wanted to make the pretty printouts then you would need do something to convert the stored values to pretty display values.
You can install it on a Windows server, yes. It is a Windows-specific technology. By the way, the SAS local data provider for OLE DB is the same component that makes the SAS Universal Viewer work.
As far as the file size that it can handle, I can't say. SAS can handle very large data, but that's because the SAS compute server is robust. The local provider won't have that benefit. You'll just have to try it.
Thanks for letting us know of the error you are having. I now understand the reason of not preferring Python.
The error "Length of values does not match length of index" is a known one and typically occurs when "The error comes up when you are trying to assign a list of numpy array of different length to a data frame ".
(Google the error --> lots of results).
Out of curiosity I performed a test -reading a sample SAS dataset with missing values. However I have not been able to reproduce this error . I am sharing my result.
(Please proceed with what ever approach you deem fit. I am just sharing my results).
My dataset and python code are given below.
(I created the sample dataset using SAS On Demand for Academics and the Python version I am using is 3.9.2 on Debian 11.4.and Jupyter IDE Version 6.4.12.
Please note that the column noval is totally blank i.e. all values are missing, There are missing values in the last few rows for other columns.)
Code for dataset
-----------------------------------
libname test '/home/my_uid/test';
data test.for_python;
input sepal_length sepal_width petal_length petal_width noval;
infile datalines truncover;
datalines;
5.1 4.6 3.3 3.6
4.9 5.0 4.6 5.6
4.7 5.4 4.7 5.4
4.6 4.6 5.6 4.6
5.0 5.0 6.7 4.4
5.4 4.4 5.0 5.0
4.6 4.9 4.8 4.9
5.0 5.1
4.4 5.2
4.9 5.3
5.1 4.6 . 3.6
7.1 . . 6.6
. 5.2 . 6.6
;
run;
-----------------------
Python Code
----------------------
import pandas as pd ;
import os;
path="/home/my_uid/Downloads";
#python dataset
file="for_python.sas7bdat";
#excel file for output
exl="for_python.xlsx";
##the source dataset
ds=os.path.join(path, file);
#output excel file
exlf=os.path.join(path, exl);
df=pd.read_sas(ds,format="sas7bdat",encoding="utf-8");
# saving the excel
df.to_excel(exlf,index=False);
print('DataFrame has been written to Excel File successfully.');
Thanks. I am not too sure how it happens. Seems it happens with some dataset with missing values but not all. I hope I can figure out later.
if I use sas local data provider for OLE DB, I wont see the non standard user defined format. if I have to compare my result text file with original sas data set. how can I display this data in sas such that I dont see the non standard format since I cannot use proc print, what commands can I use to see the data? just double click on the data set to open it?
When you use read_sas, are you able to keep the format (standard or non standard user defined) in the resulting csv file?
No , the SAS format specifications have no effect.
Python reads the data that is stored in the dataset processes what it has read. There are scenarios where custom formats are not honored in SAS itself.
It is pertinent to point that SAS formats are specifications for visual representation. it impacts how the data is displayed (/ printed for example by proc print) and not how it is stored. The format specifications are honored as long as the format specifications/ format library/catalog is available.
If the same dataset is opened in another distinct SAS session and the required format specifications are not available, then SAS displays the data as it is stored.
@HeatherNewton wrote:
But if i do
Data pg1.test;
Set apple;
Format purchasedate date9.;
Run;
This format is permanently saved, right? So whether i proc print or not, i should get date9. In any data display?
Yes. You can also change the format with PROC DATASETS, but both methods require a SAS license. Which you don't seem to have.
date9. is an internal SAS format. So the format is available to all SAS sessions and will be honored.
If one exports the pg1.test dataset to say a csv file using SAS, the date will correspond to the date9. format specification. Internally however, SAS treats dates as a number.
However external tools like Python will not honor this format and date will be displayed as it is stored.
The long and short of it is that if the desire is to export data from a SAS dataset to say a csv file retaining the internal (i.e., SAS provided formats) and custom formats (the one's that the programmer has written) then there would be two requirements namely (1) the custom format catalog/library (2) SAS software.
However if any other software (say Python, R or may be something else) is used to perform the export, then one has to be content with whatever features the software offers.
I would like to add the following to make my earlier answer more clear.
A format is a type of SAS language element that applies a pattern to or executes instructions for a data value to be displayed or written as output. (https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/leforinforref/p0ivpbwvkfwguqn12eew51gtu7y1.ht... )
When a format is associated with a variable, the format attribute of the variable is updated with the name of the format. (A third party tool may or may not use this attribute).
The original value of the variable and its type (character/numeric) remains unaltered.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.