guillaume_bs
Calcite | Level 5

Hi everyone,

 

I have a Python script that needs to read data and, for reasons related to our Python pipeline, it can only read a single file.

When exporting SAS data containing formats with labels, two files are exported (sas7bdat for the data without applying the labels, sas7bcat for the catalog containing the labels).

 

I would like to export a sas7bdat file in which the labels are already applied to the data; could you please help me with the correct approach for this?

(I really couldn't find a clear answer; it seems that put(data_column, $format_name.) does it for one column, but I would like a concise way to do this in one shot for all columns at export time.)

 

Many thanks !!

8 REPLIES
Shmuel
Garnet | Level 18
Please post the code that loads the data from Python to SAS.
I assume it can be done by adapting that code.
guillaume_bs
Calcite | Level 5
Thanks Shmuel for your message.

I use the pyreadstat library (wrapping the readstat C library).
I know it can read the data and the catalog together and apply the catalog, as described here: https://github.com/Roche/pyreadstat#reading-value-labels, using the function pyreadstat.read_sas7bdat('data.sas7bdat', catalog_file='catalog.sas7bcat', ...)
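For reference, the two-file call mentioned above could be wrapped roughly as follows. This is only a sketch: the function name read_with_labels and the file paths are placeholders, and pyreadstat is imported inside the function so the snippet loads even where the library is not installed.

```python
def read_with_labels(data_path, catalog_path):
    """Read a sas7bdat file and apply the value labels stored in a sas7bcat catalog."""
    # Lazy import: pyreadstat is only needed when the function is actually called.
    import pyreadstat

    # catalog_file tells pyreadstat where to find the formats (value labels);
    # without it, the returned DataFrame holds the raw, unformatted values.
    df, meta = pyreadstat.read_sas7bdat(data_path, catalog_file=catalog_path)
    return df, meta
```

It would be called as read_with_labels('data.sas7bdat', 'catalog.sas7bcat'), which is exactly the call that needs both files.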

My problem is that I can only pass the sas7bdat file to this function, and my pipeline doesn't allow me to get any sas7bcat file.
So the formatting should be applied when the sas7bdat file is generated.

I hope my question is clearer with these details,
Thanks !!
Kurt_Bremser
Super User

If the dataset has formats applied to columns, and the formats are not available, then you are stuck with the raw (unformatted) values.

If the Python module cannot deal with such an event, you need to ask the source for the dataset to remove the formats before sending you the dataset.

If this is for non-commercial purposes, you could do that yourself with SAS On Demand.

guillaume_bs
Calcite | Level 5
Thanks Kurt !
Yes the situation you describe is exactly what I am facing.
Right now, most of the time, the datasets passed don't have any format applied, so there is no loss in the sas7bdat file and it works fine.
Sometimes, though, a format was applied, and it gets lost when only the sas7bdat file is used.

My question (it is probably a dumb SAS question) would be: when exporting the file, what should be done to apply the format? (So that the sas7bdat file contains the well-formatted data and the sas7bcat file is basically empty.)
Kurt_Bremser
Super User

Run a PROC CONTENTS to determine where custom formats are used.

Then run a DATA step where you use PUT or VVALUE functions to convert raw values to formatted values, or where you simply remove formats that don't do anything special (e.g. currency formats). Take care to create new variables with sufficient length.

guillaume_bs
Calcite | Level 5
Thanks a lot Kurt.
Yes applying the put(...) function to the data to get the format applied & stored in the data itself produces the expected result indeed.
I was wondering if there was a way to avoid doing it explicitly and just dump the data with all columns replaced with their "formatted equivalent".
If there is no way to do that I'll use the explicit put(...) to create new, formatted columns as you suggest.

Thanks for your help !
Tom
Super User

Why not just dump the data to a text file? That will preserve the formatting, since everything will be text. Then send the text file to your Python code instead of the SAS dataset.
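On the Python side, reading such a text dump could look like the sketch below. It assumes the export is a comma-separated file with a header row whose cells already hold the formatted text; the column names and sample content are made up for illustration.

```python
import csv
import io


def read_formatted_text(handle):
    """Return the rows as dicts of strings; nothing is converted, so labels stay as written."""
    return list(csv.DictReader(handle))


# Hypothetical export content, standing in for a file written on the SAS side.
sample = io.StringIO("id,sex\n1,Male\n2,Female\n")
rows = read_formatted_text(sample)
```

With a real file, the handle would come from open(path, newline='') instead of io.StringIO. Note that every value, including the numeric ones, arrives as a string.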

 

But why does it matter?  What is the format doing? 

If the format is just doing a one-to-one replacement of the raw value with some decoded text, why not just use the raw (coded) values in your Python code?

 

Or are you somehow using the format to collapse the raw data so that multiple distinct raw values are mapped into a single formatted value?
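Either case can be reproduced in Python with a plain lookup table, so the raw values in the sas7bdat file may be enough. A minimal sketch, assuming the format's code-to-label pairs are known; the AGE_GROUP_LABELS mapping and apply_labels helper are hypothetical:

```python
# Hypothetical mapping, mirroring what a SAS format definition would hold.
# Two raw codes (3 and 4) collapse into the same label.
AGE_GROUP_LABELS = {1: "Child", 2: "Adult", 3: "Senior", 4: "Senior"}


def apply_labels(values, labels):
    """Replace each raw code with its label; codes without a label are kept unchanged."""
    return [labels.get(value, value) for value in values]


decoded = apply_labels([1, 2, 3, 4, 9], AGE_GROUP_LABELS)
```

Keeping unknown codes unchanged matches how SAS prints values that a format does not cover.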

 
