BookmarkSubscribeRSS Feed
madsunn
Calcite | Level 5

Hello, 

 

I am sorry in advance for the newbie question. I am Stata user and I am trying to learn SAS for a new project. I downloaded data from a government agency that provides datafiles in SAS. I am able to import SAS data into Stata easily but it does not include the value labels for the variables. The agency also provides three SAS scripts (.sas files) with the codes for formatting, labeling, and proc formatting. I'll provide some examples below: 

 

SAS Format Assignment Statements 

FORMAT ACADV      $YES_NO_CODES.;
FORMAT ACCAR      $YES_NO_CODES.;
FORMAT ACCCEP     $YES_NO_CODES.;
FORMAT ACCHG      $YES_NO_CODES.;
FORMAT ACDRG      $DEGREE_CODES.;

SAS Label Assignment Statements

LABEL ACADV      = N_ED_REF_WK_ENROLL_REAS_ADVANCEMT;
LABEL ACCAR      = N_ED_REF_WK_ENROLL_REAS_BEGIN_CAREER;
LABEL ACCCEP     = N_ED_REF_WK_EMP_ASSISTANCE;
LABEL ACCHG      = N_ED_REF_WK_ENROLL_REAS_CHG_FLD;
LABEL ACDRG      = N_ED_REF_WK_WORK_DEGREE;

SAS PROC FORMAT statements

PROC FORMAT;

/* AGEGR DIFAGEGR */
   VALUE AGE_GROUP_CODES
      20 	= "20: Ages 24 or younger"
      25 	= "25: Ages 25-29"
      30 	= "30: Ages 30-34"
      35 	= "35: Ages 35-39"
      40 	= "40: Ages 40-44"
      45 	= "45: Ages 45-49"
      50 	= "50: Ages 50-54"
      55 	= "55: Ages 55-59"
      60 	= "60: Ages 60-64"
      65 	= "65: Ages 65-69"
      70 	= "70: Ages 70-75"
      98 	= "98: Logical Skip"
       . 	= "Missing"
   ;

/* GRLOANR GROWER UGLOANR UGOWER */
   VALUE $AMOUNT_BORROW_OWE_2D
      "01"	= "01: Did not earn a degree at this level"
      "02"	= "02: $0"
      "03"	= "03: 1 - 10000"
      "04"	= "04: 10001-20000"
      "05"	= "05: 20001-30000"
      "06"	= "06: 30001-40000"
      "07"	= "07: 40001-50000"
      "08"	= "08: 50001-60000"
      "09"	= "09: 60001-70000"
      "10"	= "10: 70001-80000"
      "11"	= "11: 80001-90000"
      "12"	= "12: 90001 or more"
      " " 	= "Missing"
   ;

Is there a way I can read these files into SAS to add the value labels to the dataset?  Then, I plan on exporting a .sas7bcat to upload to Stata. Is this a good plan? 

 

I've tried troubleshooting today but I can't seem to get anything to run successfully. 

 

Thank you for your help! 

 

-M 

 

 

3 REPLIES 3
ballardw
Super User

The key bit that you need but probably aren't familiar enough with SAS to fully understand is, from the online help,

When writing SAS data to a Stata file, the EXPORT procedure saves the value labels that are associated with the variables. The procedure uses the formats that are associated with the variables to retrieve the value entries.

That means that the variable labels and formats must be attached to the data set before export.

So the steps you would take would be:

1) make sure the FORMATS are available to  your current session. That means execute the Proc Format code.

2) make a version of the data set that applies the formats and the labels.

3) export that new set.

The approach to making the exportable data set would look like this (after running the code with the PROC Format):

Data toexport;
   set datasetyouhave;
   %include "<path>format_statmentfilename.sas";
   %include "<path>label_statementfilename.sas";
run;

Path should start at the disk, or root of a drive, and include the full path to the location you saved those files (you did didn't you?).

If your system is Unix/Linux base make sure of the case of the folders.

The %include statement is a way to bring in the code stored in an external file.

 

Then export the Toexport data set using the approach you did before.

 

Instead of a exporting the catalog, which I assume you meant for the Formats as otherwise no catalog should be involved, you can create data set by adding a CNTLOUT option to the proc format code. That would look like:

Proc format cntlout=library.formatdatasetname;

<other format code>

If you don't supply a library then the data set will be written in WORK library as usual. Then you have a dataset that you could export to STATA and use as you may have thought about with the catalog. The data set will have for each observation the name of the format, the start and end values of ranges for each format, the value to display, the format value type numeric or character and some other variables that display information about the ranges and options like "OTHER","HIGH" or "LOW" range limits and include/exclude ends of actual ranges like 1-50.

 

The START and END will be all character so may require some massaging after import into STATA if you want to use numeric values.

 

 

 

 

Tom
Super User Tom
Super User

Also make sure that you tell PROC EXPORT where to find the formats you created with the PROC FORMAT step.

 

And make sure the format names are 8 characters or less as discussed in this older question. 

https://communities.sas.com/t5/SAS-Procedures/Export-SAS-dataset-to-STATA-keeping-the-formats/td-p/6...

 

 

Tom
Super User Tom
Super User

Assuming you created the format catalog, sas dataset and PROC EXPORT code properly the most likely reason the "data labels" (the decode or description of the values in a variables that a FORMAT produces) were not attached to the variables in the STATA data file is because the names of your formats are too long for PROC EXPORT.  It can only recognize format names of 8 characters or less.

 

You can just modify both the PROC FORMAT value statements and the FORMAT statements to use shorter names for the formats.  So convert things like $YES_NO_CODES to $YES_NO since 'YES_NO' is only 6 letters long. And to be safe limit the names of character formats to 7 letters (that is count the $ when checking if it fits in 8 characters.)

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 481 views
  • 0 likes
  • 3 in conversation