If you are coming to SAS after using IBM SPSS Statistics, then you probably have a few SPSS data files that you want to continue to use with SAS. SAS provides a simple method that allows you to convert an SPSS data file (a SAV file) to a SAS data set.
Note: This article shows you how to import SPSS data files in a SAS program. If you're using SAS Enterprise Guide, there is a built-in task that makes this process easy, no programming required. Check out this blog post for details.
The ability to read an SPSS data file is a part of PROC IMPORT when you have SAS/ACCESS to PC Files installed. Good news – this feature is part of the free SAS University Edition, so you can practice this skill from that environment. I've attached an example SAV file to this article so that you can follow along. If you're using SAS University Edition, you can download this file and drop it into your shared myfolders location so that SAS can see it.
Here's the basic structure of the program:
proc import out=WORK.SURVEY
  datafile = "/folders/myfolders/my_data/survey.sav"
  dbms = SAV replace;
  /* FMTLIB is a separate statement: where to store formats built from SPSS value labels */
  fmtlib = WORK.FORMATS;
run;
The key parts of the program are:
- the DBMS=SAV option, which tells SAS that we're expecting a file in the SPSS data file format
- the FMTLIB option, which tells SAS where to build custom SAS formats if the data makes use of SPSS labels. More on that in just a bit.
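As a variation on the program above, here is a minimal sketch that writes both the data set and the generated formats to a permanent library, so they survive beyond the current SAS session. The libref MYLIB is a hypothetical name chosen for illustration, and the folder is assumed to be writable; adjust both for your environment.

```sas
/* Assumption: MYLIB is a hypothetical libref; point it at any folder SAS can write to */
libname mylib "/folders/myfolders/my_data";

proc import out=mylib.survey
  datafile = "/folders/myfolders/my_data/survey.sav"
  dbms = SAV replace;
  fmtlib = mylib.formats;   /* build the formats in a permanent catalog */
run;

/* Add MYLIB.FORMATS to the list of catalogs SAS searches for format names */
options fmtsearch=(mylib);
```

The FMTSEARCH= system option has to be set again in any later session that uses the data set, because it is not stored with the data.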
After importing the SPSS data for the first time, it's a good idea to take inventory of what you've captured. PROC DATASETS can show you what you've got.
proc datasets lib=work nolist nodetails;
  contents data=survey;  /* report the attributes of WORK.SURVEY */
quit;
When we run this on my sample SURVEY data, we see the basic data attributes, including the number of observations (which SPSS calls cases) and the variable names and types.
Next, let's preview the first 5 observations (or rows or cases) with a simple PROC PRINT step:
proc print data=work.survey (obs=5);
run;
Now, here's an interesting result from my sample SURVEY data. According to the PROC DATASETS output, the SEX variable is numeric. But the displayed values in the PROC PRINT output are text: "female" or "male". If we look more closely at the PROC DATASETS output, we can see that there is a SAS format assigned to SEX – it's called "SEXA." Where did that come from?
In SPSS data you can use a feature called value labels to map "friendly" names to coded variables. In this example, the coded values are 1 and 2, but the value labels are "female" and "male". In SAS, the natural analogy to SPSS value labels is SAS formats. When we ran the PROC IMPORT step, SAS automatically created SAS format rules for any value labels that it found. These were stored in the WORK.FORMATS catalog, because that's what we told SAS to do with the FMTLIB option.
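To make the idea concrete, here is a hand-written sketch of roughly what such a format looks like. It is not the format that PROC IMPORT generated: the name SEXDEMO and the tiny WORK.DEMO data set are made up for illustration, and the mapping of 1 to "female" and 2 to "male" is an assumption based on the order the values are listed above.

```sas
/* Illustrative only. SEXDEMO is a hypothetical name, chosen so that it
   does not overwrite the SEXA format created by PROC IMPORT.
   Assumption: 1 means "female" and 2 means "male". */
proc format lib=work;
  value sexdemo
    1 = "female"
    2 = "male";
run;

/* A two-row test data set: the stored values stay numeric,
   but the format controls how they are displayed */
data work.demo;
  input sex;
  format sex sexdemo.;
  datalines;
1
2
;
run;

proc print data=work.demo;
run;
```

The variable is still numeric in the data set; only the displayed value changes, which is the same behavior we saw with SEX in the imported data.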
We can use PROC FORMAT to dig into the rules:
/* the FMTLIB option prints the contents of the format catalog */
proc format lib=work fmtlib;
run;
As you can see, the small PROC IMPORT step performed a lot of work for us, and it made it very easy to use this data, created in another statistical package, in SAS.
With the SEXA format as a "user-defined" format, we need to keep the format definition accessible whenever we use the data set; otherwise the coded value labels will get lost. We can save the format rules in an external data set with the CNTLOUT= option.
proc format lib=work
  /* save the format rules; the output data set name here is a placeholder */
  cntlout=work.survey_formats;
run;
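For completeness, here is a sketch of the round trip: saving the rules to a permanent data set and rebuilding the formats from it in a later session with the CNTLIN= option. The libref MYLIB and the data set name SURVEY_FORMATS are hypothetical names used only for illustration.

```sas
/* Assumption: MYLIB is a hypothetical libref pointing to a writable folder */
libname mylib "/folders/myfolders/my_data";

/* Save every format in WORK.FORMATS as rows in a data set */
proc format lib=work cntlout=mylib.survey_formats;
run;

/* In a later session: rebuild the formats in WORK from the saved rows */
proc format lib=work cntlin=mylib.survey_formats;
run;
```

A data set of format rules is easier to inspect and move between systems than a format catalog, which is one reason to keep a copy in this form.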
And if you want to remove the formats from the mix, you can "flatten" the data set so that the formatted values become the actual values within the data. The data might lose some fidelity in the process (as the formatted values might be less precise than the underlying raw values), but the data set is then more portable.
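One possible way to flatten a single variable is sketched below, using the VVALUE function to capture the formatted value as text. This is an illustration rather than the only approach; the output name SURVEY_FLAT, the helper variable SEX_TEXT, and the length of 20 are assumptions.

```sas
/* Assumption: WORK.SURVEY exists and SEX carries the SEXA format.
   SURVEY_FLAT and SEX_TEXT are hypothetical names. */
data work.survey_flat;
  set work.survey;
  length sex_text $ 20;        /* assumed to be wide enough for the labels */
  sex_text = vvalue(sex);      /* the formatted value, stored as text */
  drop sex;                    /* discard the coded numeric variable */
  rename sex_text = sex;       /* reuse the original variable name */
run;
```

The resulting SEX variable is character and no longer depends on the format catalog, at the cost of losing the original numeric codes.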