SAS Procedures

Help using Base SAS procedures
rhaley1821
Obsidian | Level 7

Hi all,

I am a new SAS user trying to clean up a dataset. I want to recode some categorical variables into a composite variable of water and sanitation quality.

I have a dataset from Nicaragua in which the responses contain accented characters. It was originally an SPSS file, and when I import it into SAS all accented characters are converted to unknown characters, which prevents me from running a PROC FREQ. I only need to code 9 categories as 0/1, so the variable could simply be converted to numeric if I knew what the values mean. Can someone please advise on how to get rid of the unknown characters? The variable in question is S1P15.

The dataset is attached. My current PROC IMPORT is below:

*import spss dataset and convert;
proc import datafile = "/folders/myfolders/sasuser.v94/WFP/datasets/EMNV14-02 DATOS DE LA VIVIENDA Y EL HOGAR (1).SAV"
out= work.nicaragua
dbms=sav
replace;
run;

 

Thank you! 

japelin
Rhodochrosite | Level 12

Try this code.

It's not perfect, but I think it will let you run PROC FREQ on the categorical variables.

filename imp "/folders/myfolders/sasuser.v94/WFP/datasets/EMNV14-02 DATOS DE LA VIVIENDA Y EL HOGAR (1).SAV" encoding='utf-8';
proc import datafile = imp
  out= work.nicaragua
  dbms=sav
  replace;
run;
Tom
Super User

Your example dataset has only numeric variables, so the data values themselves should import fine.

But the formats might be generated using the original encoding instead of the encoding of your SAS session.
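Before converting anything, it can help to confirm which encoding the SAS session is actually using. A minimal sketch, assuming nothing beyond a running SAS session (the wording of the `%put` message is my own):

```sas
/* Print the session encoding to the log */
proc options option=encoding;
run;

/* The same value is available in the automatic macro variable SYSENCODING */
%put NOTE: Session encoding is &sysencoding;
```

If the log reports a UTF-8 session while the label text was stored as WLATIN1, that mismatch would explain the unknown characters.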

Here is a method to convert the format text from WLATIN1 to UTF-8.

First import the SAV file and tell it to build the format catalog.

proc import datafile = "c:\downloads\spss.sav"
  dbms=sav
  out= work.nicaragua replace
;
  fmtlib=work.nicaragua;
run;

Then convert the format catalog to a dataset and transcode the values of the LABEL variable from WLATIN1 to UTF-8. Drop the MIN/MAX/DEFAULT/LENGTH variables so that PROC FORMAT recalculates the default length from the adjusted label values.

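/* Write the format catalog out as a control dataset (CNTLOUT= is the documented option) */
proc format lib=work.nicaragua cntlout=formats; run;
data formats;
  length label $200;  /* leave room: accented characters need more bytes in UTF-8 */
  set formats ;
  label=kcvt(label,'wlatin1','utf-8');  /* transcode the label text */
  keep fmtname start end label;  /* drop MIN/MAX/DEFAULT/LENGTH so they are recalculated */
run;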
proc format lib=work.nicaragua cntlin=formats ; run;
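To check that the conversion worked, the rebuilt catalog can be listed before any analysis is run. This is only a sketch: it reuses the `work.nicaragua` catalog and the `formats` dataset from the steps above, and the `obs=20` limit is an arbitrary choice for a quick look.

```sas
/* List every format in the catalog with its value-to-label mapping */
proc format lib=work.nicaragua fmtlib;
run;

/* Quick look at the first rows of the converted control dataset */
proc print data=formats (obs=20);
  var fmtname start end label;
run;

/* Show which format is attached to each variable, e.g. S1P15 */
proc contents data=work.nicaragua;
run;
```

If the accented characters display correctly in this listing, the labels should also display correctly in PROC FREQ.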

Now let's try using the labels. If you didn't write the formats into the WORK.FORMATS catalog then make sure to add the catalog to the FMTSEARCH option.

options insert=(fmtsearch=(work.nicaragua));
proc freq data=nicaragua;
 tables S1P25 ;
run;
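Once the labels are readable, the 0/1 coding asked about in the original question could be sketched as follows. The codes `1, 2, 3`, the variable name `water_improved`, and the output dataset name are hypothetical placeholders; the real codes have to be read off the PROC FREQ output for S1P15.

```sas
data work.nicaragua_coded;
  set work.nicaragua;
  /* Hypothetical: treat codes 1-3 as "improved"; replace with the real codes */
  if missing(S1P15) then water_improved = .;
  else water_improved = (S1P15 in (1, 2, 3));
run;

/* Cross-check the new indicator against the original categories */
proc freq data=work.nicaragua_coded;
  tables S1P15*water_improved / missing norow nocol nopercent;
run;
```

The comparison `(S1P15 in (...))` evaluates to 1 or 0 in SAS, so no IF/ELSE chain is needed for each of the 9 categories.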

Results: [image.png]

 
