I am trying to recode a variable that has 4 possible categorical values: Independent, Republican, Democrat, and (blank).
I want to build a confidence interval, but first I need the categories to be numeric values instead, so I tried to recode them using:
data=gss08_Q1;
input=gss08;
if polparty = 'independent' then polparty1 = 1;
if polparty = 'rebublican' then polparty1 = 2;
if polparty = 'democrat' then polparty1= 3;
else polparty= 4;
This returned the same error multiple times. Am I using this wrong?
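You've got other syntax errors in your code, like data=gss08_Q1; but I'm not going to comment on a code snippet with no error log provided.
Your "core" syntax should work as tested below once you fix the last assignment, which sets polparty where it should set polparty1.
You also need ELSE on the intermediate IF statements. Without them, the final ELSE pairs only with the 'democrat' test, so every non-democrat row ends up as 4. Chaining with ELSE also saves SAS from testing the remaining conditions once one has matched.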
data test;
polparty='independent';
if polparty = 'independent' then polparty1 = 1;
else if polparty = 'republican' then polparty1 = 2;
else if polparty = 'democrat' then polparty1 = 3;
else polparty1 = 4;
run;
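Applied to your own data, the same logic might look like the sketch below. It assumes the input data set is named gss08 and the output should be gss08_Q1, as your snippet suggests, and it uses LOWCASE so the comparison does not depend on how the values are capitalized in your data (string comparisons in SAS are case-sensitive).

```sas
/* Sketch only: assumes WORK.gss08 exists and has a character variable polparty */
data gss08_Q1;                 /* DATA statement names the output data set */
  set gss08;                   /* SET statement reads the input data set   */
  if lowcase(polparty) = 'independent' then polparty1 = 1;
  else if lowcase(polparty) = 'republican' then polparty1 = 2;
  else if lowcase(polparty) = 'democrat' then polparty1 = 3;
  else polparty1 = 4;          /* blank or any other value */
run;
```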
Another approach for recoding a value is to use a format/informat. Especially if you've got a lot of different values for a category, this leads to code that's easier to read.
The example below uses an informat with the UPCASE option, which has the added advantage that it works on source strings whether they are in upper, mixed, or lower case.
proc format;
invalue pp_num(upcase)
'INDEPENDENT' =1
'REPUBLICAN' =2
'DEMOCRAT' =3
other =4
;
run;
data test2;
polparty='independent';
polparty1=input(polparty,pp_num.);
run;
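To use the informat on your actual data and check the result, something like the following should work. Again, this is a sketch that assumes the data set names gss08 and gss08_Q1 from your snippet and that PROC FORMAT above has already been run in the same session. The PROC FREQ step is only there so you can see which source values were mapped to which number.

```sas
/* Sketch only: assumes WORK.gss08 exists and the pp_num informat is defined */
data gss08_Q1;
  set gss08;
  polparty1 = input(polparty, pp_num.);   /* recode via the informat */
run;

/* Cross-check the original values against the recoded ones */
proc freq data=gss08_Q1;
  tables polparty*polparty1 / list missing;
run;
```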