Hello,
I'm new to SAS and I'm having a hard time creating dummy variables. I have a 4-level categorical variable (level of education) and understand that I should have 3 dummy variables, but my output codes 0 for levels 2 and 4. Should I include the 4th code, and if I do, how do I know which one is the reference level? Thank you.
DATA WORK.IMPORT1;
SET WORK.IMPORT;
IF _educag = 1 THEN educagd=1; ELSE educagd = 0;
IF _educag = 2 THEN educagd=1; ELSE educagd = 0;
IF _educag = 3 THEN educagd=1; ELSE educagd = 0;
RUN;
Why do you need those dummy vars? Having a single variable with four distinct values is almost always easier to deal with than four variables. Maybe adding some PUT statements will help you understand what happens.
data work.import1;
set work.import;
if _educag = 1 then educagd=1; else educagd = 0;
put educagd=;
if _educag = 2 then educagd=1; else educagd = 0;
put educagd=;
if _educag = 3 then educagd=1; else educagd = 0;
put educagd=;
run;
And please avoid coding in all upcase.
Your code is creating only a single dummy variable named EDUCAGD. You assign it and then reassign its value several times, so only the last IF/ELSE statement (the test for _educag = 3) determines the value that is kept.
Instead, you need to assign values to three different dummy variables:
IF _educag = 1 THEN educagd1=1; ELSE educagd1 = 0;
IF _educag = 2 THEN educagd2=1; ELSE educagd2 = 0;
IF _educag = 3 THEN educagd3=1; ELSE educagd3 = 0;
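To address the reference question: with dummies for levels 1 through 3 only, level 4 becomes the reference, because it is the level for which all three dummies are 0. A minimal sketch of the complete step follows, reusing the data set and variable names from the question; the PROC FREQ check is an addition for illustration, not part of the original code.

```sas
data work.import1;
  set work.import;
  /* One indicator per non-reference level. Level 4 gets 0 on all three, */
  /* which makes it the reference level.                                 */
  if _educag = 1 then educagd1 = 1; else educagd1 = 0;
  if _educag = 2 then educagd2 = 1; else educagd2 = 0;
  if _educag = 3 then educagd3 = 1; else educagd3 = 0;
  /* Caution: a missing _educag also yields 0, 0, 0 here, so it would    */
  /* look the same as level 4 unless missing values are handled first.   */
run;

/* Illustrative check: list every combination of level and dummies */
proc freq data=work.import1;
  tables _educag*educagd1*educagd2*educagd3 / list missing;
run;
```

Each level of _educag should appear on exactly one row of the listing, with at most one dummy equal to 1.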
Also note, most of the time it is not necessary to create these at all. The procedure that you use might be able to create the right number of dummy variables automatically. (See if the procedure you use will support a CLASS statement.)
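As a rough sketch of the CLASS approach, assuming the analysis is a logistic regression (the question does not say which procedure is used) and assuming a hypothetical binary 0/1 response variable named outcome, the reference level can be set explicitly instead of being inferred from hand-built dummies:

```sas
/* Assumptions: PROC LOGISTIC is the intended procedure, and OUTCOME is */
/* a hypothetical 0/1 response variable that is not in the original post. */
proc logistic data=work.import;
  /* PARAM=REF requests reference (dummy) coding; REF='4' names level 4 */
  /* of _educag as the reference level.                                  */
  class _educag (ref='4') / param=ref;
  model outcome(event='1') = _educag;
run;
```

The Class Level Information table in the output shows the design variables the procedure built, so the reference level can be confirmed there (it is the row with all zeros).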