02-22-2018 12:09 AM
I'm new to SAS and I'm having a hard time creating dummy variables. I have a 4-level categorical variable (level of education) and understand I should have 3 dummy variables, but the output codes 0 for levels 2 and 4. Should I include a 4th dummy variable, and if I do, how do I know which level is the reference? Thank you.
IF _educag = 1 THEN educagd=1; ELSE educagd = 0;
IF _educag = 2 THEN educagd=1; ELSE educagd = 0;
IF _educag = 3 THEN educagd=1; ELSE educagd = 0;
02-22-2018 03:31 AM
Why do you need those dummy vars? Having a single variable with four distinct values is almost always easier to deal with than four variables. Adding some put statements may help you understand what happens.
data work.import1;
  set work.import;
  if _educag = 1 then educagd=1; else educagd = 0;
  put educagd=;
  if _educag = 2 then educagd=1; else educagd = 0;
  put educagd=;
  if _educag = 3 then educagd=1; else educagd = 0;
  put educagd=;
run;
And please avoid coding in all uppercase.
02-22-2018 08:27 AM
Your code is creating only a single dummy variable named EDUCAGD. Each IF/ELSE statement overwrites the value assigned by the previous one, so only the last comparison (_educag = 3) determines the value that is kept.
Instead, you need to assign values to three different dummy variables:
IF _educag = 1 THEN educagd1=1; ELSE educagd1 = 0;
IF _educag = 2 THEN educagd2=1; ELSE educagd2 = 0;
IF _educag = 3 THEN educagd3=1; ELSE educagd3 = 0;
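On the reference question: with three dummies for a four-level variable, the level that gets no dummy of its own (level 4 here) is the reference, because it is the one where all three dummies are 0. A compact sketch of the same coding, assuming the data set names work.import and work.import1 from the earlier reply and that _educag is numeric with values 1 to 4:

```sas
data work.import1;
  set work.import;
  /* A comparison in SAS evaluates to 1 (true) or 0 (false) */
  educagd1 = (_educag = 1);
  educagd2 = (_educag = 2);
  educagd3 = (_educag = 3);
  /* Level 4 gets no dummy: it is the reference (all three are 0). */
  /* Assumption: _educag has no missing values. A missing value would
     also produce 0/0/0 and look the same as level 4. */
run;

/* Check the coding: each level should show exactly one 0/1 pattern */
proc freq data=work.import1;
  tables _educag*educagd1*educagd2*educagd3 / list missing;
run;
```

To use a different reference level, leave out the dummy for that level instead and create one for level 4.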
Also note, most of the time it is not necessary to create these at all. The procedure that you use might be able to create the right number of dummy variables automatically. (See if the procedure you use will support a CLASS statement.)
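As an illustration of the CLASS approach, here is a minimal sketch using PROC LOGISTIC. The response variable named outcome is hypothetical (the thread does not say what is being modeled), and work.import is the data set name used in the earlier reply; substitute your own procedure and variables.

```sas
/* Hypothetical: 'outcome' is a binary response variable that is not
   named in this thread; replace it with your own */
proc logistic data=work.import;
  /* PARAM=REF requests 0/1 reference coding;
     REF='4' makes level 4 the reference level */
  class _educag (ref='4') / param=ref;
  model outcome = _educag;
run;
```

With this, the procedure builds the three design columns itself and the output lists which level was treated as the reference, so no hand-made dummy variables are needed.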