Coding Race Variables

Super Contributor
Posts: 268

For each observation, I have five race variables (raceA, raceB, raceP, raceI and raceW).  If four of the variables = 'N' and one of them = 'Y', then coding the race is easy: subgrp_race will be defined by the field that has the 'Y'.  However, if two or more fields = 'Y', then I will define a new variable (raceO) to denote that the student chose two or more races.

Below is the PROC FREQ output for the table raceA*raceB*raceP*raceI*raceW / list;

I know I can write a buttload of if-then statements, but there has to be something more elegant.

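raceA  raceB  raceP  raceI  raceW  Frequency  Percent  Cum. Frequency  Cum. Percent
N      N      N      N      Y           3992    86.31            3992         86.31
N      N      N      Y      N              4     0.09            3996         86.40
N      N      N      Y      Y             21     0.45            4017         86.85
N      N      Y      N      N              1     0.02            4018         86.88
N      N      Y      N      Y              4     0.09            4022         86.96
N      Y      N      N      N            385     8.32            4407         95.29
N      Y      N      N      Y            157     3.39            4564         98.68
N      Y      N      Y      Y              3     0.06            4567         98.75
Y      N      N      N      N             36     0.78            4603         99.52
Y      N      N      N      Y             16     0.35            4619         99.87
Y      N      Y      N      N              3     0.06            4622         99.94
Y      Y      N      N      N              1     0.02            4623         99.96
Y      Y      N      N      Y              2     0.04            4625        100.00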
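
For reference, a listing like the one above would typically come from a call along these lines. This is only a sketch: the data set name `students` is a placeholder, since the post does not say what the source data set is called.

```sas
/* Sketch only: "students" is a hypothetical data set name. */
proc freq data=students;
    /* LIST prints one row per observed combination of the five flags
       instead of nested crosstabulation tables */
    tables raceA*raceB*raceP*raceI*raceW / list;
run;
```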

Accepted Solutions
Solution
07-20-2015 02:44 PM
Frequent Contributor
Posts: 130

Re: Coding Race Variables

Try the code below. I believe it will give you what you want, if I'm understanding your description correctly:

data have;
    input raceA$ raceB$ raceP$ raceI$ raceW$;
    datalines;
N N N N Y
N N N Y N
N N N Y Y
N N Y N N
N N Y N Y
N Y N N N
N Y N N Y
N Y N Y Y
Y N N N N
Y N N N Y
Y N Y N N
Y Y N N N
Y Y N N Y
;
run;

/* Collect the variable names in HAVE (this assumes HAVE holds only the race flags) */
proc contents data=have noprint
    out=have_contents (keep=NAME);
run;

%macro race;
    /* Store each variable name in &race1, &race2, ... and the count in &n */
    data _NULL_;
        set have_contents end=lastobs;
        call symputx(cats('race',_n_),NAME);
        if lastobs then call symputx('n',_n_);
    run;

    data want;
        set have;
        combine=catx(',',raceA,raceB,raceP,raceI,raceW);
        /* Two or more 'Y' flags: code as raceO */
        if countc(combine,"Y")>1 then do;
            raceO="Y";
            subgrp_race="raceO";
        end;
        else do;
            /* Otherwise use the name of the one flag that is 'Y' */
            %do i=1 %to &n;
                if &&race&i="Y" then do;
                    raceO="N";
                    subgrp_race="&&race&i";
                end;
            %end;
        end;
        drop combine;
    run;
%mend;

%race

Hope this helps!

All Replies

Super Contributor
Posts: 268

Re: Coding Race Variables

Thanks!

Respected Advisor
Posts: 3,124

Re: Coding Race Variables

You can simply do this:

data want;
    set have;
    o=count(cats(of race:),'Y')>1;
run;

o=1 when more than one 'Y' is selected; otherwise o=0.
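
The one-liner above only flags the multi-race records. If subgrp_race is needed as well, something like the following might work. It is a sketch, not tested against the real data: the array name `races` and the helper variables `n_yes` and `idx` are made up for illustration, and `have` is the sample data set from the accepted solution.

```sas
data want;
    set have;
    length subgrp_race $5;
    array races {*} raceA raceB raceP raceI raceW;

    /* Concatenate the flags (e.g. 'NYNNY') and count the Y's */
    n_yes = count(cats(of races{*}), 'Y');
    o = (n_yes > 1);

    if n_yes > 1 then subgrp_race = 'raceO';
    else if n_yes = 1 then do;
        /* WHICHC returns the position of the first argument equal to 'Y' */
        idx = whichc('Y', of races{*});
        /* VNAME turns that array element back into its variable name */
        subgrp_race = vname(races{idx});
    end;

    drop n_yes idx;
run;
```

Records with no 'Y' at all are left with a blank subgrp_race here, which seems like the safest default since the original question does not say how they should be coded.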

Super User
Posts: 10,483

Re: Coding Race Variables

Yet another way:

I read in yes/no variables as 1 and 0 with custom informats. Then RaceO = (sum(raceA, raceB, raceP, raceI, raceW) > 1);

(I also tend to name such things Race1 through Race5, so that Race1-Race5 makes a convenient variable list.)
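
A minimal sketch of what that could look like, assuming the raw values are exactly 'Y' and 'N'. The informat name `yn`, the data set name `have_num`, and the three sample rows are illustrative, not from the post.

```sas
/* Hypothetical informat: reads 'Y' as 1 and 'N' as 0, anything else as missing */
proc format;
    invalue yn 'Y' = 1
               'N' = 0
               other = .;
run;

data have_num;
    informat race1-race5 yn.;
    input race1-race5;
    /* The comparison evaluates to 1 when two or more flags are set, else 0 */
    raceO = (sum(of race1-race5) > 1);
    datalines;
N N N N Y
N Y N N Y
Y N N N N
;
run;
```

With numeric 0/1 flags, the multi-race test becomes plain arithmetic, and the numbered names let `of race1-race5` stand in for the full variable list.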
