If it is that simple, and those are the only race/demographic variables you have, a series of IF-THEN statements is preferable. My solution is a bit convoluted and could be considered overkill for this example. It also assumes that no one has more than one race flagged. If someone does, Race takes the label of the last flagged variable in array order (the rightmost 'Y' column), because each later match overwrites the earlier one.
/* Sample data: one Y/blank flag per race variable */
data have;
    infile datalines delimiter = ',' dsd;
    input RaceAAB $ RaceAMIN $ RaceASIAN $ RaceNH_PI $ RaceWHITE $;
datalines;
,,,,Y
Y,,,,
,,Y,,
,,,Y,
Y,,,,
,Y,,,
,,,,Y
;
run;

/* Map each variable-name suffix (the part after "Race") to a label */
proc format;
    value $race_col
        'AAB'   = 'African American or Black'
        'AMIN'  = 'Native American'
        'ASIAN' = 'Asian'
        'NH_PI' = 'Native Hawaiian'
        'WHITE' = 'White'
        other   = 'Not matched';
run;

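data want (drop = i);
    set have;
    /* All character variables from RaceAAB through RaceWHITE, in dataset order */
    array _r [*] $ RaceAAB -- RaceWHITE;
    do i = 1 to dim(_r);
        /* Strip the "Race" prefix from the variable name and look up its label. */
        /* A later 'Y' overwrites an earlier one, so the last flagged variable wins. */
        if _r[i] = 'Y' then Race = put(strip(tranwrd(vname(_r[i]), 'Race', '')), $race_col.);
    end;
run;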
As always, if something can be done simply, do it that way.
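For comparison, here is a rough sketch of the simple IF-THEN version, assuming the same `have` dataset and the same labels as the format above. The output dataset name `want_simple` and the explicit `length` statement are my own additions for illustration. Note one difference: with an ELSE IF chain the first flagged variable wins, whereas the array loop keeps the last one.

```sas
data want_simple;
    set have;
    /* 25 = length of the longest label, 'African American or Black' */
    length Race $25;
    if      RaceAAB   = 'Y' then Race = 'African American or Black';
    else if RaceAMIN  = 'Y' then Race = 'Native American';
    else if RaceASIAN = 'Y' then Race = 'Asian';
    else if RaceNH_PI = 'Y' then Race = 'Native Hawaiian';
    else if RaceWHITE = 'Y' then Race = 'White';
    /* Race stays blank when no flag is set */
run;
```

With only five variables this is easier to read and maintain. The array approach starts to pay off when there are many flag columns or when the labels are likely to change.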