I'm not new to SAS, but new to using arrays. I have a SAS table set up like this except for the final_race variable, which is what i want:
patient_id | race_white | race_black | race_asian | race_am_indian | race_pac_islander | race_other | race_unk | final_race |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | white |
2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | multiracial |
3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | unknown |
4 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | asian |
A 1 means a person is of that race. Some people could have 1s in multiple columns because they're multiracial. Some have all 0s meaning we don't know their race. I need to sort through all the combos.. if a person has a 1 for only one race, then I need final_race to be that race. Else, if there are multiple 1s, then final_race "Multiracial". If all 0s, then "Unknown". I tried to build something off https://communities.sas.com/t5/SAS-Procedures/Race-looping/td-p/182117
but I couldn't figure it out with the differences I need.
Thank you
Seems simple as long as the variable names actually match the values you want calculate.
data have;
input patient_id
race_white race_black race_asian race_am_indian race_pac_islander race_other race_unk
want :$20.
;
cards;
1 1 0 0 0 0 0 0 white
2 1 1 0 0 0 0 0 multiracial
3 0 0 0 0 0 0 0 unknown
4 0 0 1 0 0 0 0 asian
;
data want;
set have ;
length final_race $20;
array race race_: ;
select (sum(of race[*]));
when (.,0) final_race='unknown';
when (1) final_race=substr(vname(race[ whichn(1,of race[*]) ]),6);
otherwise final_race='multiracial';
end;
run;
/* UNTESTED CODE */
data want;
set have;
length final_race $ 12;
sum_races=sum(of race_whte--race_other);
if sum_races>1 then final_race='multiracial';
else if sum_races=0 then final_race='unknown';
else do;
if race_white=1 then race='white';
/* You type the rest, I'm lazy */
end;
drop sum_races;
run;
Alternative (UNTESTED)
data want;
set have;
length final_race $ 12;
array race race_white--race_other;
sum_races=sum(of race_white--race_other);
if sum_races>1 then final_race='multiracial';
else if sum_races=0 then final_race='unknown';
else do i=1 to dim(race);
if race(i)=1 then race=scan(vname(race(i)),2,'_');
end;
drop sum_races;
run;
Seems simple as long as the variable names actually match the values you want calculate.
data have;
input patient_id
race_white race_black race_asian race_am_indian race_pac_islander race_other race_unk
want :$20.
;
cards;
1 1 0 0 0 0 0 0 white
2 1 1 0 0 0 0 0 multiracial
3 0 0 0 0 0 0 0 unknown
4 0 0 1 0 0 0 0 asian
;
data want;
set have ;
length final_race $20;
array race race_: ;
select (sum(of race[*]));
when (.,0) final_race='unknown';
when (1) final_race=substr(vname(race[ whichn(1,of race[*]) ]),6);
otherwise final_race='multiracial';
end;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.