Hi there, I am looking for functions or ideas on how to analysis the below example dataset:
What i would like to do it using the colour columns, create a new variable which assigns the favourite colour of each individual. So for person1 a simple IF statement would be fine as they only have 1 favourite colour, I can say 'if red = 1 then fav_colour = red'. However, the issue is when a person has more than one favourite colour i want to be able to assign 2 separate variables that have the 2 different colours in it.
red | yellow | blue | green | orange | fav_colour | fav_colour2 | fav_colour3 | |
person1 | 1 | 0 | 0 | 0 | 0 | red | ||
person2 | 0 | 0 | 0 | 0 | 1 | orange | ||
person3 | 0 | 1 | 0 | 1 | 0 | yellow | green | |
person4 | 0 | 0 | 1 | 0 | 0 | blue | ||
person5 | 1 | 0 | 0 | 1 | 1 | red | green | orange |
I have been looking at using arrays and do loops but am struggling to figure out how best to use these, so any ideas on functions that could be useful would be very much appreciated. (I have SAS Enterprise Guide version 7.11)
Thanks!
UNTESTED CODE
If you want tested code, please provide data as a SAS data step and not as a screen capture
data want;
set have;
array colors(*) red yellow blue green orange;
array fav_color(*) $ 8 fav_color1-fav_color5;
count=1;
do i=1 to dim(colors);
if colors(i)=1 then do;
fav_color(count)=vname(colors(i));
count=count+1;
end;
end;
drop i count;
run;
So, all you have now are the color columns, correct?
UNTESTED CODE
If you want tested code, please provide data as a SAS data step and not as a screen capture
data want;
set have;
array colors(*) red yellow blue green orange;
array fav_color(*) $ 8 fav_color1-fav_color5;
count=1;
do i=1 to dim(colors);
if colors(i)=1 then do;
fav_color(count)=vname(colors(i));
count=count+1;
end;
end;
drop i count;
run;
Two transposes do it:
data have;
infile datalines dlm='09'x;
input person $ red yellow blue green orange;
datalines;
person1 1 0 0 0 0
person2 0 0 0 0 1
person3 0 1 0 1 0
person4 0 0 1 0 0
person5 1 0 0 1 1
;
proc transpose data=have out=long (where=(col1 = 1));
by person;
var _numeric_;
run;
data num;
set long;
by person;
if first.person
then num = 1;
else num + 1;
run;
proc transpose data=num out=wide (drop=_name_ _label_) prefix=fav_colour;
by person;
var _name_;
id num;
run;
data want;
merge
have
wide
;
by person;
run;
@Jenna25 wrote:
Hi there, I am looking for functions or ideas on how to analysis the below example dataset:
What i would like to do it using the colour columns, create a new variable which assigns the favourite colour of each individual. So for person1 a simple IF statement would be fine as they only have 1 favourite colour, I can say 'if red = 1 then fav_colour = red'. However, the issue is when a person has more than one favourite colour i want to be able to assign 2 separate variables that have the 2 different colours in it.
red yellow blue green orange fav_colour fav_colour2 fav_colour3 person1 1 0 0 0 0 red person2 0 0 0 0 1 orange person3 0 1 0 1 0 yellow green person4 0 0 1 0 0 blue person5 1 0 0 1 1 red green orange
I have been looking at using arrays and do loops but am struggling to figure out how best to use these, so any ideas on functions that could be useful would be very much appreciated. (I have SAS Enterprise Guide version 7.11)
I question the value of doing this transformation of the data. It seems to me that any analysis you could do with the new columns could be done just as easily on the columns of 0s and 1s.
Looks like transforming from bad to worse, but if you need it ... do you really want the fav_colour-variables added to the variable you already have, or do you want person and fav_colour only?
If you are happy with person and fav_colours, try this, it uses the have-dataset provided by @Kurt_Bremser
data pre_want;
set have;
length fav_colour $ 10;
array colours red--orange;
do i = 1 to dim(colours);
if colours[i] then do;
fav_colour = vname(colours[i]);
output;
end;
end;
keep person fav_colour;
run;
proc transpose data=pre_want out=want(drop=_name_) prefix=fav_colour;
by person;
var fav_colour;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.