Hello Everyone
I am wanting to create a category column based on the values from another column. Please see the table below:
PersonID Mode_of_Instruction Instruction_Category
31515 100 % Online Mixed
31515 In Person Mixed
31515 50% Online Mixed
31515 50% Online Mixed
64790 In Person In Person
64790 In Person In Person
64790 75% Online In Person
64790 50% Online In Person
83409 100% Online 100% Online
83409 100% Online 100% Online
83409 100% Online 100% Online
83409 100% Online 100% Online
The column called "Instruction_Category" is what I would like to create. Notice a few things:
- There are multiple observations per person
- If a person has more than one mode_of_instruction, and it includes "100% Online", then the instruction_category is "Mixed"
- If a person has more than one mode_of_instruction, and it does not include "100% Online", then the instruction_category is "In Person"
I hope this makes sense- please let me know if clarifications are needed.
Thanks in advance!
data have;
input (PersonID Color ) ($);* Color_Category;
cards;
31515 Green Multicolor
31515 Blue Multicolor
31515 Green Multicolor
31515 Blue Multicolor
64790 Green Green
64790 Green Green
64790 Green Green
64790 Green Green
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
;
proc sql;
create table want as
select *,ifc(count(distinct color)>1,'Multicolor',color) as Color_Category
from have
group by personid;
quit;
Great use of IFC!
proc sql;
create table want as
select *,case when max(color) ne min(color) then 'MultiColour'
else colour end as Color_Category
from have
group by personid;
quit;
Slight variation using max/min - you can use max/min on character functions in SQL.
And an overly predictable rather increasingly boring DOW for me lately
data have;
input (PersonID Color ) ($);* Color_Category;
cards;
31515 Green Multicolor
31515 Blue Multicolor
31515 Green Multicolor
31515 Blue Multicolor
64790 Green Green
64790 Green Green
64790 Green Green
64790 Green Green
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
;
data want;
do _n_=0 by 0 until(last.personid);
set have;
by personid color notsorted;
if first.color then _n_=sum(_n_,1);
end;
length Color_Category $10;
Color_Category=ifc(_n_>1,'Multicolor',color);
do until(last.personid);
set have;
by personid;
output;
end;
run;
And this traditional one, recalling my initial days where I used to read to voraciously read Guru @data_null__ 's posts and ask/beg him for help. And Guru offered this one(easy one) to cheer me and said "you will have your time soon"
data have;
input (PersonID Color ) ($);* Color_Category;
cards;
31515 Green Multicolor
31515 Blue Multicolor
31515 Green Multicolor
31515 Blue Multicolor
64790 Green Green
64790 Green Green
64790 Green Green
64790 Green Green
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
83409 Blue Blue
;
proc freq data=have noprint;
tables personid*color/out=temp(keep=personid count);
run;
proc freq data=temp noprint;
tables personid/out=temp2(keep=personid count);
run;
data want;
merge have temp2;
by personid;
length Color_Category $10;
Color_Category=ifc(count>1,'Multicolor',color);
drop count;
run;
Guru DN, dedicating to all your help as always and every time I mention is a way of showing my heartfelt gratitude. Turn on your sense of humor, you must have gone a lot more grey now. lol 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.