Hey everyone,
I have a problem with the a dataset like the following table.
id | year | modal |
111 111 111 112 112 113 113 113 | 2006 2006 2006 2006 2006 2006 2006 2006 | 1 2 123 13 12 12 2 2 |
112 112 112 112 112 113 113 113 114 114 115 115 | 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 | 2 2 123 23 23 23 3 3 3 1 1 1 |
114 114 115 115 116 116 116 116 | 2008 2008 2008 2008 2008 2008 2008 2008 | 12 1 13 1 2 2 2 3 |
114 114 114 114 | 2009 2009 2009 2009 | 3 123 1 2 |
I want to find out which modal exists for each id and year.
The following combinations are possible: 1; 2; 3; 12; 13; 23 and 123.
What the output should look like:
id | year | Modal_total |
111 112 113 112 113 114 115 114 115 116 114 | 2006 2006 2006 2007 2007 2007 2007 2008 2008 2008 2009 | 123 123 12 123 23 13 1 12 13 23 123 |
Thanks a lot in advance. 🙂
Hello @annaxenia1,
Here's another approach, assuming that your input dataset ("HAVE") is sorted by year id.
data want(drop=_: modal);
length _c $3;
do until(last.id);
set have;
by year id;
if modal in (1,12,13,123) then substr(_c,1,1)='1';
if modal in (2,12,23,123) then substr(_c,2,1)='2';
if modal in (3,13,23,123) then substr(_c,3,1)='3';
end;
Modal_total=compress(_c);
run;
This creates Modal_total as a character variable (of length 3). If you prefer a numeric variable, please replace the assignment statement by
Modal_total=input(compress(_c),3.);
For input datasets with more than three different values (1, 2, 3) the lists in the IF conditions might become unwieldy. In this case I'd rather convert the modal values to character and apply a suitable character function.
Edit: Example of a variant using a character function (FINDC), also using an array instead of a character variable _c:
data want(drop=_: modal);
array _c[3] $1;
do until(last.id);
set have;
by year id;
_m=put(modal,3.);
do _i='1', '2', '3';
if findc(_m,_i) then _c[input(_i,1.)]=_i;
end;
end;
length Modal_total $3;
Modal_total=cats(of _c[*]);
run;
@annaxenia1 Hi and welcome to the SAS Community 🙂
Can you be more specific about the logic here? How exactly is Modal_total calculated?
Is the variable modal stored as character or numeric?
data have;
input id
year
modal $
;
cards;
111 2006 1
111 2006 2
111 2006 123
112 2006 13
112 2006 12
113 2006 12
113 2006 2
113 2006 2
;
data temp;
set have;
do i=1 to length(modal);
temp=char(modal,i);output;
end;
drop i modal;
run;
proc sort data=temp nodupkey;
by id year temp;
run;
data want;
do until(last.year);
set temp;
by id year;
length want $ 80;
want=cats(want,temp);
end;
drop temp;
run;
Hello @annaxenia1,
Here's another approach, assuming that your input dataset ("HAVE") is sorted by year id.
data want(drop=_: modal);
length _c $3;
do until(last.id);
set have;
by year id;
if modal in (1,12,13,123) then substr(_c,1,1)='1';
if modal in (2,12,23,123) then substr(_c,2,1)='2';
if modal in (3,13,23,123) then substr(_c,3,1)='3';
end;
Modal_total=compress(_c);
run;
This creates Modal_total as a character variable (of length 3). If you prefer a numeric variable, please replace the assignment statement by
Modal_total=input(compress(_c),3.);
For input datasets with more than three different values (1, 2, 3) the lists in the IF conditions might become unwieldy. In this case I'd rather convert the modal values to character and apply a suitable character function.
Edit: Example of a variant using a character function (FINDC), also using an array instead of a character variable _c:
data want(drop=_: modal);
array _c[3] $1;
do until(last.id);
set have;
by year id;
_m=put(modal,3.);
do _i='1', '2', '3';
if findc(_m,_i) then _c[input(_i,1.)]=_i;
end;
end;
length Modal_total $3;
Modal_total=cats(of _c[*]);
run;
Hi @annaxenia1 My share of fun. I must admit, I kinda copied the idea from @Ksharp . Sorry Xia!
data have;
input id year modal ;
cards;
111 2006 1
111 2006 2
111 2006 123
112 2006 13
112 2006 12
113 2006 12
113 2006 2
113 2006 2
112 2007 2
112 2007 2
112 2007 123
112 2007 23
112 2007 23
113 2007 23
113 2007 3
113 2007 3
114 2007 3
114 2007 1
115 2007 1
115 2007 1
114 2008 12
114 2008 1
115 2008 13
115 2008 1
116 2008 2
116 2008 2
116 2008 2
116 2008 3
114 2009 3
114 2009 123
114 2009 1
114 2009 2
;
proc sort data=have out=_have;
by id year;
run;
data want;
do until(last.year);
set _have;
by id year;
array t(99999)$ _temporary_ ;
do _n_=1 to length(cats(modal));
temp=char(cats(modal),_n_);
if temp not in t then t(temp)=temp;
end;
end;
Modal_total=cats(of t(*));
call missing(of t(*));
drop temp modal;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.