06-21-2016 03:15 PM
I need to find the most frequent level of character variables by group, and the sum of several numeric variables by the same group. If the most frequent level is not unique, I would like the last level with in the group (the data is sorted). Here is an example of what I want to acheive:
For the sum, using group by id and SUM() within PROC SQL works well. Is there a MODE function within PROC SQL?
I am using Enterprise Guide version 5.1 (22.214.171.12419) (32-bit).
06-21-2016 03:37 PM
I'm not aware of any mode function (unfortunately).
But you can calculate the mode by using the proc univariate.
Not sure though if can calculate all your measures in one go.
06-21-2016 11:27 PM
Well it can be done in SQL but it ain't straightforward
data a; input id city &:$20. color :$8. num; datalines; 123 New York blue 4 123 LA blue 3 123 Seattle green 2 456 New York yellow 3 456 LA green 2 456 LA red 3 456 Seattle blue 5 789 Seattle yellow 2 789 New York green 3 789 Seattle yellow 5 789 Seattle yellow 2 789 Seattle green 4 ; data T; set a; order = _n_; run; proc sql; create table cityModes as select id, city from (select id, city, last from (select id, city, count(*) as n, max(order) as last from T group by id, city) group by id having n = max(n)) group by id having last=max(last); create table colorModes as select id, color from (select id, color, last from (select id, color, count(*) as n, max(order) as last from T group by id, color) group by id having n = max(n)) group by id having last=max(last); create table want as select T.id, a.city, b.color, sum(T.num) as sum from T inner join cityModes as a on T.id=a.id inner join colorModes as b on a.id=b.id group by T.id, a.city, b.color; select * from want; quit;