Finding MODE in SAS data step

KrunalPatel · Posted 06-06-2013 06:54 AM

Hi All,

Can anybody please guide me on how to find the MODE in a SAS DATA step?

With the help of PROC UNIVARIATE or PROC MEANS I can get the answer, but I want to find it in a SAS DATA step. See the example below:

There are 5 columns and 10 observations in the table. Columns: a1, a2, a3, a4, a5

I want to find the mean and mode for each of the 10 rows, based on those five columns.

Data AA;

set ZZ;

Avg = Mean(a1, a2, a3, a4, a5);

Mode = ? /* how to find mode as like mean?*/

run;

Thanks,

KP

KrunalPatel · Posted 06-06-2013 06:55 AM

MODE: i.e. the most frequently occurring value

esjackso · Posted 06-06-2013 08:16 AM

Maybe I'm misreading something, but wouldn't MEANS or UNIVARIATE give you the mode down the column and not across the columns?

I don't know of a mode function in the DATA step, and to get it (as I understand it) from UNIVARIATE or MEANS I think you would have to transpose the data.

EJ

KrunalPatel · Posted 06-06-2013 12:06 PM

Hi EJ,

Yes, you are right: PROC UNIVARIATE can give me the answer vertically, but I am looking for it horizontally.

Transposing the table will work, but practically it's not possible, as I am talking about finding the MODE for each of more than 100,000 rows.
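For reference, the transpose route mentioned above can be sketched as follows. This is only an illustration under assumptions: the sample values, the row identifier `row`, and the data set names `zz_id`, `long`, and `stats` are made up for the sketch. It relies on the MODE statistic keyword of PROC MEANS; when a row has several equally frequent values, PROC MEANS is documented to report the lowest one, but that is worth confirming on your SAS release.

```sas
/* Hypothetical sample data; the real table ZZ would have 100,000+ rows. */
data zz;
  input a1-a5;
  datalines;
1 1 1 2 3
1 2 2 3 4
1 2 3 2 4
;

/* Add a row identifier (ROW is a name chosen for this sketch). */
data zz_id;
  set zz;
  row = _n_;
run;

/* One output observation per (row, column) pair; the values land in COL1. */
proc transpose data=zz_id out=long (rename=(col1=value));
  by row;
  var a1-a5;
run;

/* Mean and mode for each original row. */
proc means data=long noprint;
  by row;
  var value;
  output out=stats (drop=_type_ _freq_) mean=avg mode=mode;
run;

/* Attach the statistics back to the wide table. */
data aa;
  merge zz_id stats;
  by row;
run;
```

With five columns, 100,000 wide rows become 500,000 long rows, so whether this is practical depends mostly on available disk space and run time rather than on any hard limit.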

ballardw · Posted 06-06-2013 11:48 AM

There is no simple one-line MODE for the DATA step. I think that is partly because tie-breaking rules would have to be decided. For your data, if all of the variables have different values, which is the mode? And what if two variables share one value and two more share a different value, such as 1 1 3 5 5?
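To make the tie-breaking question concrete, here is a minimal sketch that fixes one rule explicitly: among equally frequent values, the smallest wins. It copies each row into scratch variables, sorts them with CALL SORTN, and scans for the longest run of equal values. The sample data, the scratch names (`_s1-_s5`, `_i`, `_run`, `_best`, `_prev`), and the choice of rule are assumptions for illustration, not something the thread prescribes.

```sas
/* Hypothetical sample data; a1-a5 follow the original question. */
data zz;
  input a1-a5;
  datalines;
1 1 1 2 3
1 2 2 3 4
1 2 3 4 5
1 1 3 5 5
;

data aa (drop=_:);
  set zz;
  array a {5} a1-a5;
  array s {5} _s1-_s5;   /* scratch copy, dropped from the output */

  avg = mean(of a{*});

  /* Copy the row so the original columns keep their order, then sort. */
  do _i = 1 to dim(a);
    s{_i} = a{_i};
  end;
  call sortn(of s{*});

  /* Scan runs of equal values. Only a strictly longer run replaces the
     current mode, so ties resolve to the smallest value.
     Missing values are skipped. */
  mode  = .;
  _best = 0;
  _run  = 0;
  _prev = .;
  do _i = 1 to dim(s);
    if missing(s{_i}) then continue;
    if s{_i} = _prev then _run = _run + 1;
    else _run = 1;
    _prev = s{_i};
    if _run > _best then do;
      _best = _run;
      mode  = s{_i};
    end;
  end;
run;
```

Under this rule a row with all distinct values returns its smallest value, and 1 1 3 5 5 returns 1. Changing `_run > _best` to `_run >= _best` would make the largest tied value win instead.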

Haikuo · Posted 06-06-2013 04:12 PM

If you really (I mean really) want one, here is one. It can be tweaked to work on character variables as well. If there is a tie, it chooses randomly (I think):

data have;

input v1-v5;

cards;

1 1 1 2 3

1 2 2 3 4

1 2 3 4 5

1 2 3 2 4

;

data want (drop=rc rename=(value=mode));

declare hash h();

h.definekey('value');

h.definedata('value','count');

h.definedone();

declare hash h1(ordered:'d');

h1.definekey('count');

h1.definedata('value','count');

h1.definedone();

declare hiter hi('h');

declare hiter hi1('h1');

set have;

array v v:;

do over v;

if h.find(key:v) ne 0 then do; count=1; value=v; h.replace();end;

else do; value=v;count+1; h.replace();end;

end;

do rc=hi.first() by 0 while (rc=0);

h1.replace();

rc=hi.next();

end;

hi1.first();

run;

Haikuo

data_null__ · Posted 06-07-2013 12:36 PM

Haikuo,

Will your program have a pretty big performance issue as the number of observations grows? Would it be better to declare the hash(s) only one time and clear them at the end of the data step for the next observation?

Haikuo · Posted 06-07-2013 12:46 PM

Good point, DN. Honestly, I have no idea which one is more efficient: re-declaring the hash or calling h.clear() for each obs. I have seen both; I chose this one merely because the code is shorter.

Thanks for pointing it out, and OP, be aware of this if you use my code.

Haikuo
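A minimal sketch of the declare-once variant discussed above, for comparison. It is not Haikuo's program: it uses a single hash plus a plain scan for the highest count, and the names `i`, `best`, and `rc` are introduced only for this illustration. Whether it actually runs faster than re-declaring the objects for every observation is not established in this thread and would need to be timed on real data.

```sas
data have;
  input v1-v5;
  datalines;
1 1 1 2 3
1 2 2 3 4
1 2 3 4 5
1 2 3 2 4
;

data want (keep=v1-v5 mode);
  if _n_ = 1 then do;
    /* Instantiate the hash and its iterator once, not once per row. */
    declare hash h();
    h.definekey('value');
    h.definedata('value', 'count');
    h.definedone();
    declare hiter hi('h');
  end;

  set have;
  array v {*} v1-v5;

  /* Tally how often each value occurs in the current row. */
  do i = 1 to dim(v);
    value = v{i};
    if h.find() ne 0 then count = 0;   /* value not seen yet in this row */
    count = count + 1;
    h.replace();
  end;

  /* Walk the tallies and keep the value with the highest count. */
  mode = .;
  best = 0;
  rc = hi.first();
  do while (rc = 0);
    if count > best then do;
      best = count;
      mode = value;
    end;
    rc = hi.next();
  end;

  /* Empty the table so the next row starts from a clean slate. */
  rc = h.clear();
run;
```

Because the hash is unordered, the winner among tied values depends on the internal iteration order and should be treated as arbitrary. Missing values are tallied like any other value in this sketch.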

Astounding · Posted 06-06-2013 06:53 PM

Here's the DATA step version. To simplify things, I assume you know how many numeric variables you will want to process. If need be, macro language could count them anyway.

data want;

set have;

array nums {150} _numeric_;

array counts {150} _temporary_;

do _n_=1 to 149;

counts{_n_}=1;

do _i_=_n_+1 to 150;

if nums{_n_} = nums{_i_} then counts{_n_} + 1;

end;

if counts{_n_} > maxcount then do;

mode = nums{_n_};

maxcount = counts{_n_};

end;

end; /* closes the outer DO _n_ loop */

run;

In the case of ties, it just takes the first one.

Good luck.

Astounding · Posted 06-06-2013 08:19 PM

Two second thoughts here. First, it's unfair to call this the DATA step solution since both are DATA step solutions. How about the ARRAY solution? And second, there's no need for a second array (unless you need to locate ties for the mode). A single variable would do. So ...

data want;

set have;

array nums {150} _numeric_;

do _n_=1 to 149;

count=1;

do _i_ = _n_+1 to 150;

if nums{_n_} = nums{_i_} then count + 1;

end;

if count > maxcount then do;

mode = nums{_n_};

maxcount = count;

end;

end; /* closes the outer DO _n_ loop */

run;
