Finding MODE in SAS data step

Contributor
Posts: 21

Hi All,

Can anybody please guide me on how to find the MODE in a SAS data step?

With the help of PROC UNIVARIATE or PROC MEANS I can get the answer, but I want to find it in a SAS data step. See the example below:

There are 5 columns and 10 observations in the table. Columns: a1, a2, a3, a4, a5

I want to find the mean and mode for each of the 10 rows, based on those five columns.

data AA;
  set ZZ;
  Avg = mean(a1, a2, a3, a4, a5);
  Mode = ? /* how to find the mode the same way as the mean? */
run;

Thanks,

KP

Contributor
Posts: 21

Re: Finding MODE in SAS data step

Posted in reply to KrunalPatel

MODE: i.e. the most frequently occurring value

Super Contributor
Posts: 334

Re: Finding MODE in SAS data step

Posted in reply to KrunalPatel

Maybe I'm misreading something, but wouldn't PROC MEANS or PROC UNIVARIATE give you the mode down the column and not across the columns?

I don't know of a mode function in the data step, and to get it (as I understand it) from UNIVARIATE or MEANS I think you would have to transpose the data.

EJ

Contributor
Posts: 21

Re: Finding MODE in SAS data step

Hi EJ,

Yes, you are right: PROC UNIVARIATE can give me the answer vertically, but I am looking for it horizontally.

Transposing the table would work, but it's not practical, as I am talking about finding the MODE for each of more than 100,000 rows.

Super User
Posts: 11,343

Re: Finding MODE in SAS data step

Posted in reply to KrunalPatel

There is no simple one-line MODE for the data step, I think partly because tie-breaking rules would have to be decided. For your data, if all of the variables have different values, which one is the mode? And what if two variables share one value and two others share a different value, as in 1 1 3 5 5?
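
To make the tie-breaking question concrete, here is a minimal sketch (not tested against the OP's data) that fixes one possible rule: the smallest of the tied values wins. It assumes the numeric variables a1-a5 from the original question; the work variables _s1-_s5, _i, _run, _best and _prev are hypothetical names introduced only for this illustration. It sorts a copy of the row with CALL SORTN and then looks for the longest run of equal values.

```sas
data AA;
  set ZZ;                              /* assumes numeric variables a1-a5, as in the question */
  array a {5} a1-a5;
  array s {5} _s1-_s5;                 /* hypothetical work variables holding a sorted copy */
  do _i = 1 to dim(a);
    s{_i} = a{_i};
  end;
  call sortn(of s{*});                 /* ascending; missing values sort first */
  _best = 0;                           /* longest run found so far */
  _run  = 0;                           /* length of the current run */
  do _i = 1 to dim(s);
    if missing(s{_i}) then continue;   /* ignore missing values */
    if _run > 0 and s{_i} = _prev then _run = _run + 1;
    else _run = 1;
    _prev = s{_i};
    if _run > _best then do;           /* strict > keeps the smallest of any tied values */
      _best = _run;
      Mode  = s{_i};
    end;
  end;
  drop _i _run _best _prev _s1-_s5;
run;
```

With this rule, a row of 1 1 3 5 5 gives Mode = 1, and a row of all different values gives the smallest value. If you would rather get a missing Mode when no value repeats, starting _best at 1 instead of 0 should do it.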

Respected Advisor
Posts: 3,156

Re: Finding MODE in SAS data step

Posted in reply to KrunalPatel

If you really (I mean really) want one, here is one. It can be tweaked to work on character variables as well. If there is a tie, it chooses randomly (I think):

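data have;
  input v1-v5;
  cards;
1 1 1 2 3
1 2 2 3 4
1 2 3 4 5
1 2 3 2 4
;

data want (drop=rc rename=(value=mode));
  /* h: one entry per distinct value in the row, with its count */
  /* (both hashes are declared again for every observation) */
  declare hash h();
  h.definekey('value');
  h.definedata('value','count');
  h.definedone();
  /* h1: keyed by count in descending order, so its first entry has the highest count */
  declare hash h1(ordered:'d');
  h1.definekey('count');
  h1.definedata('value','count');
  h1.definedone();
  declare hiter hi('h');
  declare hiter hi1('h1');
  set have;
  array v v:;
  /* count the occurrences of each value in the current row */
  do over v;
    if h.find(key:v) ne 0 then do; count=1; value=v; h.replace(); end;
    else do; value=v; count+1; h.replace(); end;
  end;
  /* copy every (value, count) pair from h into h1 */
  do rc=hi.first() by 0 while (rc=0);
    h1.replace();
    rc=hi.next();
  end;
  /* load the entry with the highest count into value and count */
  hi1.first();
run;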

Haikuo

Respected Advisor
Posts: 3,799

Re: Finding MODE in SAS data step

Haikuo,

Will your program have a pretty big performance issue as the number of observations grows? Would it be better to declare the hash(es) only once and clear them at the end of each data step iteration, ready for the next observation?

Respected Advisor
Posts: 3,156

Re: Finding MODE in SAS data step

Posted in reply to data_null__

Good point, DN. Honestly, I have no idea which one is more efficient: re-declaring or calling h.clear() for each obs. I have seen both; this one was chosen merely because it gives shorter code.

Thanks for pointing it out. OP, be aware of this if you use my code.

Haikuo
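
For comparison, a rough sketch of the declare-once variant DN describes might look like the following. It is untested and makes no claim about which approach is faster. It uses a single hash that is declared on the first iteration and cleared for each observation, and it finds the highest count by walking the iterator instead of using a second ordered hash. The names i and maxcount are introduced here for illustration, and skipping missing values is an added assumption that the original code does not make.

```sas
data want (drop=rc i value count maxcount);
  if _n_ = 1 then do;                  /* declare the hash and iterator only once */
    declare hash h();
    h.definekey('value');
    h.definedata('value', 'count');
    h.definedone();
    declare hiter hi('h');
  end;
  set have;
  array v {*} v1-v5;
  h.clear();                           /* start each observation with an empty hash */
  do i = 1 to dim(v);
    if missing(v{i}) then continue;    /* assumption: missing values are ignored */
    value = v{i};
    if h.find() ne 0 then count = 1;   /* first time this value is seen in the row */
    else count = count + 1;            /* find() loaded the stored count */
    h.replace();
  end;
  maxcount = 0;
  rc = hi.first();
  do while (rc = 0);                   /* running off the end leaves the iterator unpositioned */
    if count > maxcount then do;
      maxcount = count;
      mode = value;
    end;
    rc = hi.next();
  end;
run;
```

Because the hash is not ordered, the order in which the iterator returns the entries is not something to rely on, so ties are still resolved arbitrarily here.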

Super User
Posts: 5,516

Re: Finding MODE in SAS data step

Posted in reply to KrunalPatel

Here's the DATA step version.  To simplify things, I assume you know how many numeric variables you will want to process.  If need be, macro language could count them anyway.

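data want;
  set have;
  array nums {150} _numeric_;          /* assumes exactly 150 numeric variables */
  array counts {150} _temporary_;
  do _n_=1 to 149;
    /* count how often nums{_n_} occurs from position _n_ onward */
    counts{_n_}=1;
    do _i_=_n_+1 to 150;
      if nums{_n_} = nums{_i_} then counts{_n_} + 1;
    end;
    /* maxcount starts out missing, so the first value always qualifies */
    if counts{_n_} > maxcount then do;
      mode = nums{_n_};
      maxcount = counts{_n_};
    end;
  end;
run;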

In the case of ties, it just takes the first one.

Good luck.

Super User
Posts: 5,516

Re: Finding MODE in SAS data step

Posted in reply to Astounding

Two second thoughts here.  First, it's unfair to call this the DATA step solution since both are DATA step solutions.  How about the ARRAY solution?  And second, there's no need for a second array (unless you need to locate ties for the mode).  A single variable would do.  So ...

data want;
  set have;
  array nums {150} _numeric_;
  do _n_=1 to 149;
    count=1;
    do _i_ = _n_+1 to 150;
      if nums{_n_} = nums{_i_} then count + 1;
    end;
    if count > maxcount then do;
      mode = nums{_n_};
      maxcount = count;
    end;
  end;
run;
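
If the number of numeric variables is not known in advance, one option that avoids macro language is to let SAS size the array and use DIM() for the loop bounds. The following is only a sketch along the lines of the ARRAY solution; the DROP statement is an addition for tidiness.

```sas
data want;
  set have;
  array nums {*} _numeric_;            /* picks up only numeric variables defined up to this point */
  do _n_ = 1 to dim(nums) - 1;
    count = 1;
    do _i_ = _n_ + 1 to dim(nums);
      if nums{_n_} = nums{_i_} then count = count + 1;
    end;
    if count > maxcount then do;       /* maxcount is missing at first, so the first value qualifies */
      mode = nums{_n_};
      maxcount = count;
    end;
  end;
  drop _i_ count maxcount;             /* keep only the original variables plus mode */
run;
```

Two things to be aware of: with a single numeric variable the outer loop never runs and mode stays missing, and missing values compare as equal to each other, so a row that is mostly missing can report a missing mode.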
