BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Solph
Pyrite | Level 9

Hi,

 

How do I find the most frequent response for each person across a list of variables? Ideally I'd like to know the frequency count of the "most frequent  response". 

 

 

If it's a tie, either would be fine (although it would be nice if there is a way to choose without long SAS programming code)

Also my data are string variables (with values 01-10) data but if code-wise it's easier to do numerical, I can make them numerical.

 

data have;
input id 1 x1 $ 4-5 x2 $ 7-8 x3 $ 10-11 x4 $ 13-14 x5 $ 16-17;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02    03
4 01 01 02 02
5 01
;
run;

 

Want:
ID X1 X2 X3 X4 X5 MostFreq Count
1 07 04 07 07 07   07 3
2 04 05 04 04 05   04 3
3 02 02 03             02 2
4 01 01 02 02        01  2 (or 02  2)
5 05                        05  1

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @Solph,

 

Try this:

data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
  if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;

This is for character variables x1, x2, ... For numeric variables it could be simplified.

View solution in original post

12 REPLIES 12
Solph
Pyrite | Level 9

Sorry, posted it before I finished composing. My question was if there is some efficient way to do it? I don't know if SAS has some ready to use code to do multiple response (like SPSS has) and write out the values into the datatset, or like the max/min functions (xx=max(of x1-x5) that can easily identify the max/min values out of a list of variables. The code I can come up with to get the most frequent response is a really really long one.

 

Thanks. 

Reeza
Super User

The fastest way is to transpose your data, from wide to long,  use a proc freq and then re-transpose to desired structure.  

 

Do you need help coding that? 

Proc transpose can flip your data each way. 

Solph
Pyrite | Level 9

You meant to have the vertical data? I think it worked perfectly. Here is what I did after transposing. Let me know if there is a better way.


proc sort; by id;
proc freq; by id; table x /out=xx (drop=percent); run;
proc print noobs; run;

 

It outputted ID, x (01-10) and count for each ID and each X value.

 

 

id x COUNT
1 04 1
1 07 4
2 04 3
2 05 2

 

Thanks again.

Astounding
PROC Star

I would recommend an approach that gives you a little more flexibility ... get 5 counts instead of 1 maximum count.  For example:

 

data want;

set have;

array x {5};

array count {5};

do _i_=1 to 5;

count{_i_}=0;

do _n_=1 to 5;

   if x{_i_} = x{_n_} then count{_i_} + 1;

end;

end;

run;

 

At the end of this, COUNT1 will be the number of times that the X1 answer appears, COUNT2 the number of times that the X2 answer appears, etc.  You can easily add to this DATA step with statements such as these:

 

maxcount = max(of count1-count5);

maxvar = whichn(maxcount, of count1-count5);

varname = vname(x{maxvar});

 

This way you can easily detect ties for the maximum count, detect answers that only appear once, etc.

Solph
Pyrite | Level 9

I thought about using array to count but got stuck for the rest. I liked the code because it indeed gives more flexibliity for further manipulation, but

- I need to assign a new variable associating varname values (e.g. x1 in the first row) to varaible name x1's value of 07, so that I can create a variable with a value of 07? How do I do it

- How come count for ID=5 didn't look right? Is it because it's counting missing? (Looks like it, for ID 3 and 4 too).

 

 

id x1 x2 x3 x4 x5 count1 count2 count3 count4 count5 _i_ maxcount maxvar varname
1 07 04 07 07 07 4 1 4 4 4 6 4 1 x1
2 04 05 04 04 05 3 2 3 3 2 6 3 1 x1
3 02   02 03   2 2 2 1 2 6 2 1 x1
4 01 01 02   02 2 2 2 1 2 6 2 1 x1
5 01         1 4 4 4 4 6 4 2 x2

 

Astounding
PROC Star

A few variations, then ...

 

If  you don't want to count missing values, change one line in the program:

 

if x{_i_} > ' ' then do _n_=1 to 5;

 

The counts for the missing values will remain at zero.

 

It looks like you can get MAXCOUNT, MAXVAR, and VARNAME.  To also get MAXVALUE, you could utilize MAXVAR:

 

maxvalue = x{maxvar};

FreelanceReinh
Jade | Level 19

Hi @Solph,

 

Try this:

data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
  if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;

This is for character variables x1, x2, ... For numeric variables it could be simplified.

stat_sas
Ammonite | Level 13

data want(drop=i);
set have;
array x{*} x:;
do i=1 to dim(x);
MostFreq=max(of x(*));
Count=countc(cats(of x(*)),MostFreq,'i');
end;
run;

FreelanceReinh
Jade | Level 19
@stat_sas wrote:

MostFreq=max(of x(*));


@stat_sas: Why should the largest value coincide with the most frequent value?

Solph
Pyrite | Level 9

Thanks all for the various approaches.

 

FreelanceReinhard's code is ready to go, for the most frequent response. Thanks a lot (so that I don't need to write additional code or revise code).
stat_sas's code works when I want the max value and its count.

 

I've filed away all for future reference. Thanks again for your help.

stat_sas
Ammonite | Level 13

You are right. I thougt this is based on maximum number but this is not the case.

Ksharp
Super User

If your variable are Alphabeta NOT digit , try the following code :

 

data have; 
infile cards truncover;
input id ( x1  x2  x3  x4  x5) ($) ;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02  .  03 .
4 01 01 02 02
5 01 . . .
;
run;

data want;
if _n_=1 then do;
 length key $ 40;
 declare hash h();
 declare hiter hi('h');
 h.definekey('key');
 h.definedata('key','count');
 h.definedone();
end;
 set have;
 array xx{*} $ x:;
 do i=1 to dim(xx);
  if not missing(xx{i}) then do;
   key=xx{i};
   if h.find()=0 then count+1;
    else count=1;
   h.replace();
  end;
 end;
 most_freq=0;
 do while(hi.next()=0);
  if most_freq lt count then do;
   mode=key;
   most_freq=count;
  end;
 end;
h.clear();
drop i key count ;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 4270 views
  • 0 likes
  • 6 in conversation