turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Find most frequent response across multiple variab...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:03 PM

Hi,

How do I find the most frequent response for each person across a list of variables? Ideally I'd like to know the frequency count of the "most frequent response".

If it's a tie, either would be fine (although it would be nice if there is a way to choose without long SAS programming code)

Also my data are string variables (with values 01-10) data but if code-wise it's easier to do numerical, I can make them numerical.

data have;

input id 1 x1 $ 4-5 x2 $ 7-8 x3 $ 10-11 x4 $ 13-14 x5 $ 16-17;

cards;

1 07 04 07 07 07

2 04 05 04 04 05

3 02 02 03

4 01 01 02 02

5 01

;

run;

Want:

ID X1 X2 X3 X4 X5 MostFreq Count

1 07 04 07 07 07 07 3

2 04 05 04 04 05 04 3

3 02 02 03 02 2

4 01 01 02 02 01 2 (or 02 2)

5 05 05 1

Accepted Solutions

Solution

05-11-2016
03:35 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:33 PM

Hi @Solph,

Try this:

```
data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;
```

This is for character variables x1, x2, ... For numeric variables it could be simplified.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:11 PM

Sorry, posted it before I finished composing. My question was if there is some efficient way to do it? I don't know if SAS has some ready to use code to do multiple response (like SPSS has) and write out the values into the datatset, or like the max/min functions (xx=max(of x1-x5) that can easily identify the max/min values out of a list of variables. The code I can come up with to get the most frequent response is a really really long one.

Thanks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:13 PM

The fastest way is to transpose your data, from wide to long, use a proc freq and then re-transpose to desired structure.

Do you need help coding that?

Proc transpose can flip your data each way.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:50 PM

You meant to have the vertical data? I think it worked perfectly. Here is what I did after transposing. Let me know if there is a better way.

proc sort; by id;

proc freq; by id; table x /out=xx (drop=percent); run;

proc print noobs; run;

It outputted ID, x (01-10) and count for each ID and each X value.

id | x | COUNT |
---|---|---|

1 | 04 | 1 |

1 | 07 | 4 |

2 | 04 | 3 |

2 | 05 | 2 |

Thanks again.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:32 PM

I would recommend an approach that gives you a little more flexibility ... get 5 counts instead of 1 maximum count. For example:

data want;

set have;

array x {5};

array count {5};

do _i_=1 to 5;

count{_i_}=0;

do _n_=1 to 5;

if x{_i_} = x{_n_} then count{_i_} + 1;

end;

end;

run;

At the end of this, COUNT1 will be the number of times that the X1 answer appears, COUNT2 the number of times that the X2 answer appears, etc. You can easily add to this DATA step with statements such as these:

maxcount = max(of count1-count5);

maxvar = whichn(maxcount, of count1-count5);

varname = vname(x{maxvar});

This way you can easily detect ties for the maximum count, detect answers that only appear once, etc.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 01:02 PM

I thought about using array to count but got stuck for the rest. I liked the code because it indeed gives more flexibliity for further manipulation, but

- I need to assign a new variable associating varname values (e.g. x1 in the first row) to varaible name x1's value of 07, so that I can create a variable with a value of 07? How do I do it

- How come count for ID=5 didn't look right? Is it because it's counting missing? (Looks like it, for ID 3 and 4 too).

id | x1 | x2 | x3 | x4 | x5 | count1 | count2 | count3 | count4 | count5 | _i_ | maxcount | maxvar | varname |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

1 | 07 | 04 | 07 | 07 | 07 | 4 | 1 | 4 | 4 | 4 | 6 | 4 | 1 | x1 |

2 | 04 | 05 | 04 | 04 | 05 | 3 | 2 | 3 | 3 | 2 | 6 | 3 | 1 | x1 |

3 | 02 | 02 | 03 | 2 | 2 | 2 | 1 | 2 | 6 | 2 | 1 | x1 | ||

4 | 01 | 01 | 02 | 02 | 2 | 2 | 2 | 1 | 2 | 6 | 2 | 1 | x1 | |

5 | 01 | 1 | 4 | 4 | 4 | 4 | 6 | 4 | 2 | x2 |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 01:14 PM

A few variations, then ...

If you don't want to count missing values, change one line in the program:

**if x{_i_} > ' ' then** do _n_=1 to 5;

The counts for the missing values will remain at zero.

It looks like you can get MAXCOUNT, MAXVAR, and VARNAME. To also get MAXVALUE, you could utilize MAXVAR:

maxvalue = x{maxvar};

Solution

05-11-2016
03:35 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:33 PM

Hi @Solph,

Try this:

```
data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;
```

This is for character variables x1, x2, ... For numeric variables it could be simplified.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:46 PM

data want(drop=i);

set have;

array x{*} x:;

do i=1 to dim(x);

MostFreq=max(of x(*));

Count=countc(cats(of x(*)),MostFreq,'i');

end;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 12:54 PM - edited 05-11-2016 12:55 PM

stat_sas wrote:

MostFreq=max(of x(*));

@stat_sas: Why should the *largest* value coincide with the *most frequent* value?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 03:40 PM

Thanks all for the various approaches.

FreelanceReinhard's code is ready to go, for the most frequent response. Thanks a lot (so that I don't need to write additional code or revise code).

stat_sas's code works when I want the max value and its count.

I've filed away all for future reference. Thanks again for your help.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-12-2016 01:27 PM

You are right. I thougt this is based on maximum number but this is not the case.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-11-2016 10:33 PM

If your variable are Alphabeta NOT digit , try the following code :

```
data have;
infile cards truncover;
input id ( x1 x2 x3 x4 x5) ($) ;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02 . 03 .
4 01 01 02 02
5 01 . . .
;
run;
data want;
if _n_=1 then do;
length key $ 40;
declare hash h();
declare hiter hi('h');
h.definekey('key');
h.definedata('key','count');
h.definedone();
end;
set have;
array xx{*} $ x:;
do i=1 to dim(xx);
if not missing(xx{i}) then do;
key=xx{i};
if h.find()=0 then count+1;
else count=1;
h.replace();
end;
end;
most_freq=0;
do while(hi.next()=0);
if most_freq lt count then do;
mode=key;
most_freq=count;
end;
end;
h.clear();
drop i key count ;
run;
```