BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Solph
Pyrite | Level 9

I've a data set and would like to find the most frequent value from variables, along with max number of response and the count for the most frequent value.

 

 

data have; input id 1 fy2004 $ 3-4 fy2005 $ 6-7 fy2006 $ 9-10 fy2007 $ 12-13 mostfreq $ 17-18 count 20 total_count 22;
datalines;
1 13 13 13 13   13 4 4 
2 14    14 14   14 3 3 
3 12 12 12 05   12 3 4
4 01 01    02   01 2 4
5 01 01 12 12   02 2 4
;

data want;
	set have;
	length MostFreq $2;
	array list fy:;
	array _t[10] _temporary_;
	call missing(of _t[*]);
	do i=1 to dim(list);
	  if list[i] ne '' then _t[input(list[i],2.)]+1;
	end;
	Count=max(of _t[*]);
	MostFreq=put(whichn(Count, of _t[*]),z2.);
	drop i;
run;

The code came from https://communities.sas.com/t5/General-SAS-Programming/Find-most-frequent-response-across-multiple-v.... I modified it but don't know why it didn't work on my data. I also need a third variable  for total count of responses.

 

Much appeciated for your help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

@Solph wrote:

Sorry. I've a follow up question. Hope you can help out. What if my smallest number is 00, not 01. The code wouldn't work when the starting number is 00. Thanks in advance. 


No problem. As @Reeza said, define the temporary array as

array _t[0:99] _temporary_;

in order to count the number of '00' responses in _t[0], the number of '01' responses in _t[1], ..., and the number of '99' responses (if any) in _t[99]. There is one more change required, though: The WHICHN function used in the definition of MostFreq now yields numbers 1, 2, 3, ... if the maximum count is found in _t[0], _t[1], _t[2], ... To compensate for that, we have to subtract 1 from those values. Then they match the responses again.

MostFreq=put(whichn(Count, of _t[*])-1,z2.);

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Hello @Solph,

 

Happy to see that my 2016 solution is still in use. 🙂

 

Back then your possible response values were '01' - '10', which was the reason for dimension 10 of the temporary array _t. Now, apparently, you need at least dimension 14. If you like, you can also use 99 to be on the safe side (e.g. for future additional questions). Note that this array contains the count for response '01' in its first element (_t[1]), the count for response '02' in the second and so on.

 

For the total count you can insert the line

Total_Count=dim(list)-cmiss(of list[*]);

This will result in the number of non-missing values among the variables in array list.

 

Edit: Of course, a shorter formula for the total count is

Total_Count=sum(of _t[*]);

However, there is one case where the two would differ: If all responses are missing, the first formula gives 0, whereas the second results in a missing value together with the familiar note "Missing values were generated ..." in the log (which would be triggered anyway by the formula for Count, though).

Solph
Pyrite | Level 9

Thanks a lot. It worked perfectly. Don't know why I didn't catch it. The code below reflected your input and now worked.

 

data want;
	set have;
	length MostFreq $2;
	array list fy:;
	array _t[99] _temporary_;  *Set the max possible value to be 99;
	call missing(of _t[*]);
	do i=1 to dim(list);
	  if list[i] ne '' then _t[input(list[i],2.)]+1;
	end;
	Count=max(of _t[*]);
	Total_Count=sum(of _t[*]);
	MostFreq=put(whichn(Count, of _t[*]),z2.);
	drop i;
run;

  

Solph
Pyrite | Level 9

Sorry. I've a follow up question. Hope you can help out. What if my smallest number is 00, not 01. The code wouldn't work when the starting number is 00. Thanks in advance.

data have; input id 1 fy2004 $ 3-4 fy2005 $ 6-7 fy2006 $ 9-10 fy2007 $ 12-13 _Mostfreq $ 17-18 _Count 20 _Total_count 22;
datalines;
1 00 13 13 13   13 3 4 
2 14    14 14   14 3 3 
3 12 12 12 05   12 3 4
4 01 01    02   01 2 3
5 00 00 12 12   00 2 4
;

 

Reeza
Super User
You can add an index to an array to have it go from 0 to 99.

array _t(0:99);

Then it will work.
FreelanceReinh
Jade | Level 19

@Solph wrote:

Sorry. I've a follow up question. Hope you can help out. What if my smallest number is 00, not 01. The code wouldn't work when the starting number is 00. Thanks in advance. 


No problem. As @Reeza said, define the temporary array as

array _t[0:99] _temporary_;

in order to count the number of '00' responses in _t[0], the number of '01' responses in _t[1], ..., and the number of '99' responses (if any) in _t[99]. There is one more change required, though: The WHICHN function used in the definition of MostFreq now yields numbers 1, 2, 3, ... if the maximum count is found in _t[0], _t[1], _t[2], ... To compensate for that, we have to subtract 1 from those values. Then they match the responses again.

MostFreq=put(whichn(Count, of _t[*])-1,z2.);
Solph
Pyrite | Level 9

Thanks a lot FreelanceReinhard and Reeza. They totally addressed my needs. So the code below worked! Thanks so much.

data have; input id 1 fy2004 $ 3-4 fy2005 $ 6-7 fy2006 $ 9-10 fy2007 $ 12-13 _Mostfreq $ 17-18 _Count 20 _Total_count 22;
datalines;
1 00 13 13 13   13 3 4 
2 14    14 14   14 3 3 
3 12 12 12 05   12 3 4
4 01 01    02   01 2 3
5 00 00 12 12   00 2 4
;
data want;
	set have;
	length MostFreq $2;
	array list fy:;
	array _t[0:99] _temporary_;  *Set the max possoble value, in this example 99;
	call missing(of _t[*]);
	do i=1 to dim(list);
	  if list[i] ne '' then _t[input(list[i],2.)]+1;
	end;
	Count=max(of _t[*]);
	Total_Count=sum(of _t[*]);
	*MostFreq=put(whichn(Count, of _t[*]),z2.);
	MostFreq=put(whichn(Count, of _t[*])-1,z2.);
	drop i;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3740 views
  • 4 likes
  • 3 in conversation