Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
This will probably not scale well, specifically the SQL join.
data have;
input memberid code;
cards;
1 20
1 33
1 59
2 33
2 20
3 59
3 43
;
proc sql;
create table combos as
select distinct a.memberID, min(a.code, b.code) as code1, max(a.code, b.code) as code2
from have as a
left join have as b
on a.memberid=b.memberid
and a.code ne b.code;
quit;
proc freq data=combos noprint;
table code1*code2 /out=freqs;
run;
proc sort data=freqs;
by descending count;
run;
title 'Most frequent pair';
proc print data=freqs(obs=1);
var code1 code2;
run;
Is a combination always only 2 codes together? If not, how would that be handled?
@jmmedina25 wrote:
Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
This will probably not scale well, specifically the SQL join.
data have;
input memberid code;
cards;
1 20
1 33
1 59
2 33
2 20
3 59
3 43
;
proc sql;
create table combos as
select distinct a.memberID, min(a.code, b.code) as code1, max(a.code, b.code) as code2
from have as a
left join have as b
on a.memberid=b.memberid
and a.code ne b.code;
quit;
proc freq data=combos noprint;
table code1*code2 /out=freqs;
run;
proc sort data=freqs;
by descending count;
run;
title 'Most frequent pair';
proc print data=freqs(obs=1);
var code1 code2;
run;
@jmmedina25 wrote:
Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
Please explain why 59 is not also in your list of most common, it appears twice as does 33 and 20.
data have; input memberid code; cards; 1 20 1 33 1 59 2 33 2 20 3 59 3 43 ; proc sql; create table combos as select distinct a.*,b.code as _code from have as a, have as b where a.memberid=b.memberid and a.code > b.code; quit; proc freq data=combos noprint order=freq; table code*_code/out=want ; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.