Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
This will probably not scale well, specifically the SQL join.
data have;
input memberid code;
cards;
1 20
1 33
1 59
2 33
2 20
3 59
3 43
;
proc sql;
create table combos as
select distinct a.memberID, min(a.code, b.code) as code1, max(a.code, b.code) as code2
from have as a
left join have as b
on a.memberid=b.memberid
and a.code ne b.code;
quit;
proc freq data=combos noprint;
table code1*code2 /out=freqs;
run;
proc sort data=freqs;
by descending count;
run;
title 'Most frequent pair';
proc print data=freqs(obs=1);
var code1 code2;
run;
Is a combination always only 2 codes together? If not, how would that be handled?
@jmmedina25 wrote:
Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
This will probably not scale well, specifically the SQL join.
data have;
input memberid code;
cards;
1 20
1 33
1 59
2 33
2 20
3 59
3 43
;
proc sql;
create table combos as
select distinct a.memberID, min(a.code, b.code) as code1, max(a.code, b.code) as code2
from have as a
left join have as b
on a.memberid=b.memberid
and a.code ne b.code;
quit;
proc freq data=combos noprint;
table code1*code2 /out=freqs;
run;
proc sort data=freqs;
by descending count;
run;
title 'Most frequent pair';
proc print data=freqs(obs=1);
var code1 code2;
run;
@jmmedina25 wrote:
Hello, I'm trying to find the most common combination of values with the data set that looks like the one below. Note: a member can have a code more than once.
memberid code
1 20
1 33
1 59
2 33
2 20
3 59
3 43
The answer would show that the combination of codes 33 and 20 are most common
Please explain why 59 is not also in your list of most common, it appears twice as does 33 and 20.
data have; input memberid code; cards; 1 20 1 33 1 59 2 33 2 20 3 59 3 43 ; proc sql; create table combos as select distinct a.*,b.code as _code from have as a, have as b where a.memberid=b.memberid and a.code > b.code; quit; proc freq data=combos noprint order=freq; table code*_code/out=want ; run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.