BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
genemroz
Quartz | Level 8

Esteemed Advisers;

This seems like it should be a simple problem but it is vexing me.

From the dataset created by the code below, I want to produce Proc Freq table or dataset that produces the following result:

Pair Count
AB 4
AC 3
AD 2
AE 1
BC 3
BD 2
BE 1
CD 2
CE 1
DE 1

I'm looking for suggestions/guidance on how to achieve this result,

Thanks,

Gene

Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You need to create all pairs of values found by each id, followed by a proc freq.

 

To make the pairs, you can use a data step to:

  1. make an array of all letters for a givend id.
  2. from that array, construct and output each pair:
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
run;

data vneed (keep=id pair) / view=vneed;
  array let {8} $8 let1-let8;
  do i=1 by 1 until (last.id);
    set test;
    by id;
    let{i}=letter;
  end;

  do j=1 to i-1;
    do k=j+1 to i;
      pair=cats(let{j},let{k});
      output;
    end;
  end;
run;

proc freq data=vneed;
  tables pair; 
run;

Make sure to declare the array dimension large enough to accommodate the most populated ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

You need to create all pairs of values found by each id, followed by a proc freq.

 

To make the pairs, you can use a data step to:

  1. make an array of all letters for a givend id.
  2. from that array, construct and output each pair:
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
run;

data vneed (keep=id pair) / view=vneed;
  array let {8} $8 let1-let8;
  do i=1 by 1 until (last.id);
    set test;
    by id;
    let{i}=letter;
  end;

  do j=1 to i-1;
    do k=j+1 to i;
      pair=cats(let{j},let{k});
      output;
    end;
  end;
run;

proc freq data=vneed;
  tables pair; 
run;

Make sure to declare the array dimension large enough to accommodate the most populated ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
genemroz
Quartz | Level 8
Thanks for your prompt response and excellent solution!

Regards,

Gene
Ksharp
Super User
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
;

proc sql;
create table want as
select a,b,count(*) as count
 from (select a.letter as a,b.letter as b from test as a,test as b 
        where a.id=b.id and a.letter<b.letter)
  group by a,b;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1207 views
  • 0 likes
  • 3 in conversation