BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
genemroz
Quartz | Level 8

Esteemed Advisers;

This seems like it should be a simple problem but it is vexing me.

From the dataset created by the code below, I want to produce Proc Freq table or dataset that produces the following result:

Pair Count
AB 4
AC 3
AD 2
AE 1
BC 3
BD 2
BE 1
CD 2
CE 1
DE 1

I'm looking for suggestions/guidance on how to achieve this result,

Thanks,

Gene

Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You need to create all pairs of values found by each id, followed by a proc freq.

 

To make the pairs, you can use a data step to:

  1. make an array of all letters for a givend id.
  2. from that array, construct and output each pair:
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
run;

data vneed (keep=id pair) / view=vneed;
  array let {8} $8 let1-let8;
  do i=1 by 1 until (last.id);
    set test;
    by id;
    let{i}=letter;
  end;

  do j=1 to i-1;
    do k=j+1 to i;
      pair=cats(let{j},let{k});
      output;
    end;
  end;
run;

proc freq data=vneed;
  tables pair; 
run;

Make sure to declare the array dimension large enough to accommodate the most populated ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

You need to create all pairs of values found by each id, followed by a proc freq.

 

To make the pairs, you can use a data step to:

  1. make an array of all letters for a givend id.
  2. from that array, construct and output each pair:
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
run;

data vneed (keep=id pair) / view=vneed;
  array let {8} $8 let1-let8;
  do i=1 by 1 until (last.id);
    set test;
    by id;
    let{i}=letter;
  end;

  do j=1 to i-1;
    do k=j+1 to i;
      pair=cats(let{j},let{k});
      output;
    end;
  end;
run;

proc freq data=vneed;
  tables pair; 
run;

Make sure to declare the array dimension large enough to accommodate the most populated ID.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
genemroz
Quartz | Level 8
Thanks for your prompt response and excellent solution!

Regards,

Gene
Ksharp
Super User
Data test;
input ID Letter $;
datalines;
1 A
1 B
2 A
2 B
2 C
3 A
3 B
3 C
3 D
4 A
4 B
4 C
4 D
4 E
;

proc sql;
create table want as
select a,b,count(*) as count
 from (select a.letter as a,b.letter as b from test as a,test as b 
        where a.id=b.id and a.letter<b.letter)
  group by a,b;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 838 views
  • 0 likes
  • 3 in conversation