BookmarkSubscribeRSS Feed
lcain
Fluorite | Level 6

I am trying to use proc sql to create all possible pairs of two variables.

 

For example, if my starting data appears like this...

 

letter number

a 1

b 2

c 3

 

I would like the output data to look like...

 

pair1 pair2 pair3

a-1 b-2 c-3

a-1 b-3 c-2

a-2 b-1 c-3

a-2 b-3 c-2

a-3 b-1 c-2

a-3 b-2 c-1

 

I have been able to get a single column with all 9 possible pairs, but when I try to put multiple pairs on a single line, things get messed up.

 

Thanks in advance for your help!

 

6 REPLIES 6
PGStats
Opal | Level 21

What you want is not a cartesian product as produced by SQL. In a cartesian product, the pair a-1 would occur only once...

What you require is a list of all permutations.

 

data have;
input letter $ number;
datalines;
a 1
b 2
c 3
;

proc sql noprint;
select count(letter) into :nbLetters from have;
quit;

proc transpose data=have out=temp(drop=_name_);
var number;
id letter;
run;

data want;
set temp;
array l{&nbLetters} _numeric_;
array pair{&nbLetters} $8;

do i = 1 to fact(&nbLetters);
    call allperm(i, of l{*});
    do j = 1 to &nbLetters;
        pair{j} = catx("-", vname(l{j}), l{j});
        end;
    output;
    end;
keep pair:;
run;

proc print data=want noobs; run;

you could replace allperm by lexperm to get only permutations of distinct nonmissing values.

PG
lcain
Fluorite | Level 6

Thanks so much. In my actual data, both variables are character strings. The length of the variables range from 3-8 characters. How would I edit the code to work for two character strings?

 

Thanks again!

FreelanceReinh
Jade | Level 19

If I may step in here ... I think the following modification of the two array definitions should work for you:

array l{&nbLetters} _character_;
array pair{&nbLetters} $17; /* Minimum required length depends on how you concatenate your strings */

I really like PG's solution because it's so elegant. Please note, however, that there are rare, exceptional cases where it would not work without further modification. For example, if the first character variable contained values such as "a:a" and "a_a", the PROC TRANSPOSE step would fail. But if your character strings are "well-behaved", no problems should occur.

lcain
Fluorite | Level 6

Thanks! That worked. I wasn't allowing for a long enough string for my concatenation.

FreelanceReinh
Jade | Level 19

One could code the transposing into the data step and thus avoid PROC TRANSPOSE:

data have;
length string1
       string2 $8;
input string1 string2;
cards;
first one
second two
third three
fourth four
;

proc sql noprint;
select count(*) into :n from have;
quit;

data want;
array s[&n] $8;
array t[&n] $8;
array p[&n] $17;
do i=1 by 1 until(e);
  set have end=e;
  s[i]=string1;
  t[i]=string2;
end;
do i=1 to fact(&n);
  call allperm(i, of t[*]);
  do j=1 to &n;
    p[j]=catx('-', s[j], t[j]);
  end;
  output;
end;
keep p:;
run;

proc print;
run;
ToniMasso
Calcite | Level 5

Another solution:

data have;
input letter $ number;
datalines;
a 1
b 2
c 3
;


PROC SQL;
   CREATE TABLE WORK.temp1 AS 
   SELECT t1.letter
      FROM WORK.HAVE t1;
QUIT;

PROC SQL;
   CREATE TABLE WORK.temp2 AS 
   SELECT t1.number
      FROM WORK.HAVE t1;
QUIT;

PROC SQL;
   CREATE TABLE WORK.temp3 AS 
   SELECT t1.letter, 
          t2.number
      FROM WORK.temp1 t1
           CROSS JOIN WORK.temp2 t2;
QUIT;

PROC SQL;
   CREATE TABLE WORK.WANT0 AS 
   SELECT DISTINCT t1.letter, 
          t1.number, 
          t2.letter AS letter1, 
          t2.number AS number1, 
          t3.letter AS letter2, 
          t3.number AS number2
      FROM WORK.temp3 t1
           CROSS JOIN WORK.temp3 t2
           CROSS JOIN WORK.temp3 t3
      WHERE t1.letter < t2.letter AND t2.letter < t3.letter AND t1.number NOT = t2.number AND t2.number NOT = t3.number 
           AND t1.number NOT = t3.number;
QUIT;
PROC SQL;
   CREATE TABLE WORK.WANT AS 
   SELECT /* VAR1 */
            (CATX('-',t1.letter,put(t1.number,z.))) AS VAR1, 
          /* VAR2 */
            (CATX('-',t1.letter1, put( t1.number1,z.))) AS VAR2, 
          /* VAR3 */
            (CATX('-',t1.letter2,PUT(t1.number2,Z.))) AS VAR3
      FROM WORK.WANT0 t1;
QUIT;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2122 views
  • 5 likes
  • 4 in conversation