# Create all possible pairs - Cartesian product

Hello SAS community,

Can somebody suggest how to create a set of all possible pairs of observations (Cartesian product) without using proc sql?

Say I have a data set test1:

data test1;
  do x=1 to 3;
    y=x+10;
    output;
  end;
run;

I would like to create a data set test2 that contains all possible combinations of x and y:

data test2;
  input x y;
  datalines;
1  11
1  12
1  13
2  11
2  12
2  13
3  11
3  12
3  13
;
run;

‎03-12-2012 01:02 PM
## Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

Not sure I understand because your examples don't match the simple definition of all possible pairs.

Instead, suppose you have a dataset X holding all possible values of variable X, and similarly a dataset Y holding all possible values of variable Y.

So for each observation in X read in all observations in Y.

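data want;
  set x;                          /* outer loop: one pass over X */
  do _n_=1 to nobs;               /* inner loop: every observation of Y */
    set y nobs=nobs point=_n_;    /* direct access to observation _n_ of Y */
    output;
  end;
run;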

## Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

Here is one approach:

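data test1;
  do x=1 to 3;
    y=x+10;
    output;
  end;
run;

data test2 (drop=i);              /* drop the loop counter so only x and y remain */
  set test1;
  do i=1 to nobs;
    set test1 (keep=y) point=i nobs=nobs;   /* overwrite y with the i-th y value */
    output;
  end;
run;

proc print; run;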

Regards,

Haikuo

## Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

data test1;
do x=1 to 3;
y=x+10; output;
end;
run;

data test2;
set test1;
drop y;
run;

proc print noobs; run;

data test3;
set test1;
drop x;
run;

proc print noobs; run;

proc sql;
select x,y from test2, test3;
quit;

Output:

x         y
------------------
1        11
1        12
1        13
2        11
2        12
2        13
3        11
3        12
3        13

## Create all possible pairs - Cartesian product

Thank you Hima. Your approach requires proc sql though...

## Re: Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

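Just for fun, here is a hash approach:

data test2 (drop=_:);             /* drop _rc and any other variable starting with _ */
  if _n_=1 then do;
    set test1(obs=1);             /* defines x and y in the PDV for the hash object */
    dcl hash h(dataset: 'test1', ordered: 'a');
    h.definekey('y');
    h.definedata('y');
    h.definedone();
    dcl hiter hi('h');
  end;
  set test1;
  /* walk every y in the hash for the current x */
  do _rc=hi.first() by 0 while (_rc=0);
    output;
    _rc=hi.next();
  end;
run;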

## Create all possible pairs - Cartesian product

Thank you Haikuo. Would you care to elaborate on the advantages of the hash approach you demonstrated over the "stacking y's for all x's" approach that you and Tom posted?

## Re: Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

In general, a hash is more efficient because it saves I/O time by processing records entirely in memory. I can't speak for this particular case, though; you would have to run a benchmark.

Haikuo
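
A rough sketch of what such a benchmark might look like is given below. The dataset names big, pairs_point and pairs_hash and the row count of 2000 are made up for illustration and are not from the thread. The FULLSTIMER system option asks SAS to write more detailed timing and memory statistics to the log, so the two steps can be compared there.

```sas
options fullstimer;               /* detailed CPU/memory statistics in the log */

/* Hypothetical larger input: 2000 rows, so 4,000,000 output pairs */
data big;
  do x=1 to 2000;
    y=x+10;
    output;
  end;
run;

/* Approach 1: direct access with POINT= */
data pairs_point (drop=p);
  set big (keep=x);
  do p=1 to nobs;
    set big (keep=y) point=p nobs=nobs;
    output;
  end;
run;

/* Approach 2: hash object plus hash iterator */
data pairs_hash (drop=_:);
  if _n_=1 then do;
    if 0 then set big (keep=y);   /* define y in the PDV without reading data */
    dcl hash h(dataset: 'big', ordered: 'a');
    h.definekey('y');
    h.definedata('y');
    h.definedone();
    dcl hiter hi('h');
  end;
  set big (keep=x);
  do _rc=hi.first() by 0 while (_rc=0);
    output;
    _rc=hi.next();
  end;
run;
```

Compare the real time and CPU time that the log reports for the two DATA steps. One caveat: a hash keyed on y keeps only one entry per distinct y value by default, so the two approaches produce the same rows only when the y values are unique, as they are here.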

## Re: Create all possible pairs - Cartesian product

Posted in reply to Olga_Shest

Thank you everyone for posting!

