DATA Step, Macro, Functions and more

Cartesian ON and OFF

Accepted Solution Solved
Reply
Contributor
Posts: 51
Accepted Solution

Cartesian ON and OFF

Hi All

How is it possible to do a cartesian product for a normal data step/ proc merge and how is it possible to 'switch off' the cartesain result from PROC SQL?

Thanks

Steve

Using as example:

data meds;

input site subject  conmed $10.;

datalines;

1 1234 Med1

1 1234 Med2

;

proc sort;by site subject;

run;

data medhist;

input site subject medhist $10.;

datalines;

1 1234 Ulcer

1 1234 Wind

;

proc sort;by site subject;

run;

data all;

    merge meds medhist;

    by site subject;

run;

proc sql;

    create table allsql as

    select a.*, b.medhist

    from meds as a

    left join medhist as b

    on a.site=b.site and a.subject=b.subject;

quit;


Accepted Solutions
Solution
‎12-02-2011 03:38 PM
SAS Employee
Posts: 7

Cartesian ON and OFF

There is no way to toggle the conceptual Caretsian product behavior on/off in PROC SQL. This behavior is not just an inherent part of PROC SQL's behavior, but also of  the SQL language.

On an operational basis, as contrasted with conceptual behavior, PROC SQL has several ways to implement a join. The join technique that comes closest to actually acting like a Cartesian product is the join sequential loop (jsl)  technique. When PROC SQL is run with the _method option on, any instance of a jsl will show up in the method tree as a "sqxjsl."

In the program example that was entered with this question, I obtained the following method tree when I used "proc sql  _method;". The real nice aspect of the tree is that it indicated that PROC SQL was using the merge join techinque (which is generally much more efficient than the join sequential loop technique) for this example. Good luck with the problem that you are attempting to solve.

      sqxcrta

          sqxjm  <================ merge join

              sqxsort

                  sqxsrc( WORK.MEDHIST(alias = B) )

              sqxsort

                  sqxsrc( WORK.MEDS(alias = A) )

View solution in original post


All Replies
PROC Star
Posts: 7,474

Cartesian ON and OFF

Take a look at: http://www2.sas.com/proceedings/sugi30/249-30.pdf  The very last section of that paper shows how to obtain a cartesian product via datastep code.  However, the paper explains most of the various merge capabilities attainable with SAS.

Contributor
Posts: 51

Cartesian ON and OFF

Thanks Art

From your reply I realise there is a way to get a cartesian join from data step coding(i just need to read the paper ;-)) but can you tell me if it's possible to 'switch' off the cartesian joining in PROC SQL (maybe it's also in the paper but a yes or no would help for now)

Thanks
Steve

PROC Star
Posts: 7,474

Cartesian ON and OFF

yes, sort of, kind of, maybe.  And, yes, the paper does get into the various kinds of matches.  However, if you are asing "is there a way to get all possible matches without getting all possible matches", no,

Contributor
Posts: 51

Cartesian ON and OFF

Cheers Art, I think you've answered my questions, maybe, sort of, ;-)

Will read the paper...

Have a top weekend, farewell from Berlin

Steve

Contributor
Posts: 51

Cartesian ON and OFF

Hi Art

How can I stop the PDV holding onto the last value in many-to-many merges?

Using as example:

data meds;

input site subject  conmed $10.;

datalines;

1 1234 Med1

1 1234 Med2

;

proc sort;by site subject;

run;

data medhist;

input site subject medhist $10.;

datalines;

1 1234 Ulcer

1 1234 Wind

1 1234 Phlegm

;

proc sort;by site subject;

run;

data all;

    merge meds medhist;

    by site subject;

run;

The required output is:

1 1234 Med1 Ulcer

1 1234 Med2 Wind

1 1234          Phlegm

and not

1 1234 Med1 Ulcer

1 1234 Med2 Wind

1 1234 Med2 Phlegm

I could use the lag function but thought there maybe another way to do the merge to avoid further programming to clean up afterwards

Thanks

Steve

Super User
Posts: 10,035

Cartesian ON and OFF

Just set them all missing.

data meds;
input site subject  conmed $10.;
datalines;
1 1234 Med1
1 1234 Med2
;

proc sort;by site subject;
run;


data medhist;
input site subject medhist $10.;
datalines;
1 1234 Ulcer
1 1234 Wind
1 1234 Phlegm
;
run;

proc sort;by site subject;
run;

 

data all;
    merge meds medhist;
    by site subject;
     output;
     call missing(of _all_);
run;

Ksharp

Contributor
Posts: 51

Cartesian ON and OFF

Thanks KSharp

Solution
‎12-02-2011 03:38 PM
SAS Employee
Posts: 7

Cartesian ON and OFF

There is no way to toggle the conceptual Caretsian product behavior on/off in PROC SQL. This behavior is not just an inherent part of PROC SQL's behavior, but also of  the SQL language.

On an operational basis, as contrasted with conceptual behavior, PROC SQL has several ways to implement a join. The join technique that comes closest to actually acting like a Cartesian product is the join sequential loop (jsl)  technique. When PROC SQL is run with the _method option on, any instance of a jsl will show up in the method tree as a "sqxjsl."

In the program example that was entered with this question, I obtained the following method tree when I used "proc sql  _method;". The real nice aspect of the tree is that it indicated that PROC SQL was using the merge join techinque (which is generally much more efficient than the join sequential loop technique) for this example. Good luck with the problem that you are attempting to solve.

      sqxcrta

          sqxjm  <================ merge join

              sqxsort

                  sqxsrc( WORK.MEDHIST(alias = B) )

              sqxsort

                  sqxsrc( WORK.MEDS(alias = A) )

Contributor
Posts: 51

Cartesian ON and OFF

Posted in reply to LewisC_sas

Thanks LexisC

SAS Employee
Posts: 7

Cartesian ON and OFF

You are welcome.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 245 views
  • 6 likes
  • 4 in conversation