Hi All
How is it possible to do a cartesian product for a normal data step/ proc merge and how is it possible to 'switch off' the cartesain result from PROC SQL?
Thanks
Steve
Using as example:
data meds;
input site subject conmed $10.;
datalines;
1 1234 Med1
1 1234 Med2
;
proc sort;by site subject;
run;
data medhist;
input site subject medhist $10.;
datalines;
1 1234 Ulcer
1 1234 Wind
;
proc sort;by site subject;
run;
data all;
merge meds medhist;
by site subject;
run;
proc sql;
create table allsql as
select a.*, b.medhist
from meds as a
left join medhist as b
on a.site=b.site and a.subject=b.subject;
quit;
There is no way to toggle the conceptual Caretsian product behavior on/off in PROC SQL. This behavior is not just an inherent part of PROC SQL's behavior, but also of the SQL language.
On an operational basis, as contrasted with conceptual behavior, PROC SQL has several ways to implement a join. The join technique that comes closest to actually acting like a Cartesian product is the join sequential loop (jsl) technique. When PROC SQL is run with the _method option on, any instance of a jsl will show up in the method tree as a "sqxjsl."
In the program example that was entered with this question, I obtained the following method tree when I used "proc sql _method;". The real nice aspect of the tree is that it indicated that PROC SQL was using the merge join techinque (which is generally much more efficient than the join sequential loop technique) for this example. Good luck with the problem that you are attempting to solve.
sqxcrta
sqxjm <================ merge join
sqxsort
sqxsrc( WORK.MEDHIST(alias = B) )
sqxsort
sqxsrc( WORK.MEDS(alias = A) )
Take a look at: http://www2.sas.com/proceedings/sugi30/249-30.pdf The very last section of that paper shows how to obtain a cartesian product via datastep code. However, the paper explains most of the various merge capabilities attainable with SAS.
Thanks Art
From your reply I realise there is a way to get a cartesian join from data step coding(i just need to read the paper ;-)) but can you tell me if it's possible to 'switch' off the cartesian joining in PROC SQL (maybe it's also in the paper but a yes or no would help for now)
Thanks
Steve
yes, sort of, kind of, maybe. And, yes, the paper does get into the various kinds of matches. However, if you are asing "is there a way to get all possible matches without getting all possible matches", no,
Cheers Art, I think you've answered my questions, maybe, sort of, 😉
Will read the paper...
Have a top weekend, farewell from Berlin
Steve
Hi Art
How can I stop the PDV holding onto the last value in many-to-many merges?
Using as example:
data meds;
input site subject conmed $10.;
datalines;
1 1234 Med1
1 1234 Med2
;
proc sort;by site subject;
run;
data medhist;
input site subject medhist $10.;
datalines;
1 1234 Ulcer
1 1234 Wind
1 1234 Phlegm
;
proc sort;by site subject;
run;
data all;
merge meds medhist;
by site subject;
run;
The required output is:
1 1234 Med1 Ulcer
1 1234 Med2 Wind
1 1234 Phlegm
and not
1 1234 Med1 Ulcer
1 1234 Med2 Wind
1 1234 Med2 Phlegm
I could use the lag function but thought there maybe another way to do the merge to avoid further programming to clean up afterwards
Thanks
Steve
Just set them all missing.
data meds; input site subject conmed $10.; datalines; 1 1234 Med1 1 1234 Med2 ; proc sort;by site subject; run; data medhist; input site subject medhist $10.; datalines; 1 1234 Ulcer 1 1234 Wind 1 1234 Phlegm ; run; proc sort;by site subject; run; data all; merge meds medhist; by site subject; output; call missing(of _all_); run;
Ksharp
Thanks KSharp
There is no way to toggle the conceptual Caretsian product behavior on/off in PROC SQL. This behavior is not just an inherent part of PROC SQL's behavior, but also of the SQL language.
On an operational basis, as contrasted with conceptual behavior, PROC SQL has several ways to implement a join. The join technique that comes closest to actually acting like a Cartesian product is the join sequential loop (jsl) technique. When PROC SQL is run with the _method option on, any instance of a jsl will show up in the method tree as a "sqxjsl."
In the program example that was entered with this question, I obtained the following method tree when I used "proc sql _method;". The real nice aspect of the tree is that it indicated that PROC SQL was using the merge join techinque (which is generally much more efficient than the join sequential loop technique) for this example. Good luck with the problem that you are attempting to solve.
sqxcrta
sqxjm <================ merge join
sqxsort
sqxsrc( WORK.MEDHIST(alias = B) )
sqxsort
sqxsrc( WORK.MEDS(alias = A) )
Thanks LexisC
You are welcome.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.