How to implement the following in proc sql?
proc sort data = set1; by x, y; run;
proc sort data = set2; by x, y; run;
data set3;
merge set1 (in = a) set2 (in = b);
by x y;
if a and not b;
run;
I am trying to do this by group, and the group identifier is commonly available in set1 and set2.
Are there variables in common between set1 and set2 other than X and Y?
Data step merge will "update" common variable other than the by variables and if you need/want that behavior we need to know.
BTW your sort syntax is incorrect. No commas in a BY statement
Left or Right join.
See this previous questions for a good diagram (bottom) and reference to a paper that has more details.
https://communities.sas.com/t5/SAS-Procedures/Proc-SQL-equiv-of-Merge-If-A-or-B/td-p/255720
@xyxu wrote:
How to implement the following in proc sql?
proc sort data = set1; by x, y; run; proc sort data = set2; by x, y; run; data set3; merge set1 (in = a) set2 (in = b); by x y; if a and not b; run;
I am trying to do this by group, and the group identifier is commonly available in set1 and set2.
proc sql;
create table set3 as
select *
from
set1 a left join set2 b
on a.x=b.x and a.y=b.y
where a.x ne b.x and a.y ne b.y;
quit;
Note: I used * in select clause coz I am lazy to type and that will throw a warning because of ambiguity. However, you could type the list of variables that you need appropriately.
"My goal is to find all obs in set1 that does not have any match in set2 in terms of a pair of x and y."--
@xyxu Yeah OR is just fine too
on a.x=b.x and a.y=b.y
where a.x ne b.x or a.y ne b.y;
proc sql;
create table set3 as
select t1.*
from set1 t1 left join set2 t2
on t1.x = t2.x and t1.y = t2.y
where missing(t2.x) or missing(t2.y)
;
quit;
I suggest you stick with data step merge (in comparison to proc SQL).
The already have a code that needs some modifications as follows and you will be nearing your goal.
As suggested by @ballardw make corrections to the sort syntax.
In addition rename the common variables (other than by variables) in one of the datasets.
You may have to do home post processing to the merged dataset to get exactly what you want,
but you can easily achieve your objective of purging the rows of dataset2 from dataset1.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.