How to implement the following in proc sql?
proc sort data = set1; by x, y; run;
proc sort data = set2; by x, y; run;
data set3;
merge set1 (in = a) set2 (in = b);
by x y;
if a and not b;
run;
I am trying to do this by group, and the group identifier is commonly available in set1 and set2.
Are there variables in common between set1 and set2 other than X and Y?
Data step merge will "update" common variable other than the by variables and if you need/want that behavior we need to know.
BTW your sort syntax is incorrect. No commas in a BY statement
Left or Right join.
See this previous questions for a good diagram (bottom) and reference to a paper that has more details.
https://communities.sas.com/t5/SAS-Procedures/Proc-SQL-equiv-of-Merge-If-A-or-B/td-p/255720
@xyxu wrote:
How to implement the following in proc sql?
proc sort data = set1; by x, y; run; proc sort data = set2; by x, y; run; data set3; merge set1 (in = a) set2 (in = b); by x y; if a and not b; run;
I am trying to do this by group, and the group identifier is commonly available in set1 and set2.
proc sql;
create table set3 as
select *
from
set1 a left join set2 b
on a.x=b.x and a.y=b.y
where a.x ne b.x and a.y ne b.y;
quit;
Note: I used * in select clause coz I am lazy to type and that will throw a warning because of ambiguity. However, you could type the list of variables that you need appropriately.
"My goal is to find all obs in set1 that does not have any match in set2 in terms of a pair of x and y."--
@xyxu Yeah OR is just fine too
on a.x=b.x and a.y=b.y
where a.x ne b.x or a.y ne b.y;
proc sql;
create table set3 as
select t1.*
from set1 t1 left join set2 t2
on t1.x = t2.x and t1.y = t2.y
where missing(t2.x) or missing(t2.y)
;
quit;
I suggest you stick with data step merge (in comparison to proc SQL).
The already have a code that needs some modifications as follows and you will be nearing your goal.
As suggested by @ballardw make corrections to the sort syntax.
In addition rename the common variables (other than by variables) in one of the datasets.
You may have to do home post processing to the merged dataset to get exactly what you want,
but you can easily achieve your objective of purging the rows of dataset2 from dataset1.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.