How can we flag the source of each observation after merging two datasets using PROC SQL? The flag variable should identify whether the observation comes from both datasets or from only one of them.
Use CASE logic:
proc sql;
  select case
           when not missing(A.column) and not missing(B.column) then 'BOTH'
           when not missing(A.column) then 'FROMA'
           when not missing(B.column) then 'FROMB'
           else ' '
         end as Source_Flag length = 5
  from table1 as A
  /* PROC SQL needs LEFT, RIGHT or FULL before JOIN; FULL keeps rows from both tables */
  full join table2 as B
  ....;
quit;
The easiest way is to use a DATA step MERGE instead, since the IN= dataset option tracks which dataset contributed to each observation. Code like this creates a FLAG variable with the values 1 (A only), 2 (B only) or 3 (both).
data want;
  merge a(in=in1) b(in=in2);
  by key;  /* KEY stands for your matching variable(s); both datasets must be sorted by them */
  flag = 2*in2 + in1;
run;
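To see the three flag values in action, here is a small self-contained sketch. The contents of A and B and the variable names KEY, X and Y are made up for illustration, and the BY statement assumes both datasets are already sorted by KEY.

```sas
/* Hypothetical sample data, already in KEY order */
data a;
  input key x;
datalines;
1 10
2 20
;

data b;
  input key y;
datalines;
2 200
3 300
;

data want;
  merge a(in=in1) b(in=in2);
  by key;
  /* IN= variables are 1 or 0, so the arithmetic is safe here */
  flag = 2*in2 + in1;  /* 1 = A only, 2 = B only, 3 = both */
run;

proc print data=want;
run;
```

With this data, KEY=1 gets FLAG=1, KEY=2 gets FLAG=3 and KEY=3 gets FLAG=2.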
You could do something similar in SQL code with a little work.
Say you had this SQL code:
proc sql;
create table want as
select ...variable list...
from A
full join B
on (...criteria...)
;
You could add those IN1 and IN2 flags and then create the new FLAG variable the same way, like this:
create table want as
select ...variable list...
,case when (in1 and in2) then 3 when (in1) then 1 else 2 end as FLAG
from (select *,1 as in1 from A ) A
full join (select *,1 as in2 from B) B
on (...criteria...)
;
You need the CASE instead of the simple arithmetic because the IN1 and IN2 variables will be coded as 1 and missing instead of the 1 and 0 that they would have had when created by the IN= dataset option.
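A minimal sketch of that point, using made-up datasets A and B with a hypothetical KEY variable. For a row found in only one table, the flag column coming from the other table is missing, so the arithmetic version evaluates to missing while the CASE version still returns 1 or 2.

```sas
/* Hypothetical sample data */
data a;
  input key x;
datalines;
1 10
2 20
;

data b;
  input key y;
datalines;
2 200
3 300
;

proc sql;
  create table want as
  select A.key as key_a
        ,B.key as key_b
        /* missing whenever IN1 or IN2 is missing, i.e. for unmatched rows */
        ,2*in2 + in1 as flag_arith
        /* a missing value is treated as false, so this always yields 1, 2 or 3 */
        ,case when (in1 and in2) then 3
              when (in1) then 1
              else 2
         end as flag
  from (select *, 1 as in1 from a) A
  full join (select *, 1 as in2 from b) B
    on A.key = B.key
  ;
quit;
```

Here FLAG_ARITH is 3 only for the matched row (KEY=2) and missing for the other two, while FLAG is 1, 3 and 2 for keys 1, 2 and 3.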
Thanks for the code.
When I ran the last set of code with PROC SQL, the flag variable was created correctly, but the matching variable is missing when flag=2.
You need to use the COALESCE function for your key variable(s) in the SELECT.
select
coalesce(a.key,b.key) as key,
.....
The combined dataset now contains only the key variable.
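If the result holds only the key column, a likely cause is that the SELECT list stopped at the COALESCE expression. The "....." in the snippet stands for the remaining columns, which still have to be listed. A sketch under assumptions: the datasets A and B, the shared KEY and the extra variables X and Y are hypothetical, so substitute your own names.

```sas
/* Hypothetical sample data */
data a;
  input key x;
datalines;
1 10
2 20
;

data b;
  input key y;
datalines;
2 200
3 300
;

proc sql;
  create table want as
  select coalesce(A.key, B.key) as key  /* key is filled in even for B-only rows */
        ,A.x                            /* remaining columns must be listed explicitly */
        ,B.y
        ,case when (in1 and in2) then 3
              when (in1) then 1
              else 2
         end as flag
  from (select *, 1 as in1 from a) A
  full join (select *, 1 as in2 from b) B
    on A.key = B.key
  ;
quit;
```

The result keeps KEY, X, Y and FLAG; X is missing for B-only rows and Y is missing for A-only rows.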