I Have code similar to the following, using the coalesce function to join Value1 to taple Temp_Original from 6 possible tables.
Is there a way to also have it output which table the value was taken from?
I could write separate code to check the if the value is missing from each table in sequence, and output the Table name. But I was hoping there might be a smoother way to do it, perhaps using some saved value from the coalesce function indicating the placement of the first non empty value.
any ideas?
proc sql noprint;
create table Temp_Join as
select A.*,
Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1,
Coalesce(B.Value2,C.Value2,D.Value2,E.Value2,F.Value2,G.Value2) as Value2
from Temp_Original as A
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue;
quit;
Use WHICHN() , or if the values are string WHICHC().
proc sql noprint;
create table Temp_Join as
select A.*
, Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1
, whichn(calculated value1,B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Which1
from Temp_Original as A
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue
;
quit;
So WHICH1 will have 1 if the value came from B, 2 if it came from C, etc.
No easy way to do this, or none that i know of with Proc SQL . 😞
Here is one way that i know of doing this, unfortunately, you didn't provide any sample data so hard to test but here is some code to start you off.
I would use the data step to accomplish this task as the data step has much more flexibility and control when processing data
/*Make sure all data is sorted in this example its*/
/*by SomeValue */
/*Do this for each file - if not all ready sorted*/
/*below is one example*/
/*proc sort data=Temp_original;*/
/*by somevalue;*/
/*run;*/
Data Temp_Join asselect;
Merge Temp_Original(in=INOriginal)
Temp1(in=INTemp1)
Temp2(in=INTemp2)
Temp3(in=INTemp3)
Temp4(in=INTemp4)
Temp5(in=INTemp5)
Temp6(in=INTemp6);
if INOriginal=1 then fromfile=0;
if INTemp1=1 then fromfile=1;
if INTemp2=1 then fromfile=2;
if INTemp3=1 then fromfile=3;
if INTemp4=1 then fromfile=4;
if INTemp5=1 then fromfile=5;
if INTemp6=1 then fromfile=6;
if INOriginal=1; /*Keeping all by variables that match the Original file*/
run;
Use WHICHN() , or if the values are string WHICHC().
proc sql noprint;
create table Temp_Join as
select A.*
, Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1
, whichn(calculated value1,B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Which1
from Temp_Original as A
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue
;
quit;
So WHICH1 will have 1 if the value came from B, 2 if it came from C, etc.
You could use data step UPDATE statement instead.
Let's create some test datasets.
data have;
do id=1 to 5; output; end;
run;
data test1;
input id value;
cards;
1 .
2 4
;
data test2;
input id value;
cards;
1 3
2 5
3 .
;
data test3;
input id value;
cards;
1 6
2 7
3 8
4 .
;
Now stack the "test" data together into a transaction dataset. Use the INDSNAME= option of the SET statement to find the name of that dataset that is supplying each record.
data trans;
set test1-test3 indsname=indsname;
by id;
where not missing(value);
dsname=indsname;
run;
Now apply these transactions to you original dataset. Keep track of whether or not a non missing value has been found and only write that observation instead of the normal UPDATE behavior of just writing the final observation for the group.
data want;
update have trans;
by id;
if first.id then found=0;
if not missing(value) and not found then do; found+1; output; end;
if last.id and not found then output;
run;
Results
Obs id value dsname found 1 1 3 WORK.TEST2 1 2 2 4 WORK.TEST1 1 3 2 5 WORK.TEST2 1 4 3 . 0 5 4 . 0 6 5 . 0
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.