Solved: Re: Using Coalesce function. Output which table the value came from?

mcook · Posted 02-04-2022 09:09 AM

I Have code similar to the following, using the coalesce function to join Value1 to taple Temp_Original from 6 possible tables.

Is there a way to also have it output which table the value was taken from?
I could write separate code to check the if the value is missing from each table in sequence, and output the Table name. But I was hoping there might be a smoother way to do it, perhaps using some saved value from the coalesce function indicating the placement of the first non empty value.

any ideas?

proc sql noprint;
create table Temp_Join as
select A.*, 
Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1,
Coalesce(B.Value2,C.Value2,D.Value2,E.Value2,F.Value2,G.Value2) as Value2
from Temp_Original as A 
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue;
quit;

Tom · Posted 02-04-2022 11:08 AM

Use WHICHN() , or if the values are string WHICHC().

proc sql noprint;
create table Temp_Join as
select A.*
     , Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1
     , whichn(calculated value1,B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Which1
from Temp_Original as A 
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue
;
quit;

So WHICH1 will have 1 if the value came from B, 2 if it came from C, etc.

View solution in original post

CarmineVerrell · Posted 02-04-2022 10:17 AM

No easy way to do this, or none that i know of with Proc SQL . 😞

Here is one way that i know of doing this, unfortunately, you didn't provide any sample data so hard to test but here is some code to start you off.

I would use the data step to accomplish this task as the data step has much more flexibility and control when processing data

/*Make sure all data is sorted in this example its*/ /*by SomeValue */

/*Do this for each file - if not all ready sorted*/ /*below is one example*/ /*proc sort data=Temp_original;*/ /*by somevalue;*/ /*run;*/ Data Temp_Join asselect; Merge Temp_Original(in=INOriginal) Temp1(in=INTemp1) Temp2(in=INTemp2) Temp3(in=INTemp3) Temp4(in=INTemp4) Temp5(in=INTemp5) Temp6(in=INTemp6); if INOriginal=1 then fromfile=0;

if INTemp1=1 then fromfile=1;

if INTemp2=1 then fromfile=2; if INTemp3=1 then fromfile=3;

if INTemp4=1 then fromfile=4;

if INTemp5=1 then fromfile=5; if INTemp6=1 then fromfile=6; if INOriginal=1; /*Keeping all by variables that match the Original file*/ run;

Tom · Posted 02-04-2022 11:08 AM

Use WHICHN() , or if the values are string WHICHC().

proc sql noprint;
create table Temp_Join as
select A.*
     , Coalesce(B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Value1
     , whichn(calculated value1,B.Value1,C.Value1,D.Value1,E.Value1,F.Value1,G.Value1) as Which1
from Temp_Original as A 
left join Temp1 as B on A.SomeValue = B.SomeValue
left join Temp2 as C on A.SomeValue = C.SomeValue
left join Temp3 as D on A.SomeValue = D.SomeValue
left join Temp4 as E on A.SomeValue = E.SomeValue
left join Temp5 as F on A.SomeValue = F.SomeValue
left join Temp6 as G on A.SomeValue = G.SomeValue
;
quit;

So WHICH1 will have 1 if the value came from B, 2 if it came from C, etc.

Tom · Posted 02-04-2022 11:20 AM

You could use data step UPDATE statement instead.

Let's create some test datasets.

data have;
  do id=1 to 5; output; end;
run;

data test1;
  input id value;
cards;
1 .
2 4
;

data test2;
  input id value;
cards;
1 3
2 5
3 .
;
data test3;
  input id value;
cards;
1 6
2 7
3 8
4 .
;

Now stack the "test" data together into a transaction dataset. Use the INDSNAME= option of the SET statement to find the name of that dataset that is supplying each record.

data trans;
  set test1-test3 indsname=indsname;
  by id;
  where not missing(value);
  dsname=indsname;
run;

Now apply these transactions to you original dataset. Keep track of whether or not a non missing value has been found and only write that observation instead of the normal UPDATE behavior of just writing the final observation for the group.

data want;
  update have trans;
  by id;
  if first.id then found=0;
  if not missing(value) and not found then do; found+1; output; end;
  if last.id and not found then output;
run;

Results

Obs    id    value      dsname      found

 1      1      3      WORK.TEST2      1
 2      2      4      WORK.TEST1      1
 3      2      5      WORK.TEST2      1
 4      3      .                      0
 5      4      .                      0
 6      5      .                      0

Using Coalesce function. Output which table the value came from?

Re: Using Coalesce function. Output which table the value came from?

Re: Using Coalesce function. Output which table the value came from?

Re: Using Coalesce function. Output which table the value came from?

Re: Using Coalesce function. Output which table the value came from?

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away