data monthly.summary_ead; merge summary_ead(in=a) final_ead(in=b rename=(dawccf=final_dawccf)); by portfolio product ead; if a and not b then final_dawccf=dawccf; if not a and b then do; cnt=0; end; run; translated into SQL; Select *, b.dawccf as b.final_dawccf case when (a.portfolio is not null and a.product is not null and a.ead is not null) and (b.portfolio is null or b.product is null or b.ead is null) then final_dawccf=dawccf, case when (a.portfolio is null or a.product is null or a.ead is null) and (b.portfolio is not null and b.product is not null or b.ead is not null) then cnt=0, from summary_ead a full join final_ead b on a.portfolio=b.portfolio and a.product=b.product and a.ead=b.ead
Hi all,
I try to translate sas into sql but not if it is correct. Please could you give me advice. Thanks.
Do the three variables ( portfolio product ead ) uniquely identify the observations in one or both of the input datasets?
If not then you cannot replicate the MERGE with any type of join in SQL code.
Is your question how to translate these two IF/THE statements?
if a and not b then final_dawccf=dawccf;
if not a and b then cnt=0;
What values do FINAL_DAWCCF and CNT get when the conditions in your IF statement are false?
Is CNT coming in from one of those two datasets? If not then CNT will be missing.
Do you need both DAWCCF and FINAL_DAWCCF?
Are there other variables besides the five (six?) you have mentioned in the code?
In the SQL code you have to add something to coalesce the key variables. You cannot select both A.EAD and B.EAD like you are doing since the target dataset cannot have two variables named EAD.
You also need to make sure to include a comma between each variable listed in the SELECT clause. To make it easier to see whether you have or not make sure to place them at the BEGINNING of the line instead of the END of the line. Then you can quickly scan the code to insure it is valid. It is much harder to scan the jagged end of lines looking for commas than to scan the neat beginning of the lines.
So perhaps something like this:
create table monthly.summary_ead_sql as
select
    coalesce(a.portfolio,b.portfolio) as portfolio
  , coalesce(a.product,b.product) as product
  , coalesce(a.ead,b.ead) as ead
  , case when (not missing(a.portfolio) and missing(b.portfolio)) 
         then a.dawccf 
         else b.dawccf 
    end as final_dawccf
  , case when (missing(a.portfolio) and not missing(b.portfolio)) 
         then 0
         else null  /* ?? a.cnt ?? */
    end as cnt
from summary_ead a 
full join final_ead b
  on a.portfolio=b.portfolio and a.product=b.product and a.ead=b.ead
;
But testing if PORTFOLIO is missing can only mimic the IN= flag variables if PORTFOLIO cannot have missing values in either dataset.
Can the CNT and DAWCCF variables ever be missing? Perhaps you can skip the CASE and just use COALESCE() instead?
  , coalesce(b.dawccf,a.dawccf) as dawccf 
  , coalesce(a.cnt,0) as cnt
My question is: why?
If it ain't broke, don't fix it...
MERGE has behaviors that SQL Join doesn't. For instance variables common to both data sets other than the by variables replace values based on order of data set names appearing on the MERGE statement. To get that behavior you need to use COALESCE or COALESCEC function for every common variable.
Behavior of FULL JOIN is not what MERGE does as repeats of BY variables will result in very different matching than SQL join.
Without example data and expected results it is extremely hard to suggest an SQL solution that does match your expectation or need.
And I think we've been over this and similar issues related to MERGE vs SQL before.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
