- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I am comparing two datasets using coalesce but in the output result, I see multiple records with same debt_code. I want to check if the name and surname columns of accounts in Test2 matches with the name in Test1 table. If it matches then flag it as Yes otherwise No. Here is a sample dataset. The code runs successfully but gives multiple records of the same debt_code. Can you please help?
Data Test_1;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $;
datalines ;
79968 379556913 131 Paul Kirby
98020 335544664 131 Helen Woods
123015 363876681 131 Steven Crone
;
run;
Data Test_2;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $;
datalines ;
79968 279618771 005H Pauls Kirby
79968 328710207 899 Paul Kirby
79968 334155405 899 Paul Kirby
79968 340181650 899 Paul Kirby
79968 347829640 899 Paul Kirby
98020 233107515 005H Helen Woods
98020 187724349 005H Helen Wood
123015 188273239 005H SteveA Crone
123015 166978189 005H Steven Crone
123015 174441956 005H Steven Crone
;
run;
proc sql;
create table Comparing_Names as
select coalesce(a.icustomerid,b.icustomerid) as icustomerid
, a.debt_code
, a.rep_code as rep_code1
, a.dr_inits as first_name1
, a.dr_name as last_name1
, b.debt_code
, b.rep_code as rep_code2
, b.dr_inits as first_name2
, b.dr_name as last_name2
, case when (a.dr_name=b.dr_name and a.dr_inits=b.dr_inits) then 'YES' else 'NO' end as Match
from Test1 a
full join Test2 b
on a.icustomerid=b.icustomerid
and a.dr_name=b.dr_name
and a.dr_inits=b.dr_inits
order by 1,2
;
quit;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Sandeep77 wrote:
Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.
So instead of a FULL join use a LEFT or RIGHT join
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Coalesce has nothing to do with creating multiple records.
Your SQL join does that.
Actually this is a waste of clock cycles
coalesce(a.icustomerid,b.icustomerid) as icustomerid
combined with
full join Test2 b on a.icustomerid=b.icustomerid
Your Join has limited the data to where the two values are the same. So the Coalesce will always returm a.icustomerid as icostomerid.
Provide an example of what you want the output to look like.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The Test_2 table should compare the name and surname with Test_1 and give the output yes if the name and surname is same. Here is a sample dataset for output based on Test_1 and Test_2:
Data Output;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $ Match $;
datalines ;
79968 279618771 005H Pauls Kirby No
79968 328710207 899 Paul Kirby Yes
79968 334155405 899 Paul Kirby Yes
79968 340181650 899 Paul Kirby Yes
79968 347829640 899 Paul Kirby Yes
98020 233107515 005H Helen Woods Yes
98020 187724349 005H Helen Wood No
123015 188273239 005H SteveA Crone No
123015 166978189 005H Steven Crone Yes
123015 174441956 005H Steven Crone Yes
;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It's not a waste of clock cycles for a full join. If a.icustomerid is missing, then coalesce chooses the value of b.customerid.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Sandeep77 wrote:
Hi all,
I am comparing two datasets using coalesce but in the output result, I see multiple records with same debt_code. I want to check if the name and surname columns of accounts in Test2 matches with the name in Test1 table. If it matches then flag it as Yes otherwise No. Here is a sample dataset. The code runs successfully but gives multiple records of the same debt_code. Can you please help?
The fact that you are getting extra records was already explained in your previous thread by @Tom
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Sandeep77 wrote:
Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.
So instead of a FULL join use a LEFT or RIGHT join
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Does coalesce statement create multiple records while comparing datasets?
Hello, @Sandeep77, the first thing you should do is check the documentation for COALESCE. The answer to your question is the first sentence in the documentation: "Returns the first nonmissing value from a list of numeric arguments".
Paige Miller