New SAS User

Sandeep77 · Posted 06-06-2023 10:12 AM

Hi all,

I am comparing two datasets using coalesce but in the output result, I see multiple records with same debt_code. I want to check if the name and surname columns of accounts in Test2 matches with the name in Test1 table. If it matches then flag it as Yes otherwise No. Here is a sample dataset. The code runs successfully but gives multiple records of the same debt_code. Can you please help?

Data Test_1;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $;
datalines ;
79968 379556913 131 Paul Kirby
98020 335544664 131 Helen Woods
123015 363876681 131 Steven Crone
;
run;

Data Test_2;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $;
datalines ;
79968 279618771 005H Pauls Kirby
79968 328710207 899 Paul Kirby
79968 334155405 899 Paul Kirby
79968 340181650 899 Paul Kirby
79968 347829640 899 Paul Kirby
98020 233107515 005H Helen Woods
98020 187724349 005H Helen Wood
123015 188273239 005H SteveA Crone
123015 166978189 005H Steven Crone
123015 174441956 005H Steven Crone
;
run;

proc sql;
create table Comparing_Names as 
select coalesce(a.icustomerid,b.icustomerid) as icustomerid
     , a.debt_code
     , a.rep_code as rep_code1
     , a.dr_inits as first_name1
     , a.dr_name  as last_name1
     , b.debt_code
     , b.rep_code as rep_code2
     , b.dr_inits as first_name2
     , b.dr_name as last_name2
     , case when (a.dr_name=b.dr_name and a.dr_inits=b.dr_inits) then 'YES' else 'NO' end as Match
from Test1 a
full join Test2 b
   on a.icustomerid=b.icustomerid
  and a.dr_name=b.dr_name
  and a.dr_inits=b.dr_inits
order by 1,2
;
quit;

Tom · Posted 06-06-2023 11:47 AM

@Sandeep77 wrote:

Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.

So instead of a FULL join use a LEFT or RIGHT join

View solution in original post

ballardw · Posted 06-06-2023 10:35 AM

Coalesce has nothing to do with creating multiple records.

Your SQL join does that.

Actually this is a waste of clock cycles

 coalesce(a.icustomerid,b.icustomerid) as icustomerid

combined with

full join Test2 b
   on a.icustomerid=b.icustomerid

Your Join has limited the data to where the two values are the same. So the Coalesce will always returm a.icustomerid as icostomerid.

Provide an example of what you want the output to look like.

Sandeep77 · Posted 06-06-2023 10:40 AM

The Test_2 table should compare the name and surname with Test_1 and give the output yes if the name and surname is same. Here is a sample dataset for output based on Test_1 and Test_2:

Data Output;
infile cards expandtabs;
input icustomerid debt_code rep_code $ Name $ Surname $ Match $;
datalines ;
79968 279618771 005H Pauls Kirby No
79968 328710207 899 Paul Kirby Yes
79968 334155405 899 Paul Kirby Yes
79968 340181650 899 Paul Kirby Yes
79968 347829640 899 Paul Kirby Yes
98020 233107515 005H Helen Woods Yes
98020 187724349 005H Helen Wood No
123015 188273239 005H SteveA Crone No
123015 166978189 005H Steven Crone Yes
123015 174441956 005H Steven Crone Yes
;
run;

PaigeMiller · Posted 06-06-2023 10:45 AM

It's not a waste of clock cycles for a full join. If a.icustomerid is missing, then coalesce chooses the value of b.customerid.

--
Paige Miller

PaigeMiller · Posted 06-06-2023 10:48 AM

@Sandeep77 wrote:

Hi all,

I am comparing two datasets using coalesce but in the output result, I see multiple records with same debt_code. I want to check if the name and surname columns of accounts in Test2 matches with the name in Test1 table. If it matches then flag it as Yes otherwise No. Here is a sample dataset. The code runs successfully but gives multiple records of the same debt_code. Can you please help?

The fact that you are getting extra records was already explained in your previous thread by @Tom

--
Paige Miller

Sandeep77 · Posted 06-06-2023 10:55 AM

Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.

Tom · Posted 06-06-2023 11:47 AM

@Sandeep77 wrote:

Hi, Can you help how can I get the output result as shown previously (Data output). Here I want only the records from Test_2 table and an extra column named match which can show if the name and surname are same as Test_1 dataset.

So instead of a FULL join use a LEFT or RIGHT join

PaigeMiller · Posted 06-06-2023 12:43 PM

Does coalesce statement create multiple records while comparing datasets?

Hello, @Sandeep77, the first thing you should do is check the documentation for COALESCE. The answer to your question is the first sentence in the documentation: "Returns the first nonmissing value from a list of numeric arguments".

--
Paige Miller

New SAS User

Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

Re: Does coalesce statement create multiple records while comparing datasets?

create a macro to iterate through multiple keywords and datasets

Multiple by statements

%else on multiple %ifs statements

Create single records from multiple rows

Compare two datasets and find the non matching records

Follow Us

What is...

New SAS User

Register Today!

Follow Us

What is...