DATA Step, Macro, Functions and more

match-merge doesn't work, and I don't know why

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

match-merge doesn't work, and I don't know why

I am trying to match-merge a small file consisting of lastname and firstname (plus other stuff) with a large reference table. The match is to be done on lastname and firstname (in that order).  Basically, I need to add some information from the reference to the records in the small file.

I've done everything I can think of to assure that the match variables are formatted identically -- same informat, same format, trimmed out leading and trailing blanks -- but the match-merge doesn't work. Yes, the two files are both sorted on the two match variables. Yes, I did include a BY statement, using the same two variables in the same order. It looks like SAS is concatenating the files instead of merging them -- I get all the records from one file with all variables from the other as missing values, followed by the records from the other table (without the non-matched variables from the first one). If I use the (IN=) option for the small file, I simply get the same file with the addition of blanks in the variables brought in from the other file. The reverse happens if I use (IN=) on the large file.

To summarize, the small file was created this way:

DATA WORK.small;

    INFILE '[file path].csv  DELIMITER= ','  MISSOVER  DSD  LRECL=32767  FIRSTOBS=2;

        INFORMAT ID best32. ;

        INFORMAT Lastname $30. ;

        INFORMAT Firstname $30. ;

        FORMAT ID best12. ;

        FORMAT Lastname $30. ;

        FORMAT Firstname $30. ;

    INPUT  ID  Lastname $  Firstname $ ;

run;

And here is how the reference file was read in:

DATA WORK.large;

     INFILE '[file path].csv'  DELIMITER= ','  MISSOVER  DSD  LRECL=32767  FIRSTOBS=2;

        INFORMAT add_this_info  $11. ;

        INFORMAT Lastname $30. ;

        INFORMAT Firstname $30. ;

        INFORMAT Name $40. ;

        FORMAT add_this_info  $11. ;

        FORMAT Lastname $30. ;

        FORMAT Firstname $30. ;

        FORMAT Name $40. ;

        FORMAT add_this_info  $11. ;

    INPUT  add_this_info $  Lastname $  Firstname $  Name $ ;

run;

Both files were sorted on lastname and firstname (in that order), and then I wrote the match-merge:

DATA match_merge;

     MERGE large small(IN=keep);

     BY lastname firstname;

     IF keep=1;

run;

For testing, the IF statement may or may not appear.

I've even concatenated the lastname and firstname into a single field (that's not the "Name" field in the large file), using the COMPRESS and TRIM functions to assure that the fields are exactly alike. Same thing happens.

It is absolutely certain that some records in the small file have corresponding records in the large file -- I can eyeball that. But SAS doesn't see what I do.

Why??


Accepted Solutions
Solution
‎08-11-2015 04:21 AM
Super User
Posts: 7,782

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

When you have problems matching variables, convert their values into new variables with a $HEX. format and look at the hex values. You will see previously undetectable characters/circumstances that cause the comparison to fail.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 11,343

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

did you try

MERGE  small(IN=keep) large;

instead of

MERGE large small(IN=keep);

order of appearance in the merge statement is important

Super User
Posts: 19,814

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

Try a SQL merge to test if it's something wrong with your data or your data step.

proc sql;

create table want as

select a.*, b.ID

from a

left join b

on a.first_name=b.first_name

and a.last_name=b.last_name

order by b.last_name;

quit;

Super User
Posts: 5,507

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

Have you tried getting rid of the INFORMAT statements?

The FORMAT statements should be sufficient to establish the lengths of the variables.

Solution
‎08-11-2015 04:21 AM
Super User
Posts: 7,782

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

When you have problems matching variables, convert their values into new variables with a $HEX. format and look at the hex values. You will see previously undetectable characters/circumstances that cause the comparison to fail.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
PROC Star
Posts: 1,167

Re: match-merge doesn't work, and I don't know why

Posted in reply to KurtBremser

Very nice! That's definitely going in my toolbox.

Tom

New Contributor
Posts: 2

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

Yes! That was the problem. There was a hidden hex character in the small file's lastname field.

Interestingly (or not) I couldn't see it if I read the flat file as a CSV, but it was visible if read as a TXT file.

So in the end, I did make a mistake ... but a far different one than I thought I might be making.

Super User
Posts: 11,343

Re: match-merge doesn't work, and I don't know why

Posted in reply to NealTinWA

Let me guess: you looked at the CSV with a spreadsheet program. They do tend to obscure some things.

You can look at CSV with NOTEPAD or similar plain text editors (not Wordpad which supports additional appearance stuff) as well.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 1141 views
  • 8 likes
  • 6 in conversation