Trouble formatting merge keys

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Trouble formatting merge keys

I'm using SAS 9.3 and trying to merge two datasets. In dataset1, my merge key variable called "id" is originally formatted as a character $200.  In dataset2, the merge key variable also called "id" is formatted as a character $6. The values of id are actually numbers (for example, 001004 or 134875).

 

I have tried the following code that reformats the id in dataset1 and tries to perform the merge:

data dataset1; set dataset1;
newid=input(id,BEST12.);
newid2=put(newid,z6.);
run;

proc sql;
create table combo as
select a.*, b.var1, b.var2
from dataset2 as a left join dataset1 as b
on a.id = b.newid2;
quit;
run;

The above code appears to work at first, but fails to fully merge all the records in dataset1. The merge works correctly for low values of id, but fails when there are no longer any leading zeros. As a specific example, the merge works for id=099999 but not for id=100000. How can I format my id variable to make sure all records are appropriately merged?


Accepted Solutions
Solution
‎02-28-2018 11:49 AM
PROC Star
Posts: 266

Re: Trouble formatting merge keys

Why not convert the two variables to numeric in the SQL call?

proc sql;
  create table combo as
  select 
    dataset2.*, 
    dataset1.var1, 
    dataset1.var2
from 
  dataset2 left join dataset1
  on input(dataset2.id,best6.) = input(left(dataset1.id),best12.)
  ;
quit;
run;

And be careful with those SQL aliases. Everybody uses them, but they very often make the code harder to read - in this case dataset1 was mapped to "b" and dataset2 to "a",  very confusing, IMO. It is not that hard to use a change command to get rid of the aliases in the final code, and it normally makes everything much easier to read.

View solution in original post


All Replies
Super User
Posts: 23,771

Re: Trouble formatting merge keys

Check the lengths on your variables to see if they’re the same and use STRIP to remove any leading/trailing spaces. 

 

If if the other variable had a length of $200, perhaps the following may work:

 

Length newID $6.;
NewID = ID;

You’ll have to make the names match but hopefully that gives you the idea. 

Super User
Posts: 23,771

Re: Trouble formatting merge keys

Check the lengths on your variables to see if they’re the same and use STRIP to remove any leading/trailing spaces. 

 

If if the other variable had a length of $200, perhaps the following may work:

 

Length newID $6.;
NewID = ID;

You’ll have to make the names match but hopefully that gives you the idea. 

Esteemed Advisor
Posts: 5,540

Re: Trouble formatting merge keys

If both ID's are character then you don't need a transformation. Removing leading blanks should be enough:

 

proc sql;
create table combo as
select a.*, b.var1, b.var2
from dataset2 as a left join dataset1 as b
on left(a.id) = left(b.id);
quit;
PG
Solution
‎02-28-2018 11:49 AM
PROC Star
Posts: 266

Re: Trouble formatting merge keys

Why not convert the two variables to numeric in the SQL call?

proc sql;
  create table combo as
  select 
    dataset2.*, 
    dataset1.var1, 
    dataset1.var2
from 
  dataset2 left join dataset1
  on input(dataset2.id,best6.) = input(left(dataset1.id),best12.)
  ;
quit;
run;

And be careful with those SQL aliases. Everybody uses them, but they very often make the code harder to read - in this case dataset1 was mapped to "b" and dataset2 to "a",  very confusing, IMO. It is not that hard to use a change command to get rid of the aliases in the final code, and it normally makes everything much easier to read.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 136 views
  • 0 likes
  • 4 in conversation