Help using Base SAS procedures

Merge

Accepted Solution Solved
Reply
Super Contributor
Posts: 259
Accepted Solution

Merge

Hi I have two data sets that i want to merge. The first is called Common and the second is Link and they both have a common key "Index". Common is sorted by index but there are multiple rows in Common with the same index whereas the each row in Link has a unique index. I only want the information in Link to merge onto the first unique index and blank for other rows in Common where the index is repeated. Example below:

Common

IndexField2Field3
1
1
1

Link

IndexField3
1

100

2200
3100

Merged File I Want

IndexField1Field2Field3
1100
10
10

Accepted Solutions
Solution
‎08-18-2014 06:47 AM
Super User
Super User
Posts: 7,413

Re: Merge

Yep, just pop a case around the assign:

proc sql;

     create table want as

     select     A.*,

                   case     when B.FIELD3=. then 0 else B.FIELD end as FIELD3

     from      COMMON A

     left join  (select 1 as FLAG,* from LINK) B

     on          A.INDEX=B.INDEX

     and        A.FLAG=B.FLAG;

quit

View solution in original post


All Replies
Super User
Super User
Posts: 7,413

Re: Merge

That's pretty odd.  Anyways:

sort common / link; by index;

data want;

     merge common link;

     by index;

run;

data want;

     set want;

     by index;

     if not first.index then field3=0;

run;

Or you could assign a flag:

data common;

     by index;

     if first.index then flag=1;

run;

then merge based on that:

proc sql;

     create table want as

     select     A.*,

                   B.FIELD3

     from      COMMON A

     left join  (select 1 as FLAG,* from LINK) B

     on          A.INDEX=B.INDEX

     and        A.FLAG=B.FLAG;

quit

Super Contributor
Posts: 259

Re: Merge

Thanks that works perfectly. There are about 6 numeric felds on the link file which merges on. Is there a way that you can make the cells 0 when the flag is not 1. At the moment, I have values where the flag is 1 and blank cells where the flag is 0 for all the fields merging on from link.

Many thanks

Solution
‎08-18-2014 06:47 AM
Super User
Super User
Posts: 7,413

Re: Merge

Yep, just pop a case around the assign:

proc sql;

     create table want as

     select     A.*,

                   case     when B.FIELD3=. then 0 else B.FIELD end as FIELD3

     from      COMMON A

     left join  (select 1 as FLAG,* from LINK) B

     on          A.INDEX=B.INDEX

     and        A.FLAG=B.FLAG;

quit

Super Contributor
Posts: 259

Re: Merge

Thanks but if I want this to apply to all fields merging on (not just field3) do I use

case when B.*=. then 0 else B.*...

Super User
Super User
Posts: 7,413

Re: Merge

Well, your probably better off using the datastep method if you want to find shortcuts.  With SQL you should really know what you want out at the end, where from etc. rather than using shortcuts or wildcards.  The problem with the above is that you are not operating on a group, just an ad-hoc selection - for example if the order of common changes then it wouldn't be the same value there.  Makes it quite difficult logically to link the two.  

So yes, you would need one case for each variable, the principal being that you should be specifying what should come out anyways in the select:

proc sql;

     create table WANT as

     select     A.VAR1,

                   A.VAR2,

                   A.VAR3,

                   case when B.VAR1 is null then 0 else B.VAR1 end as VAR1,

                   case when B.VAR2 is null then 0 else B.VAR2 end as VAR2,

...

There are plusses and minuses for datastep syntax vs SQL.  You could also generate the required code using call execute.

Trusted Advisor
Posts: 1,204

Re: Merge

data common;
infile datalines missover;
input index field1 field2 field3;
datalines;
1 . . .
1 . . .
1 . . .
2 . . .
2 . . .
3 . . .
;

data link;
input index field3;
datalines;
1 100
2 200
3 100
;


data want;
update common link;
by index;
if field3=. then field3=0;
run;

Super User
Super User
Posts: 6,502

Re: Merge

So basically you want to defeat the normal behavior of SAS where the values from the datasets with fewer observations are effectively retained.

One simple way is to OUTPUT the record and then reset the values to missing (or zero if you want).  Then this will be the value that is retained onto all of the extra rows.  When you read a new record from the table that contributed that field (or fields) then the zero will be overwritten with the real value.

data want ;

  merge common link ;

  by index ;

  output;

  field3=0;

run;

Super User
Posts: 5,085

Re: Merge

IMO, Tom's solution is easy and expandable to more variables.  However, note that your original post shows that FIELD3 is also part of COMMON.  If that is the case, you will need to get rid of it.  In its shortest form, the DATA step could be:

data want;

  field3=0;

  merge common (drop=field3) link;

  by index;

run;

There are further complications if any of the additional fields in LINK are actually character instead of numeric.

Finally, the result you are seeking is unusual.  If you post some idea of why you need it, there may be other ways to get you where you need to go.

Good luck.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 280 views
  • 0 likes
  • 5 in conversation