Help using Base SAS procedures

Merge criteria

Reply
Super Contributor
Posts: 401

Merge criteria

Hi, I'm trying to merge 2 data sets but need a criteria that if one of the merges don't match (or it doesn't have a match) then merge another field.. this is what I mean :

Have:

SET1

AREA    Prov   

Tor         Ont      

Osh       Ont      

Mon       Que     

Van        BC      

Win        Man    

Ont        Ont      

Que       Que     

Man      Man     

BC         BC      

SET2

AREA    VAL

Ont         2.0

Tor          3.5

Mon        4.4

Van        2.1

BC         3.2

Man        2.8

Que        1.9

Ont         2.5

Want: (merge SET1 and SET2)

SET3

AREA    Prov     VAL

Tor         Ont       3.5

Osh        Ont       2.5  <- here becasue there's no match for Osh in SET2 it uses the value for the PROV

Mon       Que       4.4

Van         BC       2.1

Win         Man     2.8  <- here becasue there's no match for Win in SET2 it uses the value for the PROV

Ont         Ont       2.5

Que        Que       1.9

Man        Man      2.8

BC          BC        3.2

Thanks

PROC Star
Posts: 7,471

Merge criteria

Ontario has two records in set 2.  Was that an error or is that the data you actually have?  If it is the latter, what do you want to do in such situations?

PROC Star
Posts: 7,471

Merge criteria

You haven't responded to my questions, thus I'm answering based on the situation where the duplicate entry was an error.  I think that the following accomplishes that which you want to do:

data SET1;

  input (AREA Prov) ($);

  cards;

Tor       Ont     

Osh       Ont     

Mon       Que    

Van       BC     

Win       Man   

Ont       Ont     

Que       Que    

Man       Man    

BC        BC     

;

data SET2;

  input AREA $ VAL;

  cards;

Tor        3.5

Mon        4.4

Van        2.1

BC         3.2

Man        2.8

Que        1.9

Ont        2.5

;

proc sort data=SET1;

  by area;

run;

proc sort data=SET2;

  by area;

run;

data found

     notfound (drop=val

               rename=(area=prov prov=area));

  merge set1 (in=ina) set2 (in=inb);

  by area;

  if ina and inb then output found;

  else if ina then output notfound;

run;

proc sort data=notfound;

  by area;

run;

data need (rename=(area=prov prov=area));

  merge notfound (in=ina) set2 (in=inb);

  by area;

  if ina and inb;

run;

data want;

  set found need;

run;

Super Contributor
Posts: 401

Re: Merge criteria

Hi art.. is rename=(area=prov prov=area)); neccessary ?, since I have another field that I must match by (Date) as well.  Thanks

PROC Star
Posts: 7,471

Re: Merge criteria

If you use Tom's proc sql approach you avoid that need.  There may be a way around it with a datastep approach, but I couldn't think of it.  Using a datastep approach one has to do something to merge on the switched variables.  My approach was simply to rename the variables.

As far as adding another variable, like date, you would have to again provide example data.

Super Contributor
Posts: 401

Re: Merge criteria

Sorry for the delay.  One was an error....

PROC Star
Posts: 7,471

Re: Merge criteria

Then either Tom's or my code should do what you want.

Super User
Super User
Posts: 7,042

Re: Merge criteria

Perhaps we should let Art answer this one as those values look like Canadian provinces. Smiley Wink

SQL is useful for doing complex joins.

create table SET3 as

  select distinct a.area,a.prov,coalesce(b.val,c.val) as val

  from set1 a

   left join set2 b

   on a.area = b.area

   left join set2 c

   on a.prov = c.area

;

Message was edited by: Tom Abernathy to fix typo in last ON condition

PROC Star
Posts: 7,471

Re: Merge criteria

Tom,

They are definitely Canadian cities and provinces.  Your proc sql code would need something more, though (probably a case statement), as it doesn't get the values for the two records where the city is missing from the original match.

I think my convoluted datastep approach provides the right result, but I'd be interested in seeing the proc sql equivalent that achieves the same result.

Art

Super User
Super User
Posts: 7,042

Re: Merge criteria

I helps if I join on the right condition.

Occasional Contributor
Posts: 6

Re: Merge criteria

I see you got some nice suggestions. It seems to me hash() could be one of possible approaches, since it does not require sorting and is capable of searching multiple variables at the same time.

data SET1;

infile cards;

input AREA $  Prov $;

cards;

Tor         Ont

Osh       Ont

Mon       Que

Van        BC

Win        Man

Ont        Ont

Que       Que

Man      Man

BC         BC

;

data SET2;

infile cards;

input AREA $  val;

cards;

Tor          3.5

Mon        4.4

Van        2.1

BC         3.2

Man        2.8

Que        1.9

Ont         2.5

;

data merged;

   if _n_=1 then do;

     if 0 then set set2;

declare hash merg (dataset: "set2");

merg.definekey("area");

merg.definedata("val");

merg.definedone();

   end;

set set1;

if merg.find(key:area)=0 then output;

else if merg.find(keySmiley Tonguerov)=0 then output;

else delete;

run;

proc print;run;

Kindly Regards,

Haikuo

Super Contributor
Posts: 401

Re: Merge criteria

Posted in reply to 2much2learn

Thanks 2m2l, what if I also have different dates that I need to match for each of the obs in Set1 and Set2..

eg.

SET1

AREA    Prov   Date   

Tor         Ont      2009.01

Tor         Ont      2009.02

Tor         Ont      2009.03

Osh       Ont       2009.01

Osh       Ont       2009.02

Osh       Ont       2009.03

Mon       Que      2009.01

Mon       Que      2009.02

Mon       Que      20009.03

 

SET2

AREA    VAL     Date

Ont         2.0      2009.01

Ont         2.1      2009.02

Ont         2.5      2009.03

Tor          3.5     2009.01

Tor          3.6      2009.02

Tor          3.5      2009.03

Mon        4.4      2009.01

Mon        4.2      2009.02

Mon        4.3      2009.03

Que        1.9      2009.01

Que        0.6      2009.02

Que        0.9      2009.03

Ont         2.5      2009.01

Ont         2.8      2009.02

Ont         3.4      2009.03

Thanks

PROC Star
Posts: 7,471

Re: Merge criteria

Using Tom's proc sql method, which I happen to prefer of the three methods that have been offered, I think that the following accomplishes what you need:

proc sql;

  create table SET3 as

    select distinct a.area,a.prov,a.date,

     coalesce(b.val,c.val) as val

      from set1 a

        left join set2 b

          on a.area = b.area and a.date = b.date

            left join set2 c

              on a.prov = c.area and a.date = c.date

;

quit;

Super User
Posts: 10,028

Re: Merge criteria

Here is Hash method. if you like it.

data SET1 ;
input AREA  $  Prov  $ Date $ ;
datalines;
Tor         Ont      2009.01
Tor         Ont      2009.02
Tor         Ont      2009.03
Osh       Ont       2009.01
Osh       Ont       2009.02
Osh       Ont       2009.03
Mon       Que      2009.01
Mon       Que      2009.02
Mon       Que      2009.03
;
run;

data SET2;
input AREA $   VAL     Date $ ;
datalines;
Ont         2.0      2009.01
Ont         2.1      2009.02
Ont         2.5      2009.03
Tor          3.5     2009.01
Tor          3.6      2009.02
Tor          3.5      2009.03
Mon        4.4      2009.01
Mon        4.2      2009.02
Mon        4.3      2009.03
Que        1.9      2009.01
Que        0.6      2009.02
Que        0.9      2009.03
Ont         2.5      2009.01
Ont         2.8      2009.02
Ont         3.4      2009.03
;
run;
data want(drop=rc rx); 
 if _n_ eq 1 then do;
  if 0 then set set2;
  declare hash ha(dataset:'set2',hashexp:10);
   ha.definekey('area','date');
   ha.definedata('val');
   ha.definedone();
 end;
set set1;
call missing(val);               
rc=ha.find();
if rc ne 0 then rx=ha.find(key:prov,key:date);
run;


Ksharp

Contributor
Posts: 32

Merge criteria

I can see that you have got a few replies already.

I've been quite happy using PROC SQL with a full outer join and the COALESCE() function on the common fields.

Suspect performance is less than ideal, but others will probably comment on that...

Ask a Question
Discussion stats
  • 14 replies
  • 317 views
  • 0 likes
  • 6 in conversation