DATA Step, Macro, Functions and more

Append two files on a index without duplicates

Reply
Contributor
Posts: 63

Append two files on a index without duplicates

I want to append a master file with update file based on index (Company Year)

File Master

Company             Year                      Revenue

A                            1991                     100

A                            1992                     200

A                            1993                     300

C                            1991                     100

 

File update

Company             Year                      Revenue

A                            1991                     100

A                            1993                     300

B                            1991                     50

B                            1992                     100

C                            1991                     100

       

Output

Company             Year                      Revenue

A                            1991                     100

A                            1992                     200

A                            1993                     300

B                            1991                     50

B                            1992                     100

C                            1991                     100

I am using this code - but it does not remove duplicates

proc sql;

create table want as

select * from Master

  union all corr

( select * from update

   except all

  select * from master)

;

run;

Super User
Posts: 11,343

Re: Append two files on a index without duplicates

Posted in reply to buckeyefisher

Does it have to be an SQL solution?

Contributor
Posts: 63

Re: Append two files on a index without duplicates

It is not  necessary to have a SQL ,but I thought SQL could be cleaner and simpler.

Respected Advisor
Posts: 3,799

Re: Append two files on a index without duplicates

Posted in reply to buckeyefisher

You could use MODIFY.

data Master(index=(cy=(company year)/unique));
   input Company $            Year                      Revenue;
   cards;
A                            1991                     100
A                            1992                     200
A                            1993                     300
C                            1991                     100
;;;;
   run;
proc print;
  
run;
data transact;
   if 0 then set master;
   input Company             Year                      Revenue;
   cards;
A                            1991                     100
A                            1993                     300
B                            1991                     50
B                            1992                     100
C                            1991                     100
;;;;
   run;
proc print;
  
run;

data master;
   set transact;
   modify master key=cy;
   select(_IORC_);
      when(1230015) output;
      otherwise return /*no need replace there is no change to found records*/;
      end;
   _error_ =
0;
  
run;
proc print data=master;
   run;

4-6-2015 12-40-08 PM.png
Super User
Posts: 5,516

Re: Append two files on a index without duplicates

Posted in reply to buckeyefisher

SAS contains a relatively simple tool to make this happen.  Assuming both data sets are sorted:

data want;

update master changes;

by company year;

run;

This tool does build in certain assumptions.  You would probably hope that all of these conditions would be true, anyway ...

The MASTER data set is not permitted to have more than one record for any company/year combination.  While the CHANGES data set is permitted to have multiple records, the final result will contain only one record per company/year combination after all the changes have been applied.  Any missing values (such as for REVENUE) in the CHANGES data set will be ignored.  But if CHANGES contains a REVENUE value, that value will appear in the final result whether or not MASTER already contains a REVENUE value for that company/year.

Good luck.

Contributor
Posts: 63

Re: Append two files on a index without duplicates

Posted in reply to Astounding

Thanks Astounding,  but I forgot to mention that my master file should not get overwritten by the transaction file if the record exists in the master file. So SAS update command may not help.

Super User
Posts: 5,516

Re: Append two files on a index without duplicates

Posted in reply to buckeyefisher

That's actually easy to overcome, as long as the transaction file never contains multiple records for a COMPANY/YEAR.  (Even that can be overcome but requires either an additional step or a more complex program such as data_null's suggestion.)  Just switch the files:

data want;

update changes master;

by company year;

run;

Now any non-missing values in the MASTER file replace the values in the transaction file.

Contributor
Posts: 63

Re: Append two files on a index without duplicates

Posted in reply to Astounding

that's clever!!! Perfect, it works in the current scenario.

Thanks

Ask a Question
Discussion stats
  • 7 replies
  • 298 views
  • 3 likes
  • 4 in conversation