Help using Base SAS procedures

Combining Datasets

Reply
Frequent Contributor
Posts: 122

Combining Datasets

Hi,

I have a requirement where I need to combine two datasets, but here is what is causing me trouble.

For example: master dataset

Emp_loc       Emp_sal

USA             100000.00

Newyork        25000.00

California       25000.00

Texas            25000.00

Nevada          25000.00

Transaction dataset:

Emp_loc Emp_sal

Texas     30000.00

USA       105000.00

I want to update the master dataset, but need to keep the values of the EMP_Loc in the same order ie, after my change the dataset should look like this:

Updated Master dataset:

Emp_loc       Emp_sal

USA             105000.00

Newyork        25000.00

California       25000.00

Texas            30000.00

Nevada          25000.00

Please share your ideas.

Thanks

Super User
Posts: 10,020

Re: Combining Datasets

Hash Table:

data master;
input Emp_loc $      Emp_sal ;
cards;
USA             100000.00
Newyork        25000.00
California       25000.00
Texas            25000.00
Nevada          25000.00
;
run;
 

 

data Transaction;
input Emp_loc $ Emp_sal ;
cards;
Texas     30000.00
USA       105000.00
;
run;
data want(drop=rc);
 if _n_ eq 1 then do;
  if 0 then set  Transaction;
  declare hash ha(dataset:'Transaction');
   ha.definekey('Emp_loc');
   ha.definedata('Emp_sal');
   ha.definedone();
 end;
set master;
 rc=ha.find();
run;

Ksharp

Trusted Advisor
Posts: 1,137

Re: Combining Datasets

The hash code used by Ksharp above is more efficient.

However there is alternative approach as well by Merge statement, please try the below code

data dat1;

    input Emp_loc $  Emp_sal;

    order=_n_;

cards;

USA             100000.00

Newyork        25000.00

California       25000.00

Texas            25000.00

Nevada          25000.00

;

run;

data dat2;

    input Emp_loc $  Emp_sal;

cards;

Texas     30000.00

USA       105000.00

;

run;

proc sort data = dat1;

    by emp_loc;

run;

proc sort data = dat2;

    by emp_loc;

run;

data all;

    merge dat1(in=a) dat2(in=b);

    by emp_loc;

run;

proc sort data = all;

    by order;

run;

Hope this helps you.

Thanks,

Jagadish

Thanks,
Jag
Super Contributor
Posts: 297

Re: Combining Datasets

From what I understand Hash tables are very useful in instances where you have a large table and a small lookup table.  There is a point where hash tables become less effecient, so it is important that you undertake sufficient testing to ensure you are on the right track.  Therefore I have assumed this is a sample and have provided the below alternative.

DATA MASTER;

INPUT EMP_LOC $      EMP_SAL ;

CARDS;

USA             100000.00

NEWYORK        25000.00

CALIFORNIA       25000.00

TEXAS            25000.00

NEVADA          25000.00

;

RUN;

DATA TRANSACTION;

INPUT EMP_LOC $ EMP_SAL ;

CARDS;

TEXAS     30000.00

USA       105000.00

;

RUN;

PROC SQL;

    UPDATE MASTER

    SET   EMP_SAL = (SELECT EMP_SAL

                     FROM TRANSACTION

                     WHERE MASTER.EMP_LOC = TRANSACTION.EMP_LOC)

    WHERE EMP_LOC IN (SELECT EMP_LOC

                      FROM  TRANSACTION)

    ;

QUIT;

Regards,

Scott

Frequent Contributor
Posts: 122

Re: Combining Datasets

Posted in reply to Scott_Mitchell

Thank you all for sharing your thoughts!

Ask a Question
Discussion stats
  • 4 replies
  • 195 views
  • 0 likes
  • 4 in conversation