Help using Base SAS procedures

Using Modify in the Datastep

Reply
New Contributor
Posts: 4

Using Modify in the Datastep

Using Modify

I have two datasets as follows. The first dataset is an inventory and the second dataset contains the transactions.

dataset1 Inventory

item_no  description

1    Dunlop tyre

2    tube

3    windshield

4    Boss speakers

dataset2 Transactions

purchase_id item_no desc

1 1 dunlop

2 1 Dunlop

3 1 Dun

4 3 windshi

5 2 tub

6 2 tub

7 4 Boss

8 4 Boss Sp

9 3 wind

For some reason the description is truncated in the transaction table and I would like to replace it with the description from the inventory. My transaction table should look like this.

purchase_id item_no desc

1 1 Dunlop tyre

2 1 Dunlop tyre

3 1 Dunlop tyre

4 3 windshield

5 2 tube

6 2 tube

7 4 Boss speakers

7 4 Boss speakers

9 3 windshield

I used the modify statement in the data step. this was my code

data transaction;

      modify transaction inventory;

      by item_no;

      desc=description;

      run;

But this does not work it. My transaction table ends up like this. It replaces only the first occurence of the item_no with the description from the inventory table.

purchase_id item_no desc

1 1 Dunlop tyre

2 1 Dunlop

3 1 Dun

4 3 windshield

5 2 tube

6 2 tub

7 4 Boss speakers

9 3 wind

Could anybody suggest what can be done

Frequent Contributor
Frequent Contributor
Posts: 94

Using Modify in the Datastep

Hi Anu,

The SAS documentation has a specific note about using modify when you've got duplicates - it might help explain this behaviour better than I can: http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001772905.htm .  Watch out for the descriptions of the datasets - where the documentation refers to Master and Transaction datasets, the terms are used in the opposite sense from your own example.

If you want to overwrite the values of Desc, then you might try using "merge" instead of modify, or using hash tables.  I suppose it depends on your exact circumstances, e.g. what you want to do to your Transactions table when there is no matching entry in Inventory.

For example your code might read:

data Inventory;

format item_no 8.;

format Description $20.;

infile datalines dsd delimiter=',';

input item_no Description $;

datalines;

1,Dunlop tyre

2,tube

3,windshield

4,Boss speakers

;

run;

data Transactions;

format purchase_id 8.;

format item_no 8.;

format desc $7.;

infile datalines dsd delimiter=',';

input purchase_id item_no desc $;

datalines;

1,1,dunlop

2,1,Dunlop

3,1,Dun

4,3,windshi

5,2,tub

6,2,tub

7,4,Boss

8,4,Boss Sp

9,3,wind

;

run;

proc sort data=Inventory;

by item_no;

run;

proc sort data=Transactions;

by item_no;

run;

data Transactions;

merge Transactions (in=T drop=Desc) Inventory(rename=(Description=Desc));

by item_no;

if T then output;

run;

This would essentially delete the old Desc, and completely replace every row with the updated Description from Inventory.

The equivalent with a hash table would be:

data Transactions2 (rename=(Description=Desc) drop=rc);

set Transactions (drop=Desc);

format Description $20.;

if _n_ = 1 then do;

    declare hash h(dataset:'Inventory');

    h.defineKey('Item_No');

    h.defineData('Description');

    h.defineDone();

end;

rc = h.find();

run;

This also has the advantage that you don't have to sort the datasets as you would for merge or modify.

Another method you might consider is using a custom format built from Inventory.  That way your Transactions dataset only olds your Purchase_ID and Item_No fields, and the printed description would be taken by using Inventory as a lookup.  This would mean the values never change in Transaction, but you could still use proc report/tabulate etc. to summary based on the "formatted" values.

HTH.

Respected Advisor
Posts: 4,173

Re: Using Modify in the Datastep

Looks to me like a pretty good excercise question which should make you reading about the modify statement and increase your understanding of the differences between transactional data and reference data.

Message was edited by: Patrick Matter

Super User
Posts: 10,028

Re: Using Modify in the Datastep

You are almost to get that. The merge statement is what you need.

data Inventory;

input item_no description & $50.;

cards;

1    Dunlop tyre

2    tube

3    windshield

4    Boss speakers

;

run;

dataTransactions;

input purchase_id item_no desc : $50.;

cards;

1 1 dunlop

2 1 Dunlop

3 1 Dun

4 3 windshi

5 2 tub

6 2 tub

7 4 Boss

8 4 Boss Sp

9 3 wind

;

run;

proc sort data=Transactions;

by item_no;

run;

datawant(drop=desc);

mergeTransactions Inventory;

by item_no;

run;

proc sort data=want;

by purchase_id;

run;

Ksharp

Super Contributor
Posts: 578

Re: Using Modify in the Datastep

Perhaps a more straightforward approach:

proc sql;

update transactions t1

set desc=(select description from inventory where item_no=t1.item_no)

where exists (select * from inventory where item_no=t1.item_no);

quit;

You also might want to ensure the column is the correct width as the values could be truncated.

Ask a Question
Discussion stats
  • 4 replies
  • 164 views
  • 0 likes
  • 5 in conversation