BookmarkSubscribeRSS Feed
Anu_R
Calcite | Level 5

Using Modify

I have two datasets as follows. The first dataset is an inventory and the second dataset contains the transactions.

dataset1 Inventory

item_no  description

1    Dunlop tyre

2    tube

3    windshield

4    Boss speakers

dataset2 Transactions

purchase_id item_no desc

1 1 dunlop

2 1 Dunlop

3 1 Dun

4 3 windshi

5 2 tub

6 2 tub

7 4 Boss

8 4 Boss Sp

9 3 wind

For some reason the description is truncated in the transaction table and I would like to replace it with the description from the inventory. My transaction table should look like this.

purchase_id item_no desc

1 1 Dunlop tyre

2 1 Dunlop tyre

3 1 Dunlop tyre

4 3 windshield

5 2 tube

6 2 tube

7 4 Boss speakers

7 4 Boss speakers

9 3 windshield

I used the modify statement in the data step. this was my code

data transaction;

      modify transaction inventory;

      by item_no;

      desc=description;

      run;

But this does not work it. My transaction table ends up like this. It replaces only the first occurence of the item_no with the description from the inventory table.

purchase_id item_no desc

1 1 Dunlop tyre

2 1 Dunlop

3 1 Dun

4 3 windshield

5 2 tube

6 2 tub

7 4 Boss speakers

9 3 wind

Could anybody suggest what can be done

4 REPLIES 4
DF
Fluorite | Level 6 DF
Fluorite | Level 6

Hi Anu,

The SAS documentation has a specific note about using modify when you've got duplicates - it might help explain this behaviour better than I can: http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001772905.htm .  Watch out for the descriptions of the datasets - where the documentation refers to Master and Transaction datasets, the terms are used in the opposite sense from your own example.

If you want to overwrite the values of Desc, then you might try using "merge" instead of modify, or using hash tables.  I suppose it depends on your exact circumstances, e.g. what you want to do to your Transactions table when there is no matching entry in Inventory.

For example your code might read:

data Inventory;

format item_no 8.;

format Description $20.;

infile datalines dsd delimiter=',';

input item_no Description $;

datalines;

1,Dunlop tyre

2,tube

3,windshield

4,Boss speakers

;

run;

data Transactions;

format purchase_id 8.;

format item_no 8.;

format desc $7.;

infile datalines dsd delimiter=',';

input purchase_id item_no desc $;

datalines;

1,1,dunlop

2,1,Dunlop

3,1,Dun

4,3,windshi

5,2,tub

6,2,tub

7,4,Boss

8,4,Boss Sp

9,3,wind

;

run;

proc sort data=Inventory;

by item_no;

run;

proc sort data=Transactions;

by item_no;

run;

data Transactions;

merge Transactions (in=T drop=Desc) Inventory(rename=(Description=Desc));

by item_no;

if T then output;

run;

This would essentially delete the old Desc, and completely replace every row with the updated Description from Inventory.

The equivalent with a hash table would be:

data Transactions2 (rename=(Description=Desc) drop=rc);

set Transactions (drop=Desc);

format Description $20.;

if _n_ = 1 then do;

    declare hash h(dataset:'Inventory');

    h.defineKey('Item_No');

    h.defineData('Description');

    h.defineDone();

end;

rc = h.find();

run;

This also has the advantage that you don't have to sort the datasets as you would for merge or modify.

Another method you might consider is using a custom format built from Inventory.  That way your Transactions dataset only olds your Purchase_ID and Item_No fields, and the printed description would be taken by using Inventory as a lookup.  This would mean the values never change in Transaction, but you could still use proc report/tabulate etc. to summary based on the "formatted" values.

HTH.

Patrick
Opal | Level 21

Looks to me like a pretty good excercise question which should make you reading about the modify statement and increase your understanding of the differences between transactional data and reference data.

Message was edited by: Patrick Matter

Ksharp
Super User

You are almost to get that. The merge statement is what you need.

data Inventory;

input item_no description & $50.;

cards;

1    Dunlop tyre

2    tube

3    windshield

4    Boss speakers

;

run;

dataTransactions;

input purchase_id item_no desc : $50.;

cards;

1 1 dunlop

2 1 Dunlop

3 1 Dun

4 3 windshi

5 2 tub

6 2 tub

7 4 Boss

8 4 Boss Sp

9 3 wind

;

run;

proc sort data=Transactions;

by item_no;

run;

datawant(drop=desc);

mergeTransactions Inventory;

by item_no;

run;

proc sort data=want;

by purchase_id;

run;

Ksharp

DBailey
Lapis Lazuli | Level 10

Perhaps a more straightforward approach:

proc sql;

update transactions t1

set desc=(select description from inventory where item_no=t1.item_no)

where exists (select * from inventory where item_no=t1.item_no);

quit;

You also might want to ensure the column is the correct width as the values could be truncated.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1044 views
  • 0 likes
  • 5 in conversation