Solved: data step logic

Danglytics · Posted 03-19-2012 03:17 PM

Hi,

I can't seem to figure out this logic with my data.

I have 2 files, Charges and Adjustments.

Charges looks like:

id date product amount

1 01jan2012 box 20

1 03jan2012 ball 5

1 04jan2012 chair 10

2 02feb2012 camera 100

2 03feb2012 cup 4

...

Adjustment looks like:

id date product amount

1 03jan2012 refund -5

2 03feb2012 refund -4

..

I'd like to create a table that removes all adjustments from the charges table

WANT:

id date product amount

1 01jan2012 box 20

1 04jan2012 chair 10

2 02feb2012 camera 100

the Product variable in the Adjustment table will always be 'refund' and can not match back to Charges. To figure out which rows are to remove Date and Amount variables from Adjustment will have to be used to identify the row to remove from the Charges table.

Thank you for your help.

Ksharp · Posted 03-19-2012 10:53 PM

For your situation , I prefer to Hash Table than Sub-query of SQL.

data h1;
input id $ date : date9. product :$ amount;
format date date9.;
cards;
1 01jan2012 box 20
1 03jan2012 ball 5
1 04jan2012 chair 10
2 02feb2012 camera 100
2 03feb2012 cup 4
;
run;
data h2;
input id $ date :date9. product :$ amount;
format date date9.;
cards;
1 03jan2012 refund -5
2 03feb2012 refund -4
;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set h2(keep=id date);
  declare hash ha(hashexp:10,dataset:'h2');
   ha.definekey('id','date');
   ha.definedone();
 end;
set h1;
 if ha.check() ne 0 then output;
run;

Ksharp

View solution in original post

Haikuo · Posted 03-19-2012 03:23 PM

Try this one:

data h1;

input id $ date : date9. product :$ amount;

cards;

1 01jan2012 box 20

1 03jan2012 ball 5

1 04jan2012 chair 10

2 02feb2012 camera 100

2 03feb2012 cup 4

;

data h2;

input id $ date :date9. product :$ amount;

cards;

1 03jan2012 refund -5

2 03feb2012 refund -4

;

data want;

merge h1 (in=h1) h2(in=h2);

by id date;

if (h1 and not h2);

format date date9.;

run;

proc print;run;

Regards,

Haikuo

Edit: if you want to play with hash, here is one of them:

data want;

if _n_=1 then

do;

set h2(obs=1);

dcl hash h(dataset: 'h2', multidata:'yes');

h.definekey('id','date');

h.definedone();

end;

format date date9.;

set h1;

_n_=h.find();

if _n_ ne 0;

run;

Jay_OAG · Posted 03-19-2012 03:28 PM

Unless you drop=product on h2, I think you will overlay each successfull merge with REFUND for a product....also, I think AMOUNT will need to be on your by statement (as well as in the SORT by statement) to have a unique key.

Ksharp · Posted 03-19-2012 10:53 PM

For your situation , I prefer to Hash Table than Sub-query of SQL.

data h1;
input id $ date : date9. product :$ amount;
format date date9.;
cards;
1 01jan2012 box 20
1 03jan2012 ball 5
1 04jan2012 chair 10
2 02feb2012 camera 100
2 03feb2012 cup 4
;
run;
data h2;
input id $ date :date9. product :$ amount;
format date date9.;
cards;
1 03jan2012 refund -5
2 03feb2012 refund -4
;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set h2(keep=id date);
  declare hash ha(hashexp:10,dataset:'h2');
   ha.definekey('id','date');
   ha.definedone();
 end;
set h1;
 if ha.check() ne 0 then output;
run;

Ksharp

Haikuo · Posted 03-20-2012 07:07 AM

One thing learned. Check() is more efficient and thus better.

Haikuo

data step logic

data step logic

Re: data step logic

data step logic

data step logic

Re: data step logic

Catch up on SAS Innovate 2026

data step logic

data step logic

Re: data step logic

data step logic

data step logic

Re: data step logic

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away