Hi,
I can't seem to figure out this logic with my data.
I have 2 files, Charges and Adjustments.
Charges looks like:
id date product amount
1 01jan2012 box 20
1 03jan2012 ball 5
1 04jan2012 chair 10
2 02feb2012 camera 100
2 03feb2012 cup 4
...
Adjustment looks like:
id date product amount
1 03jan2012 refund -5
2 03feb2012 refund -4
..
I'd like to create a table that removes all adjustments from the charges table
WANT:
id date product amount
1 01jan2012 box 20
1 04jan2012 chair 10
2 02feb2012 camera 100
the Product variable in the Adjustment table will always be 'refund' and can not match back to Charges. To figure out which rows are to remove Date and Amount variables from Adjustment will have to be used to identify the row to remove from the Charges table.
Thank you for your help.
For your situation , I prefer to Hash Table than Sub-query of SQL.
data h1; input id $ date : date9. product :$ amount; format date date9.; cards; 1 01jan2012 box 20 1 03jan2012 ball 5 1 04jan2012 chair 10 2 02feb2012 camera 100 2 03feb2012 cup 4 ; run; data h2; input id $ date :date9. product :$ amount; format date date9.; cards; 1 03jan2012 refund -5 2 03feb2012 refund -4 ; run; data want; if _n_ eq 1 then do; if 0 then set h2(keep=id date); declare hash ha(hashexp:10,dataset:'h2'); ha.definekey('id','date'); ha.definedone(); end; set h1; if ha.check() ne 0 then output; run;
Ksharp
Try this one:
data h1;
input id $ date : date9. product :$ amount;
cards;
1 01jan2012 box 20
1 03jan2012 ball 5
1 04jan2012 chair 10
2 02feb2012 camera 100
2 03feb2012 cup 4
;
data h2;
input id $ date :date9. product :$ amount;
cards;
1 03jan2012 refund -5
2 03feb2012 refund -4
;
data want;
merge h1 (in=h1) h2(in=h2);
by id date;
if (h1 and not h2);
format date date9.;
run;
proc print;run;
Regards,
Haikuo
Edit: if you want to play with hash, here is one of them:
data want;
if _n_=1 then
do;
set h2(obs=1);
dcl hash h(dataset: 'h2', multidata:'yes');
h.definekey('id','date');
h.definedone();
end;
format date date9.;
set h1;
_n_=h.find();
if _n_ ne 0;
run;
Unless you drop=product on h2, I think you will overlay each successfull merge with REFUND for a product....also, I think AMOUNT will need to be on your by statement (as well as in the SORT by statement) to have a unique key.
For your situation , I prefer to Hash Table than Sub-query of SQL.
data h1; input id $ date : date9. product :$ amount; format date date9.; cards; 1 01jan2012 box 20 1 03jan2012 ball 5 1 04jan2012 chair 10 2 02feb2012 camera 100 2 03feb2012 cup 4 ; run; data h2; input id $ date :date9. product :$ amount; format date date9.; cards; 1 03jan2012 refund -5 2 03feb2012 refund -4 ; run; data want; if _n_ eq 1 then do; if 0 then set h2(keep=id date); declare hash ha(hashexp:10,dataset:'h2'); ha.definekey('id','date'); ha.definedone(); end; set h1; if ha.check() ne 0 then output; run;
Ksharp
One thing learned. Check() is more efficient and thus better.
Haikuo
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.