data have;
input person date product paid ;
1 1/25/15 apple $43
1 1/25/15 apple -$43
1 1/25/15 apple $50
2 1/30/15 orange $50
3 1/31/15 apple $55
;
Would want to remove lines:
1 1/25/15 apple $43
1 1/25/15 apple -$43
since it is the same person #, date, product, and price is retracted. Would want to keep person #1 apple $50 as well as persons 2 and 3.
Thank you!
At first glance, this looks rather straightforward, however, after discovering the question could involve both sequencing and no-sequencing process, It seems that Hash object is to be utilized to deliver a more robust solution.
data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y');
h.definekey('person','date','product');
h.definedata(all:'y');
h.definedone();
call missing (_p);
end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;
drop _:;
run;
Could you summarize by customer, date, product and sum the amount. If amount is 0 then delete?
Good thought - have updated initial question as it does not appear for scenarios where there are other lines for that person that I would want to keep.
Truly trying to remove only retractions with the same positive and negative value.
Here is one example:
data have; input person date $ product $ paid; datalines; 1 1/25/15 apple 43 1 1/25/15 apple -43 2 1/30/15 orange 50 3 1/31/15 apple 55 ; run; data want; set have; by person date product; retain tot; if first.product then tot=paid; if not(first.product) and tot+paid=0 then delete; run;
Thank you for the response. I believe this only deletes the negative retraction, but I would want to delete both the negative and positive paid with the same value.
so both of the below lines should be deleted.
1 1/25/15 apple 43 1 1/25/15 apple -43
Thanks again.
Use a SQL self join to the data on the condition that it not equal a negative value
proc sql;
select a.*
from have as a
left join have as b
on a.id=b.Id and a.date=b.date and a.item =b.item
and a.amount ne -1*b.amount;
quit;
Thanks for the response. This creates duplicates of certain values ... performing
proc sort data=have nodupkey ;
by id date item ;
run ;
does get me to the output I specified, though not accounting for smaller level detalis that Haikuo's solution accounts for. Thanks again.
There are some problem in my code.
data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $43
1 1/25/15 apple $-43
1 1/25/15 apple $50
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data a;
set have(where=(paid gt 0));
run;
data b;
set have(where=(paid lt 0));
paid=-paid;
run;
proc sort data=a;
by person date product paid;
run;
proc sort data=b;
by person date product paid;
run;
data want;
ina=0;inb=0;
merge a(in=ina) b(in=inb);
by person date product paid;
if not (ina and inb) then do;
if inb then paid=-paid;
output;
end;
run;
I won't be overly confident on order of events ,
data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data want;
ina=0;inb=0;
merge have(where=(paid lt 0) in=ina) have(where=(paid gt 0) in=inb);
by person date product;
if not (ina and inb);
run;
At first glance, this looks rather straightforward, however, after discovering the question could involve both sequencing and no-sequencing process, It seems that Hash object is to be utilized to deliver a more robust solution.
data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data want;
if _n_=1 then do;
if 0 then set have;
declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y');
h.definekey('person','date','product');
h.definedata(all:'y');
h.definedone();
call missing (_p);
end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;
drop _:;
run;
Thank you! This gets me to exactly what I need - apologies if the smaller level detalis/intracicies were not clearly stated. Seems as though it is fairly straightforward initially, but there are certainly smaler level detalis that this accounts for such as the sequencing.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.