BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
johnjinkim
Obsidian | Level 7

data have;
input person date product paid ;
1 1/25/15 apple $43

1 1/25/15 apple -$43

1 1/25/15 apple $50

2 1/30/15 orange $50

3 1/31/15 apple $55
;

 

Would want to remove lines:

 

1 1/25/15 apple $43

1 1/25/15 apple -$43

 

since it is the same person #, date, product, and price is retracted. Would want to keep person #1 apple $50 as well as persons 2 and 3.

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

At first glance, this looks rather straightforward, however, after discovering the question could  involve both sequencing and no-sequencing process,  It seems that Hash object is to be utilized to deliver a more robust solution.

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;

data want;
if _n_=1 then do;
  if 0 then set have;
    declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y'); 
	h.definekey('person','date','product');
	h.definedata(all:'y');
	h.definedone();
	call missing (_p);
  end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;

drop _:;
run;

View solution in original post

10 REPLIES 10
Reeza
Super User

Could you summarize by customer, date, product and sum the amount. If amount is 0 then delete? 

johnjinkim
Obsidian | Level 7

Good thought - have updated initial question as it does not appear for scenarios where there are other lines for that person that I would want to keep.

 

Truly trying to remove only retractions with the same positive and negative value.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Here is one example:

data have;
  input person date $ product $ paid;
datalines;
1 1/25/15 apple 43
1 1/25/15 apple -43
2 1/30/15 orange 50
3 1/31/15 apple 55
;
run;

data want;
  set have;
  by person date product;
  retain tot;
  if first.product then tot=paid;
  if not(first.product) and tot+paid=0 then delete;
run;

 

johnjinkim
Obsidian | Level 7

Thank you for the response. I believe this only deletes the negative retraction, but I would want to delete both the negative and positive paid with the same value.

 

so both of the below lines should be deleted.

1 1/25/15 apple 43
1 1/25/15 apple -43

Thanks again.

Reeza
Super User

Use a SQL self join to the data on the condition that it not equal a negative value

 

proc sql;

select a.* 

from have as a

left join have as b

on a.id=b.Id and a.date=b.date and a.item =b.item

and a.amount ne -1*b.amount;

quit;

johnjinkim
Obsidian | Level 7

Thanks for the response. This creates duplicates of certain values ... performing

 

proc sort data=have nodupkey ;

   by id date item ;

run ;

 

does get me to the output I specified, though not accounting for smaller level detalis that Haikuo's solution accounts for. Thanks again.

 

Ksharp
Super User

There are some problem in my code.

 

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $43
1 1/25/15 apple $-43
1 1/25/15 apple $50
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data a;
 set have(where=(paid gt 0));
run;
data b;
 set have(where=(paid lt 0));
 paid=-paid;
run;
proc sort data=a;
 by  person date product paid;
run;
proc sort data=b;
 by  person date product paid;
run;

data want;
 ina=0;inb=0;
 merge a(in=ina)  b(in=inb);
 by person date product paid;
 if not (ina and inb) then do;
  if inb then paid=-paid;
  output;
 end;
run;
Haikuo
Onyx | Level 15

I won't be overly confident on order of events ,

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data want;
 ina=0;inb=0;
 merge have(where=(paid lt 0) in=ina) have(where=(paid gt 0) in=inb);
 by person date product;
 if not (ina and inb);
run;
Haikuo
Onyx | Level 15

At first glance, this looks rather straightforward, however, after discovering the question could  involve both sequencing and no-sequencing process,  It seems that Hash object is to be utilized to deliver a more robust solution.

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;

data want;
if _n_=1 then do;
  if 0 then set have;
    declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y'); 
	h.definekey('person','date','product');
	h.definedata(all:'y');
	h.definedone();
	call missing (_p);
  end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;

drop _:;
run;
johnjinkim
Obsidian | Level 7

Thank you! This gets me to exactly what I need - apologies if the smaller level detalis/intracicies were not clearly stated. Seems as though it is fairly straightforward initially, but there are certainly smaler level detalis that this accounts for such as the sequencing.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 4127 views
  • 1 like
  • 5 in conversation