BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
johnjinkim
Obsidian | Level 7

data have;
input person date product paid ;
1 1/25/15 apple $43

1 1/25/15 apple -$43

1 1/25/15 apple $50

2 1/30/15 orange $50

3 1/31/15 apple $55
;

 

Would want to remove lines:

 

1 1/25/15 apple $43

1 1/25/15 apple -$43

 

since it is the same person #, date, product, and price is retracted. Would want to keep person #1 apple $50 as well as persons 2 and 3.

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

At first glance, this looks rather straightforward, however, after discovering the question could  involve both sequencing and no-sequencing process,  It seems that Hash object is to be utilized to deliver a more robust solution.

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;

data want;
if _n_=1 then do;
  if 0 then set have;
    declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y'); 
	h.definekey('person','date','product');
	h.definedata(all:'y');
	h.definedone();
	call missing (_p);
  end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;

drop _:;
run;

View solution in original post

10 REPLIES 10
Reeza
Super User

Could you summarize by customer, date, product and sum the amount. If amount is 0 then delete? 

johnjinkim
Obsidian | Level 7

Good thought - have updated initial question as it does not appear for scenarios where there are other lines for that person that I would want to keep.

 

Truly trying to remove only retractions with the same positive and negative value.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Here is one example:

data have;
  input person date $ product $ paid;
datalines;
1 1/25/15 apple 43
1 1/25/15 apple -43
2 1/30/15 orange 50
3 1/31/15 apple 55
;
run;

data want;
  set have;
  by person date product;
  retain tot;
  if first.product then tot=paid;
  if not(first.product) and tot+paid=0 then delete;
run;

 

johnjinkim
Obsidian | Level 7

Thank you for the response. I believe this only deletes the negative retraction, but I would want to delete both the negative and positive paid with the same value.

 

so both of the below lines should be deleted.

1 1/25/15 apple 43
1 1/25/15 apple -43

Thanks again.

Reeza
Super User

Use a SQL self join to the data on the condition that it not equal a negative value

 

proc sql;

select a.* 

from have as a

left join have as b

on a.id=b.Id and a.date=b.date and a.item =b.item

and a.amount ne -1*b.amount;

quit;

johnjinkim
Obsidian | Level 7

Thanks for the response. This creates duplicates of certain values ... performing

 

proc sort data=have nodupkey ;

   by id date item ;

run ;

 

does get me to the output I specified, though not accounting for smaller level detalis that Haikuo's solution accounts for. Thanks again.

 

Ksharp
Super User

There are some problem in my code.

 

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $43
1 1/25/15 apple $-43
1 1/25/15 apple $50
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data a;
 set have(where=(paid gt 0));
run;
data b;
 set have(where=(paid lt 0));
 paid=-paid;
run;
proc sort data=a;
 by  person date product paid;
run;
proc sort data=b;
 by  person date product paid;
run;

data want;
 ina=0;inb=0;
 merge a(in=ina)  b(in=inb);
 by person date product paid;
 if not (ina and inb) then do;
  if inb then paid=-paid;
  output;
 end;
run;
Haikuo
Onyx | Level 15

I won't be overly confident on order of events ,

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;
data want;
 ina=0;inb=0;
 merge have(where=(paid lt 0) in=ina) have(where=(paid gt 0) in=inb);
 by person date product;
 if not (ina and inb);
run;
Haikuo
Onyx | Level 15

At first glance, this looks rather straightforward, however, after discovering the question could  involve both sequencing and no-sequencing process,  It seems that Hash object is to be utilized to deliver a more robust solution.

 

data have;
input person date : $20. product $ paid : dollar.;
cards;
1 1/25/15 apple $50
1 1/25/15 apple $43
1 1/25/15 apple $-43
2 1/30/15 orange $50
3 1/31/15 apple $55
;

data want;
if _n_=1 then do;
  if 0 then set have;
    declare hash h(dataset:'have(rename=(paid=_p) where=(_p<0))', multidata:'y'); 
	h.definekey('person','date','product');
	h.definedata(all:'y');
	h.definedone();
	call missing (_p);
  end;
set have (where=(paid>=0));
_rc=h.find();
if paid eq -1*_p then do;
_rc=h.removedup();
delete;
end;

drop _:;
run;
johnjinkim
Obsidian | Level 7

Thank you! This gets me to exactly what I need - apologies if the smaller level detalis/intracicies were not clearly stated. Seems as though it is fairly straightforward initially, but there are certainly smaler level detalis that this accounts for such as the sequencing.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2625 views
  • 1 like
  • 5 in conversation