Solved: Re: Delete a record if a matching record is found from the same datase...

pawan · Posted 03-23-2016 05:03 AM

I've come across a strange situation where I've to delete a record if there is any other record having the matching key variables. Any help would be highly appreciated. My requirement is to drop both these records from further processing. My sample data looks like below where the fields Vendor, Job and Grade are the key variables.

	Vendor	Job	Grade	Payment
1	ABC	Boxes	01	$3,500.00
2	XYZ	Tape	01	$2,000.00
3	LMN	Straps	02	$1,500.00
4	ABC	Pins	02	$1,200.00
5	LMN	Straps	02	$0.00
6	ABC	Boxes	01	$3,500.00
7	LMN	Straps	02	$0.00
8	ABC	Boxes	01	$0.00

A record has to be dropped from further processing if there is another record anywhere in the table having the amount field(Payment) a value $0.

Considering this the record #1 for vendor ABC having Payment $3500 should be deleted because there is another record #8 where Payment is $0, but #6 having Payment $3500 should be processed because #8 is no more available as it is already matched(/deleted along) with #1. Similarly #5 for vendor LMN should also be dropped as there is a matching record at #7 having payment $0.

Ksharp · Posted 03-23-2016 05:58 AM

You didn't post the result yet.

data have;
infile datalines expandtabs;
	input vendor $ job $ grade payment;
datalines;
ABC	Boxes	01	3500.00
XYZ	Tape	01	2000.00
LMN	Straps	02	1500.00
ABC	Pins	02	1200.00
LMN	Straps	02	0.00
ABC	Boxes	01	3500.00
LMN	Straps	02	0.00
ABC	Boxes	01	0.00
;
run;
proc sort data=have ;by vendor  job  grade;run;
data want;
 ina=0;inb=0;
 merge have(where=(payment ne 0) in=ina)
 have(rename=(payment=_payment) where=(_payment eq 0) in=inb);
 by vendor  job  grade;
 if not inb;
 run;

View solution in original post

RW9 · Posted 03-23-2016 05:29 AM

I would caution the approach, as you may not want to delete all the records associated, but going from the data you posted - and I would also point out that its a good idea to post test data in the form of a datastep as illustrated:

data have;
	input vendor $ job $ grade payment;
datalines;
ABC	Boxes	01	3500.00
XYZ	Tape	01	2000.00
LMN	Straps	02	1500.00
ABC	Pins	02	1200.00
LMN	Straps	02	0.00
ABC	Boxes	01	3500.00
LMN	Straps	02	0.00
ABC	Boxes	01	0.00
;
run;

proc sql undopolicy=none;
  delete from HAVE A
  where exists(select distinct VENDOR from HAVE where VENDOR=A.VENDOR and JOB=A.JOB and PAYMENT=0);
quit;

pawan · Posted 03-23-2016 08:06 AM

Thanks RW9, Sorry for not mentioning my approach trials there but the solution provided by Ksharp is really sharp and helped in what I needed.

Ksharp · Posted 03-23-2016 05:58 AM

You didn't post the result yet.

data have;
infile datalines expandtabs;
	input vendor $ job $ grade payment;
datalines;
ABC	Boxes	01	3500.00
XYZ	Tape	01	2000.00
LMN	Straps	02	1500.00
ABC	Pins	02	1200.00
LMN	Straps	02	0.00
ABC	Boxes	01	3500.00
LMN	Straps	02	0.00
ABC	Boxes	01	0.00
;
run;
proc sort data=have ;by vendor  job  grade;run;
data want;
 ina=0;inb=0;
 merge have(where=(payment ne 0) in=ina)
 have(rename=(payment=_payment) where=(_payment eq 0) in=inb);
 by vendor  job  grade;
 if not inb;
 run;

pawan · Posted 03-23-2016 08:00 AM

It worked like a Gem!! Thanks, but just wondering why are those 2 fields ina & inb, initialised 0's, not taken for output. How did they disappear? I'm not seeing a DROP here.. Anyways, thanks so much.

ballardw · Posted 03-23-2016 11:55 AM

The In= dataset option creates a temporary variable whose value is not saved to the output dataset unless you explicitly assign it to another variable. There are several options on a SET or INFILE statement that do this also such as END NOBS POINT EOV FILENAME FILEVAR LENGTH LINE

pawan · Posted 03-24-2016 02:36 AM

Yepp that is true but my question was why did Ksharp initialise those variables with 0's. Is the default assignment different from this? I didnt understand how did the behavior of this step changed completely with just inclusion of that "Initialisisng step".

Ksharp · Posted 03-24-2016 02:44 AM

By default,ina , inb will retain its value . I reset them to avoid such problem .

pawan · Posted 03-24-2016 03:22 AM

Oh! It makes sense now. Thanks a lot! And now I can throw my 100 lines of code away into trash and just use this master piece!

Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Re: Delete a record if a matching record is found from the same dataset

Registration is open

Registration is open

SAS Training: Just a Click Away