Solved: Re: Delete an entry if one of the column values appeared in other tabl...

sarahzhou · Posted 10-12-2022 02:58 AM

Hi,

I have a table A:

User_ID	Customer_ID	Email_add	Invoice_no	Invoice_account
u1	c1	example1@gmail.com	v1	100
u2	c2	example2@gmail.com	v2	120
u2	c2	example2b@gmail.com	v3	130
u3	c3	example3@gmail.com	v4	150
u4	c4	example4@gmail.com	v5	165
u5	c5	example5@gmail.com	v6	180

And table B:

User_ID	Customer_ID	Email_add
	c1	example1@gmail.com
		example2@gmail.com
u3

I wish to delete the whole entry when either User_ID, Customer_ID or Email_add appeared in table B.

So the result table C should look like this:

User_ID	Customer_ID	Email_add	Invoice_no	Invoice_account
u4	c4	example4@gmail.com	v5	165
u5	c5	example5@gmail.com	v6	180

I've tried using where clause A.User_ID=B.User_ID or A.Customer_ID=B.Customer_ID or A.Email_add=B.Email_add to delete the rows (failed T.T.). This operation is very slow when the datasets for table A and Table B is large.

Please advise, thanks!

PeterClemmensen · Posted 10-12-2022 03:15 AM

Try this

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

View solution in original post

PeterClemmensen · Posted 10-12-2022 03:15 AM

Try this

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

sarahzhou · Posted 10-12-2022 10:54 PM

@PeterClemmensen , thank you!

PeterClemmensen · Posted 10-12-2022 03:19 AM

Also, the 3rd obs should be in the desired result as well, correct?

The DATA to DATA Step Macro
Blog: SASnrd

Patrick · Posted 10-12-2022 04:02 AM

You could use the EXIST clause.

/* create a new table */
proc sql;
  create table want as
    select * from a
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

/* delete from existing table in-place */
proc sql;
  delete from a 
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

If you're dealing with SAS tables then it's normally better to create a new table and not to directly delete rows in an existing table ....unless you have a big source table and need only to delete a minor percentage of rows. Reason: SAS deleted the rows only logically but no physically. The deleted rows still remain in the table and add to the tables volume (file size).

proc contents data=a;
quit;

If it's a table in a database then using DELETE is likely the "correct" approach. Having said that: Delete is also in a database a slow process so if you need to delete a high percentage of rows then creating a new table will likely execute faster.

Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Catch up on SAS Innovate 2026

Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Re: Delete an entry if one of the column values appeared in other table

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away