BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sarahzhou
Quartz | Level 8

Hi, 

 

I have a table A:

User_ID Customer_ID Email_add Invoice_no Invoice_account
u1 c1 example1@gmail.com v1 100
u2 c2 example2@gmail.com v2 120
u2 c2 example2b@gmail.com v3 130
u3 c3 example3@gmail.com v4 150
u4 c4 example4@gmail.com v5 165
u5 c5 example5@gmail.com v6 180

 

And table B:

User_ID Customer_ID Email_add
  c1 example1@gmail.com
    example2@gmail.com
u3    

 

I wish to delete the whole entry when either User_ID, Customer_ID or Email_add appeared in table B.

 

So the result table C should look like this:

User_ID Customer_ID Email_add Invoice_no Invoice_account
u4 c4 example4@gmail.com v5 165
u5 c5 example5@gmail.com v6 180

 

I've tried using where clause A.User_ID=B.User_ID or A.Customer_ID=B.Customer_ID or A.Email_add=B.Email_add to delete the rows (failed T.T.). This operation is very slow when the datasets for table A and Table B is large.

 

Please advise, thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;

View solution in original post

4 REPLIES 4
PeterClemmensen
Tourmaline | Level 20

Try this

 

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;
PeterClemmensen
Tourmaline | Level 20

Also, the 3rd obs should be in the desired result as well, correct?

Patrick
Opal | Level 21

You could use the EXIST clause.

/* create a new table */
proc sql;
  create table want as
    select * from a
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

/* delete from existing table in-place */
proc sql;
  delete from a 
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

If you're dealing with SAS tables then it's normally better to create a new table and not to directly delete rows in an existing table ....unless you have a big source table and need only to delete a minor percentage of rows. Reason: SAS deleted the rows only logically but no physically. The deleted rows still remain in the table and add to the tables volume (file size).

proc contents data=a;
quit;

Patrick_0-1665561613802.png

 

If it's a table in a database then using DELETE is likely the "correct" approach. Having said that: Delete is also in a database a slow process so if you need to delete a high percentage of rows then creating a new table will likely execute faster.

 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1955 views
  • 0 likes
  • 3 in conversation