BookmarkSubscribeRSS Feed
c8826024
Calcite | Level 5

I have a SAS SQL code which really takes a long time to run. Are there anybody can help me to make it efficient?

proc sql;

delete from recover as a

where exists (select * from nbcase as b where a.customer_id=b.customer_id and (a.nb_created_date < b.nb_first_date or a.nb_created_date > b.nb_last_date));

quit;

Thanks.

5 REPLIES 5
DF
Fluorite | Level 6 DF
Fluorite | Level 6

Can you provide some background information about what the fields in the table mean, and what the purpose of each table is?  It's difficult to make any suggestions without understanding the problem.

c8826024
Calcite | Level 5

I want to delete some records from Table recover where customer id matched the same id in table nbcase and nb_created _date < nb_first_date or nb_created_date > nb_last_date.

Both tables have duplicate customer ids and table nbcase unique on customer id and nb_first_date.

DF
Fluorite | Level 6 DF
Fluorite | Level 6

Ok, so I'm assuming the table nb_case is some kind of history for a customer, where each change is recorded with First_Date and Last_Date as the effective start/end dates for the other information on that row.

What is the relationship between Created Date and first/last, and what is the meaning of the rows in the Recover table?

It might also help to show some made up sample data for both tables.

FriedEgg
SAS Employee

This is just to provide sample data for your example as well as a duplicate of your SQL.

data recover(rename=(i=customer_id) sortedby=customer_id);

do i=1 to 10000;

  nb_created_date=round(ranuni(1234)*100,11)+'01AUG2011'd;

  output;

end;

run;

data nbcase(rename=(i=customer_id) sortedby=customer_id drop=j k);

do i=1 to 10000;

  j=round(ranuni(1234)*10,1);

  do k=1 to j;

   nb_first_date=round(ranuni(1234)*100,11)+'01AUG2011'd;

   nb_last_date=round(ranuni(2345)*100,11)+'01AUG2011'd;

   output;

  end;

end;

run;

proc sql;

delete from _recover r

where exists ( select *

                  from nbcase nb

                 where r.customer_id=nb.customer_id

                       and ( r.nb_created_date < nb.nb_first_date

                             or r.nb_created_date > nb.nb_last_date )

);

quit;

Message was edited by: Matthew Kastin Reason: Previous posting was erroneous after testing.

SASJedi
SAS Super FREQ

The resource hog is the correlated subquery.  Try to do it with an non-correlated subquery instead. 

For example, if FriedEgg's data sets are a good simulation, try this SQL query instead:

proc sql stimer;
create table del as
select distinct r.customer_ID 
   from nbcase as nb
      , Recover as r
   where r.customer_id=nb.customer_id
     and ( r.nb_created_date < nb.nb_first_date
          or r.nb_created_date > nb.nb_last_date )
;
delete from recover r
where Customer_ID in (select Customer_ID
                  from del)
;
quit;

Even though I created a temp table, the 2 queries above consumed a total of 0.8 CPU seconds vice 16.5 CPU seconds.  Elapsed time was 0.3 seconds vice 16 seconds for the original query (I have a quad core processor, so CPU time sometimes exceeds Elapsed Time when multi-threading). 

With the constant re-execution of the correlated subquery (inner query ) in the original code, the process is probably I/O bound.  The non-correlated sub query in the code above (inner query) executes once and passes the resulting list of values to the outer query. This is much less I/O intensive. YMMV but it's worth a shot.

Message was edited by: Mark Jordan due to spelling errors in the original entry.

Check out my Jedi SAS Tricks for SAS Users

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1135 views
  • 0 likes
  • 4 in conversation