Solved: SQL Code Speed

mk131190 · Posted 06-14-2016 10:11 AM

Hello,

I have been given some code from SQL developer to put into SAS and the majority of it runs fine. I have one step which takes 10 minutes to run and i was hoping to improve its performance.

Does anything stand out from the code below which could be improved performance wise? The exists and not exists pieces seem to make the code take longer to run (it runs in seconds without them). These parts reference back to the original base datasets (which are only going to increase in size over time). I am not sure if there is enough information here but any advice would be appreciated!

PROC SQL;
CREATE TABLE DATASET1 AS
SELECT A.CUSTOMER,
A.CUSTOMER_REC,
A.APPNO,
A.TYPE,
A.DATE1,
A.STATUS,
A.STATUS1,
A.STATUS2,
A.DATE,
A.TDAY
FROM DATASET AS A
WHERE (A.STATUS = 'A'
OR (A.STATUS = 'D'
AND NOT EXISTS (SELECT 1
FROM BASE.BASE1 B,
BASE.BASE2 C
WHERE B.CUSTOMER_REC = A.CUSTOMER_REC
AND B.UNIQUEKEY = C.UNIQUEKEY
AND C.VAR = 2
AND C.VAR2 IN('41','42','43'))
AND EXISTS (SELECT 1
FROM BASE.BASE1 B,
BASE.BASE2 C
WHERE B.CUSTOMER_REC = A.CUSTOMER_REC
AND B.UNIQUEKEY = C.UNIQUEKEY
AND C.VAR = 2
AND C.VAR2 IN('41','44')) ) )
AND (A.STATUS IN ('A','D')
AND A.STATUS1 = 'C'
AND A.STATUS2 NOT IN ('A','B'));
QUIT;

RW9 · Posted 06-14-2016 10:47 AM

Well, I would suggest creating a inter file first of the base1 and base2 datasets first, this can then be merged on directly to the main data, something like this (but note it is quite hard to ascertain what the best approach is without seeing any data - why do you have three datasets for example - what is the point of base1 and base2 - without seeing all the links its hard to re-structure):

proc sql;
  create table BASE as
  select  CUSTOMER_REC,
          (select 1 from BASE.BASE2 A where A.CUSTOMER_REC=CUSTOMER_REC and A.VAR2 in ('41','42','43')) as FLAG1,
          (select 1 from BASE.BASE2 A where A.CUSTOMER_REC=CUSTOMER_REC and A.VAR2 in ('41','44')) as FLAG2
  from    BASE1;

  create table WANT as
  select  A.CUSTOMER,
          A.CUSTOMER_REC,
          A.APPNO,
          A.TYPE,
          A.DATE1,
          A.STATUS,
          A.STATUS1,
          A.STATUS2,
          A.DATE,
          A.TDAY
  from    DATASET A
  left join CUSTOMER_REC B
  on      A.CUSTOMER_REC=B.CUSTOMER_REC
  where   (A.STATUS="A" or (A.STATUS="D" and B.FLAG1=1 and B.FLAG2=1))
    and   (A.STATUS in ("A","D") and A.STATUS1="C" and A.STATUS2 not in ("A","B"));
quit;

View solution in original post

RW9 · Posted 06-14-2016 10:47 AM

Well, I would suggest creating a inter file first of the base1 and base2 datasets first, this can then be merged on directly to the main data, something like this (but note it is quite hard to ascertain what the best approach is without seeing any data - why do you have three datasets for example - what is the point of base1 and base2 - without seeing all the links its hard to re-structure):

proc sql;
  create table BASE as
  select  CUSTOMER_REC,
          (select 1 from BASE.BASE2 A where A.CUSTOMER_REC=CUSTOMER_REC and A.VAR2 in ('41','42','43')) as FLAG1,
          (select 1 from BASE.BASE2 A where A.CUSTOMER_REC=CUSTOMER_REC and A.VAR2 in ('41','44')) as FLAG2
  from    BASE1;

  create table WANT as
  select  A.CUSTOMER,
          A.CUSTOMER_REC,
          A.APPNO,
          A.TYPE,
          A.DATE1,
          A.STATUS,
          A.STATUS1,
          A.STATUS2,
          A.DATE,
          A.TDAY
  from    DATASET A
  left join CUSTOMER_REC B
  on      A.CUSTOMER_REC=B.CUSTOMER_REC
  where   (A.STATUS="A" or (A.STATUS="D" and B.FLAG1=1 and B.FLAG2=1))
    and   (A.STATUS in ("A","D") and A.STATUS1="C" and A.STATUS2 not in ("A","B"));
quit;

SQL Code Speed

Re: SQL Code Speed

Re: SQL Code Speed

Registration is open

SQL Code Speed

Re: SQL Code Speed

Re: SQL Code Speed

Registration is open

SAS Training: Just a Click Away