Hi Team,
I need help in a query optimization.
The code is attached.
Is there any other way can we replace the table c code to get the maximum status date, to avoid the duplicates.
because of the sub query ,the query is taking long time to run.
Is there a way to optimise.
proc sql;
connect to db2 as pscon(user=&db2_id
password=&db2_ps
database = &db2_db
readbuff=1);
%put &sqlxmsg;
create table &output as
select * from connection to pscon (
select a.cmpny||a.brch||a.polcy||a.typ as pol_key1, a.DATE_INCEPN,
a.DATE_EXPIRES, a.REAS_TERM, c.status_date, a.polcy_status, a.POLCY_PAY_PLAN,
c.NOTICE_TO, c.DATE_ORIGL_INCEPN,b.*,e.POLCY_EXTRA_CHRG1 as cancel_fee_prem,
e.POLCY_EXTRA_CHRG2 as cancel_fee_gst
from dbadm.potpoly as a
left join (select * from ( select row_number() over (partition by poly_cmpny||poly_brch||poly_polcy||poly_typ order by status_date desc) as rownum,
poly_cmpny,poly_brch,poly_polcy,poly_typ,status_date,NOTICE_TO,DATE_ORIGL_INCEPN,valid_flag
from
dbadm.potpolh01 ) cc where rownum=1) AS C
on c.poly_cmpny=a.cmpny and c.poly_brch=a.brch
and c.poly_polcy=a.polcy and c.poly_typ=a.typ and c.valid_flag='1'
left join dbadm.potstrni1 as b
on b.poly_cmpny=a.cmpny and b.poly_brch=a.brch
and b.poly_polcy=a.polcy and b.poly_typ=a.typ
left join dbadm.potbtrn01 as e
on e.poly_cmpny=a.cmpny and e.poly_brch=a.brch
and e.poly_polcy=a.polcy and e.poly_typ=a.typ and b.tran_no=e.tran_no
and b.BATC_BATCH_TYPe=e.BATC_BATCH_TYP
and e.POLCY_CHRG_TYP1='CF'
where a.CMPNY in ('1','6') and
a.DATE_INCEPN >=20120000 and ((substr(a.date_expires,1,4)=substr(a.DATE_INCEPN,1,4) and
substr(a.date_expires,5,2)-substr(a.DATE_INCEPN,5,2) in (1,0))
or (substr(a.date_expires,5,2)='01' and substr(a.DATE_INCEPN,5,2)='12' and
substr(a.date_expires,1,4)-substr(a.DATE_INCEPN,1,4)=1))
and a.typ not in ('0TP','1TP','2TP','3TP','4TP','5TP','6TP','7TP','8TP','9TP','STP','QTP')
and a.VALID_FLAG='1'
and b.DATE_EFFCT >=a.DATE_INCEPN and a.polcy_status='09'
);
quit;
%put $sqlxrc &sqlxmsg
disconnect from db2;
quit;
Would this work better?
left join (select POLY_CMPNY,POLY_BRCH,POLY_POLCY,POLY_TYP,STATUS_DATE,NOTICE_TO,DATE_ORIGL_INCEPN,VALID_FLAG
from DBADM.POTPOLH01
group by POLY_CMPNY,POLY_BRCH,POLY_POLCY,POLY_TYP
having STATUS_DATE = max(STATUS_DATE) ) as C
This makes sense I suppose. Can you add all columns in the GROUP BY clause?
I tried it Chris.
It gives the result based on the whole set ,which is not my requirement.I need the group to be considered as mentioned before only 4 columns ,not all columns.
Well it looks like DB2 cannot accept a partial list of variables in the GROUP BY list. I don't have another idea.
Was it faster even?
So duplicate dates for the same four ID variables is fine?
My window function skills are rusty. Can you show me an example of data that would differ between the 2 syntaxes?
Also, did you look at a speed difference?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.