Below is my script, is a long one...
After a first pass through your code it looks like there could be a lot of room for performance tweaking.
The reason why performance degrades is that there is likely a flaw in your creation of base table work.P6MTH_BASE. The way you've coded you are replicating data appending 6 months of source data for every call of %consolidate(...)
Have a look at below you you'll understand what I'm talking about.
%macro consolidate(DATE);
%include "/sasdata2/ifrs9_validation/macro/00_Date_format.sas";
/* PAST 6 MTHS DATA */
%MACRO PDATE(DATE);
PROC SQL;
CREATE TABLE P_MORT AS SELECT
month
,AR_ID
,UNQ_ID_SRC_STM
,UNQ_ID_SRC_STM2
,MIA
,F_DEFAULT
,F_RR
,MOB
/*,ADVANCE_INSTALLMENT*/
,OVERDUE
,F_LM_AMT
,ORGL_AMT
,f_os_bal
,SHD_AMT
,DISB_PERC
FROM mortdata.mortgage_perfmc_&DATE
;
QUIT;
PROC APPEND BASE=P6MTH_BASE DATA=P_MORT;
RUN;
%MEND PDATE;
%PDATE(&P_Month_1);
%PDATE(&P_Month_2);
%PDATE(&P_Month_3);
%PDATE(&P_Month_4);
%PDATE(&P_Month_5);
%PDATE(&P_Month_6);
%PDATE(&P_Month_7);
/*** more code consuming work.P6MTH_BASE ***/
%mend consolidate;
%consolidate(31Mar2016);
%consolidate(30Apr2016);
%consolidate(31May2016);
%consolidate(30Jun2016);
%consolidate(31Jul2016);
%consolidate(31Aug2016);
%consolidate(30Sep2016);
%consolidate(31Oct2016);
%consolidate(30Nov2016);
%consolidate(31Dec2016);
Ideally you build your base table only once with all the required months. If not then you need to delete your base table at the beginning of each call of macro %consolidate().
I'll have to look a bit further into your code to propose something more detailed.
Question: Are your monthly source tables already sorted by AR_ID?
To add to my previous answer below some performance improved code.
Besides of no more replicating data the main tweaks are changes that reduce passes through the data and even more importantly avoid sorting the data. All your SQL's with left join, Group By and DISTINCT triggered implicit sorting.
What really would help if asking coding question is provision of some sample data an ALL the code (like the bit in the %include statement as well as any code that populates macro variables used).
I had to make a few assumption when testing the changes I've made especially in regards how date related macro variables and the month variable are populated.
I first mocked-up some sample source data
/** create sample data **/
options dlcreatedir;
libname mortdata "%sysfunc(pathname(work))\mortgage";
%let report_month=%sysfunc(intnx(month,%sysfunc(today()),-1,e),date9.);
%put &=report_month;
%macro prep_env(iter);
data _null_;
call symputx('yyyymm',put(intnx('month',"&report_month"d,-&iter),yymmn6.),'l');
run;
data mortdata.mortgage_perfmc_&yyyymm;
format month yymmn6.;
month=intnx('month',input("&yyyymm",yymmn6.),0,'e');
array num_vars {*} 8
ar_id
UNQ_ID_SRC_STM
UNQ_ID_SRC_STM2
MIA
F_DEFAULT
F_RR
MOB
OVERDUE
F_LM_AMT
ORGL_AMT
f_os_bal
SHD_AMT
DISB_PERC
(13*1)
;
do i=2 to dim(num_vars);
num_vars[i]=num_vars[i]+&iter;
end;
output;
ar_id=2;
output;
%if &iter ne 0 %then
%do;
ar_id=3;
output;
%end;
run;
%mend;
data _null_;
length cmd $100;
do i=0 to 8;
cmd=cats('%prep_env(',i,');');
rc=dosubl(cmd);
end;
run;
I then used this sample data to test my code changes
/********************************************************************************************************/
/**** below some performance tuned version for the logic you've shared ***/
/** create base table **/
/*%let report_month=31OCT2023;*/
%let report_month=%sysfunc(intnx(month,%sysfunc(today()),-1,e),date9.);
%let yyyymm_repmth=%sysfunc(inputn(&report_month,date9.),yymmn6.);
%put &=report_month;
%put &=yyyymm_repmth;
/* create list of source tables for the last 7 months based on report month */
proc sql noprint;
select cats(libname, '.', memname,'(keep=month ar_id mia overdue disb_perc)') as source_table into :source_tables separated by ' '
from dictionary.tables
where
libname='MORTDATA'
and memname like 'MORTGAGE^_PERFMC^_%' escape '^'
and input(substr(memname,length('MORTGAGE_PERFMC_')+1),yymmn6.) between intnx('month',"&report_month"d,-7,'e') and intnx('month',"&report_month"d,-0,'e')
order by source_table desc
;
quit;
%put Selected source tables 6 months prior to report month: &source_tables;
data work.p7mth_wide(keep=ar_id f_mia0-f_mia6 f_overdue0-f_overdue6 f_disb_perc0-f_disb_perc6 f_disb_perc_p7mth);
set &source_tables;
by ar_id descending month;
array _vars {3,7} f_mia0-f_mia6 f_overdue0-f_overdue6 f_disb_perc0-f_disb_perc6;
retain _vars _select_flg;
/* only select rows where ar_id exists in the source table for the report month */
if first.ar_id then
do;
if month="&report_month"d then _select_flg=1;
else _select_flg=0;
end;
if _select_flg ne 1 then delete;
_ind= intck('month',month,"&report_month"d)+1;
/* f_mia<ind> */
_vars[1,_ind]=max(mia,_vars[1,_ind]);
/* f_overdue<ind> */
_vars[2,_ind]=max(overdue,_vars[1,_ind]);
/* disb_perc<ind> */
if disb_perc >= 70 then _vars[3,_ind]=max(1,_vars[1,_ind]);
else _vars[3,_ind]=max(0,_vars[1,_ind]);
if last.ar_id then
do;
f_disb_perc_p7mth=sum(of f_disb_perc:);
output;
call missing(of _all_);
end;
run;
proc format;
invalue s_employer_typ (upcase)
'BCB GROUP','ABC GROUP','FINANCIAL INSTITUTION'
,'GOVT AGENCIES','KLSE LISTED','MNC - NON FI'
,'MNC (NON FI)','PUBLIC LIMITED COMPANY - NOT LISTED'
,'PUBLIC LIMITED COMPANY (NOT LISTED)'
= 71
'PRIVATE LIMITED COMPANIES','PRIVATE LIMITED COMPA' = 43
'NONE (UNEMPLOYED)','NONE / UNEMPLOYED'
,'OTHERS','SOLE PROPRIETORSHIP / PARTNERSH',
'SOLE PROPRIETORSHIP / PARTNERSHIP',
'SOLE PROPRIETORSHIP/PARTNERSHIP'
,'SOLE PROPRIETORSHIP /','' = 36
OTHER = 36
;
invalue s_gender (upcase)
'FEMALE', 'F' = 66
other = 43
;
invalue s_state (upcase)
'EAST MALAYSIA','WILAYAH PERSEKUTUAN') = 61
'PENANG','SELANGOR','SELANGOR DARUL EHSAN','NEGERI SELANGOR' = 51
'MELAKA / NEGERI SEMBILAN','PAHANG/PERAK' = 48
'JOHOR','','KEDAH/PERLIS','EAST COAST','A15','999','FOREIGN' = 29
OTHER = 29
;
run;
data mortdata.WANT_mortgage_perfmc_&yyyymm_repmth (drop = i);
merge mortdata.mortgage_perfmc_&yyyymm_repmth
work.p7mth_wide;
by ar_id;
format GRP $50.;
format SEGMENT $8.;
format curr_overdue_install 8.;
format entity $20.;
if F_DEFAULT=1 or F_RRP6MTH=1 then
GRP='Default';
else
do;
if F_ABANDONED = 1 or F_RRP12MTH = 1 or MIA>=2 then
GRP='MIA>=2';
else if MIA=1 then
GRP='MIA=1';
else if MIA<1 then
do;
if MOB=. then
GRP='Amodel';
else if disb_perc < 70 and MOB <= 36 then
GRP='Amodel';
else if F_DISB_PERC_P7MTH<7 and MOB <=36 then
GRP='Amodel';
else GRP='Bmodel';
end;
end;
/*Define PD*/
if GRP='Default' then
PD=100;
else if GRP='MIA>=2' then
PD=50.15;
else if GRP='MIA=1' then
PD=22.69;
else PD=0;/*To be define later*/
IF GRP = 'Amodel' then
do;
if c_firstccris_2 = -9999999 then
s_length_cch = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_length_cch = 15;
else if c_firstccris_2 = -888888 then
s_length_cch = 15;
else if c_firstccris_2 <= 24 then
s_length_cch = 15;
else if c_firstccris_2 <= 60 then
s_length_cch = 47;
else if c_firstccris_2 <= 78 then
s_length_cch = 55;
else s_length_cch = 72;
if max_cnt1abv_6m_all = -9999999 then
s_cnt_mia = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_cnt_mia = 18;
else if max_cnt1abv_6m_all = -888888 then
s_cnt_mia = 18;
else if max_cnt1abv_6m_all = 0 then
s_cnt_mia = 63;
else if max_cnt1abv_6m_all <= 0.5 then
s_cnt_mia = 40;
else s_cnt_mia = 18;
if recent_p12m_NR = -9999999 then
s_borr = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_borr = 28;
else if recent_p12m_NR = -888888 then
s_borr = 28;
else if recent_p12m_NR <= 0 then
s_borr = 60;
else if recent_p12m_NR <= 1 then
s_borr = 38;
else s_borr = 28;
if utilization_ratio_rev = -9999999 then
s_uti = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_uti = 31;
else if utilization_ratio_rev = -888888 then
s_uti = 31;
else if utilization_ratio_rev in (.,-999999) then
s_uti = 31;
else if utilization_ratio_rev <= 0.3 then
s_uti = 69;
else if utilization_ratio_rev <= 0.6 then
s_uti = 48;
else if utilization_ratio_rev <= 0.9 then
s_uti = 38;
else s_uti = 31;
if undrawn_ratio_nonrev = -9999999 then
s_undrawn = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_undrawn = 35;
else if undrawn_ratio_nonrev = -888888 then
s_undrawn = 35;
else if undrawn_ratio_nonrev = . then
s_undrawn = 55;
else if undrawn_ratio_nonrev = -999999 then
s_undrawn = 35;
else if undrawn_ratio_nonrev <= 0.05 then
s_undrawn = 35;
else if undrawn_ratio_nonrev <= 0.25 then
s_undrawn = 47;
else if undrawn_ratio_nonrev <= 0.40 then
s_undrawn = 53;
else s_undrawn = 55;
if max_limit_rev_income = -9999999 then
s_max_limit = 48;
else if saa = 1 or Number_of_adverse_Credit > 0 then
s_max_limit = 40;
else if max_limit_rev_income = -888888 then
s_max_limit = 40;
else if max_limit_rev_income <= 0.5 then
s_max_limit = 40;
else if max_limit_rev_income <= 1.5 then
s_max_limit = 46;
else if max_limit_rev_income <= 2.5 then
s_max_limit = 52;
else s_max_limit = 59;
if ccris_ind_new = 1 then
s_ccris_ind = 54;
else s_ccris_ind = 20;
if ltv_re_pr = . then
s_ltv = 29;
else if ltv_re_pr <= 75 then
s_ltv = 61;
else if ltv_re_pr <= 95 then
s_ltv = 48;
else s_ltv = 29;
s_state = input(pr_state_grouped,s_state.);
s_employer=input(employer_typ,s_employer_typ.);
s_gender =input(gender,s_gender.);
if joint_app = 'Y' then
s_joint = 80;
else s_joint = 37;
if marital_dep in ('Widowed','Single','Separated','Divorced') then
s_marital= 34;
else if marital_dep in ('Married1','Married0','Married.') then
s_marital= 53;
else if marital_dep in ('Married2','Married3','Married4','Married5') then
s_marital= 58;
else s_marital= 34;
if occupation_grouped_1 in ('Professional') then
s_occu = 69;
else if occupation_grouped_1 in ('Management','Executive') then
s_occu = 54;
else if occupation_grouped_1 in ('Armed forces','Clerical','Housewife','No Investigatio'
,'Self Employed','Skilled','Student','Unemployed','Unskilled','') then
s_occu = 40;
else s_occu = 40;
Ascore = sum(s_length_cch,s_cnt_mia,s_borr,s_uti,s_undrawn,s_max_limit,s_ccris_ind,s_ltv,s_state,s_employer,s_gender,s_joint,s_marital,s_occu);
if Ascore<=630 then
PD=0.93;
else if 630<Ascore<=665 then
PD=0.61;
else if 665<Ascore<=700 then
PD=0.27;
else if 700<Ascore<=745 then
PD=0.14;
else if Ascore>745 then
PD=0.05;
if Ascore<=630 then
segment= "A5";
else if 630<Ascore<=665 then
segment= "A4";
else if 665<Ascore<=700 then
segment= "A3";
else if 700<Ascore<=745 then
segment= "A2";
else if Ascore>745 then
segment= "A1";
END;
if GRP = 'Bmodel' then
do;
array cnt6{6} F_MIA7-F_MIA12;
array f_overdue{6} F_OVERDUE7-F_OVERDUE12;
array dueAMTFIN{6} dueamtfin7-dueamtfin12;
do i = 1 to 6; /* number of times >0 */
/* if SHD_AMT^=0 then dueINSTAL{i} = overdue{i}/SHD_AMT;*/
/* else dueINSTAL{i}=0;*/
if F_LM_AMT^=0 then
dueAMTFIN{i} = f_overdue{i}/F_LM_AMT;
else dueAMTFIN{i}=0;
/* if cls_cr_lmt^=0 then dueAMTFINANCE{i} = f_overdue{i}/cls_cr_lmt;*/
/* else dueAMTFINANCE{i}=0;*/
end;
/* Maximum Total amount Overdue over limit in past 6 mths */
Max_odue_limit = max(dueamtfin7,dueamtfin8,dueamtfin9,dueamtfin10,dueamtfin11,dueamtfin12);
/* Maximum Total amount Overdue over limit in past 6 mths */
/*Max_odue_limit1 = max(dueAMTFINANCE1,dueAMTFINANCE2,dueAMTFINANCE3,dueAMTFINANCE4,dueAMTFINANCE5,dueAMTFINANCE6);*/
if Max_odue_limit = 0 then
s_Max_odue_limit=155;
else if Max_odue_limit <= 0.005 then
s_Max_odue_limit=118;
else if Max_odue_limit <= 0.01 then
s_Max_odue_limit=89;
else if Max_odue_limit > 0.01 then
s_Max_odue_limit=41;
/*if Max_odue_limit1 = 0 then s_Max_odue_limit1=155;*/
/*else if Max_odue_limit1 <= 0.005 then s_Max_odue_limit1=118;*/
/*else if Max_odue_limit1 <= 0.01 then s_Max_odue_limit1=89;*/
/*else if Max_odue_limit1 > 0.01 then s_Max_odue_limit1=41;*/
IF MAX(F_MIA7,F_MIA8,F_MIA9,F_MIA10,F_MIA11,F_MIA12) > 0 then
s_MIA_P6MTH = 80;
ELSE IF MAX(F_MIA7,F_MIA8,F_MIA9,F_MIA10,F_MIA11,F_MIA12) = 0 then
s_MIA_P6MTH = 131;
ELSE s_MIA_P6MTH = 80;
if SHD_AMT^=0 then
curr_overdue_install = overdue/SHD_AMT;
else curr_overdue_install=0;
if curr_overdue_install = 0 then
s_curr_overdue_install=135;
else if curr_overdue_install <= 0.5 then
s_curr_overdue_install=84;
else if curr_overdue_install > 0.5 then
s_curr_overdue_install=67;
if sum_apportioned_collateral_value <= 75000 then
s_coll=103;
else if sum_apportioned_collateral_value <= 125000 then
s_coll=115;
else if sum_apportioned_collateral_value <= 225000 then
s_coll=125;
else if sum_apportioned_collateral_value <= 375000 then
s_coll=135;
else s_coll=151;
if F_1_DIS_DT le '30JUN2006'd then
entity='BCB/SBB';
else entity='ABC';
if entity = "ABC" then
s_entity=130;
else s_entity = 100;
Bscore = sum(s_Max_odue_limit,s_MIA_P6MTH,s_curr_overdue_install,s_coll,s_entity);
if Bscore<=530 then
PD=11.57;
else if 530<Bscore<=630 then
PD=2.10;
else if 630<Bscore<=660 then
PD=0.57;
else if 660<Bscore<=680 then
PD=0.27;
else if Bscore>680 then
PD=0.16;
if Bscore<=530 then
segment= "B5";
else if 530<Bscore<=630 then
segment= "B4";
else if 630<Bscore<=660 then
segment= "B3";
else if 660<Bscore<=680 then
segment= "B2";
else if Bscore>680 then
segment= "B1";
END;
/* keep month*/
/* AR_ID*/
/* UNQ_ID_SRC_STM*/
/* UNQ_ID_SRC_STM2*/
/* F_ACNO*/
/* AR_TP_ID*/
/* AR_FRQ_TP_ID*/
/* AR_PPS_TP_ID*/
/* AR_CR_RSK_RTG_ID*/
/* AR_FNC_ST_TP_ID*/
/* STATUS*/
/* OPEN_DT*/
/* MATURE_DT*/
/* CLOSED_DT*/
/* (and many other columns)*/
;
run;
You need also to verify if you've overwritten your source data by mistake. What I find in the code you've shared:
Name of source table:
Name of target table:
If the date strings used for the source and target table don't have a different pattern then you're overwriting your source tables - and your code logic replicates data.
I've added to the code I've shared also a change that shows you how you could replace some of your if/then/else logic with an informat. In my experience using formats/informats for such cases often helps to "declutter" code and make it easier to maintain.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.