Hello,
I am in the process of trying to recreate an Excel workbook in SAS, and I am struggling to find the best solution. For reference, this is the output I am trying to recreate:
I've created a MACRO that basically does everything that I want, however, this MACRO takes 20+ minutes to run when there are ~300+ branches in a data file. Here is the code that I created (I whipped this together earlier this morning so I am sure could make this more efficient):
%MACRO delinquency_by_branch ; /* program timer start for performance testing */ %let _timer_start = %sysfunc(DATETIME()) ; PROC SQL ; SYSECHO "Creating Delinquency By Branch" ; CREATE TABLE t_branch_names AS SELECT DISTINCT COMPRESS(&_branch_name., "'") AS &_branch_name. FROM work.t_&client_number._&medte._analysis_extract ORDER BY &_branch_name. ; QUIT ; DATA _NULL_ ; SYSECHO "Creating Delinquency By Branch" ; SET t_branch_names END=last ; CALL SYMPUTX("_branch_num"||LEFT(PUT(_N_, best8.)), TRIM(&_branch_name.), "L") ; IF last THEN DO ; CALL SYMPUT("_total", LEFT(PUT(_N_, best32.))) ; END ; RUN ; %do i=1 %to &_total. ; PROC SQL ; SYSECHO "Creating Delinquency By Branch" ; CREATE TABLE work.t_branch_dq_classify&i._1 AS SELECT &_branch_name. format=$100. , CONTRACTUAL_DELINQUENCY_c , COUNT(*) format=comma7. AS COUNT , SUM(&_gross_balance.) format=dollar18.2 AS &_gross_balance. FROM work.t_&client_number._&medte._analysis_extract WHERE UPCASE(&_branch_name.) = "%upcase(&&_branch_num&i.)" GROUP BY &_branch_name. , CONTRACTUAL_DELINQUENCY_c ORDER BY &_branch_name. , CONTRACTUAL_DELINQUENCY_c ; QUIT ; PROC SQL ; SYSECHO "Creating Delinquency By Branch" ; CREATE TABLE work.t_branch_dq_classify&i._2 AS SELECT "Total Branch Outstandings" AS &_branch_name. format=$100. , "" AS CONTRACTUAL_DELINQUENCY_c , COUNT(*) format=comma7. AS COUNT , SUM(&_gross_balance.) format=dollar18.2 AS &_gross_balance. FROM work.t_&client_number._&medte._analysis_extract WHERE UPCASE(&_branch_name.) = "%upcase(&&_branch_num&i.)" GROUP BY &_branch_name. ORDER BY &_branch_name. ; QUIT ; PROC SQL OUTOBS=1 ; SYSECHO "Creating Delinquency By Branch" ; CREATE TABLE work.t_branch_dq_classify&i._3 AS SELECT "Total Branch Delinquency" AS &_branch_name. format=$100. , "" AS CONTRACTUAL_DELINQUENCY_c , (SELECT COUNT(*) FROM work.t_&client_number._&medte._analysis_extract WHERE CONTRACTUAL_DELINQUENCY_c IN (" 30", " 60", " 90", "120", "150", "180", "210") AND UPCASE(&_branch_name.) = "%upcase(&&_branch_num&i.)") format=comma7. AS COUNT , (SELECT SUM(&_gross_balance.) FROM work.t_&client_number._&medte._analysis_extract WHERE CONTRACTUAL_DELINQUENCY_c IN (" 30", " 60", " 90", "120", "150", "180", "210") AND UPCASE(&_branch_name.) = "%upcase(&&_branch_num&i.)") format=dollar18.2 AS &_gross_balance. FROM work.t_&client_number._&medte._analysis_extract GROUP BY &_branch_name. ORDER BY &_branch_name. ; QUIT ; PROC SQL OUTOBS=1 ; SYSECHO "Creating Delinquency By Branch" ; CREATE TABLE work.t_branch_dq_classify&i._4 AS SELECT " " AS &_branch_name. format=$100. , " " AS CONTRACTUAL_DELINQUENCY_c , . AS COUNT , . AS &_gross_balance. FROM work.t_&client_number._&medte._analysis_extract ; QUIT ; DATA work.t_branch_delinq_classify_&i. ; SYSECHO "Creating Delinquency By Branch" ; LENGTH &_branch_name. $ 100 ; SET work.t_branch_dq_classify&i._1 work.t_branch_dq_classify&i._2 work.t_branch_dq_classify&i._3 work.t_branch_dq_classify&i._4 ; RUN ; PROC DELETE LIBRARY=work DATA=t_branch_dq_classify&i._1 t_branch_dq_classify&i._2 t_branch_dq_classify&i._3 t_branch_dq_classify&i._4 (MEMTYPE=data) ; SYSECHO "Creating Delinquency By Branch" ; RUN ; %end ; PROC SQL NOPRINT; SYSECHO "Creating Delinquency By Branch" ; SELECT DISTINCT memname INTO :_branch_list SEPARATED BY " " FROM dictionary.tables WHERE UPCASE(libname) = "WORK" AND UPCASE(memname) CONTAINS "T_BRANCH_DELINQ_CLASSIFY" ; QUIT ; DATA work.f_final_branch_classify ; SYSECHO "Creating Delinquency By Branch" ; SET &_branch_list. ; RUN ; PROC DELETE LIBRARY=work DATA=t_branch_names &_branch_list. (MEMTYPE=data) ; SYSECHO "Creating Delinquency By Branch" ; RUN ; /* program timer end */ DATA _NULL_ ; SYSECHO "Program Timer Stopping" ; dur = DATETIME() - &_timer_start. ; PUT 100*"-" / / " PROGRAM RUNTIME:" dur time13. / / 100*"-" ; RUN ; %MEND delinquency_by_branch ;
And this is the output data from the above MACRO program:
I know that I wont be able to do this in PROC REPORT without "pre-processing" the data, but I am unsure exactly how to do this. I would either like to make my macro program more efficient if possible, perform less preprocessing and let PROC REPORT do the rest, or use PROC TABULATE.
Can I create multiple summary rows in PROC TABULATE?
Any help is much appreciated!!
work.t_&client_number._&medte._analysis_extract is the full data set with 300+ variables, with none of them other than the ones posted being relevant.
I use a driver/control file that holds key variable names across 100+ clients that helps make my macro programs more dynamic (I hope). and those two variables are such.
%let _branch_name = BRANCH_NAME_c ;
%let _gross_balance = current_gross_balance ;
The other macro variables are completely irrelevant towards the question at hand, but because you asked &client_number. is an internal code to identify a specific client and &medte. is a date variable in the form of MAR2021.
May you please share your source dataset or a sample of it.
DATA have ;
LENGTH BRANCH_NAME_c $ 20 ;
INFILE DATALINES DELIMITER="," ;
INPUT BRANCH_NAME_c $ CONTRACTUAL_DELINQUENCY_c $ current_gross_balance ;
DATALINES ;
ABBEVILLE LA, 30, 125412.26
AUBURN AL, 60, 3636.25
ABBEVILLE LA, 0, 4561.30
ABBEVILLE LA, 0, 100.00
ABBEVILLE LA, 90, 110.00
AUBURN AL, 90, 147.25
ABBEVILLE LA, 0, 45888.20
ABBEVILLE LA, 0, 1.66
ABBEVILLE LA, 30, 2580.00
ABBEVILLE LA, 30, 125412.26
AUBURN AL, 0, 12.98
ABBEVILLE LA, 0, 999.26
AUBURN AL, 90, 6489.26
ABBEVILLE LA,120, 8524.99
AUBURN AL,180, 125412.26
ABBEVILLE LA,180, 125412.26
ABBEVILLE LA,180, 12.26
ABBEVILLE LA,150, 412.99
ABBEVILLE LA,210, 125412.26
AUBURN AL, 0, 125412.26
ABBEVILLE LA, 0, 125412.26
ABBEVILLE LA, 30, 125412.28
ABBEVILLE LA, 30, 5412.88
AUBURN AL, 30, 1512.71
ABBEVILLE LA, 30, 2512.50
AUBURN AL, 30, 2541.55
ABBEVILLE LA, 30, 5412.54
ABBEVILLE LA, 30, 412.89
AUBURN AL, 30, 125.92
ABBEVILLE LA, 0, 25412.11
ABBEVILLE LA, 60, 12412.26
ABBEVILLE LA, 0, 15412.26
AUBURN AL, 0, 12541.26
ABBEVILLE LA,120, 12412.26
ABBEVILLE LA, 90, 15630.36
ABBEVILLE LA, 30, 5.50
;
Your macro code includes references to data sets like
work.t_&client_number._&medte._analysis_extract
Is this set supposed to be the same as your "HAVE" example?
The macro variables &_branch_name , &_gross_balance appear to reference variables names but no idea what they should be.
but does not show where any of the macro variables &client_number &medte &_branch_name &_branch_name &_gross_balance are set.
I'm somewhat lazy about completely rewriting code for undocumented variables and data set names.
Better practice is to include these as macro parameters .
work.t_&client_number._&medte._analysis_extract is the full data set with 300+ variables, with none of them other than the ones posted being relevant.
I use a driver/control file that holds key variable names across 100+ clients that helps make my macro programs more dynamic (I hope). and those two variables are such.
%let _branch_name = BRANCH_NAME_c ;
%let _gross_balance = current_gross_balance ;
The other macro variables are completely irrelevant towards the question at hand, but because you asked &client_number. is an internal code to identify a specific client and &medte. is a date variable in the form of MAR2021.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.