Turn on OPTIONS MPRINT FULLSTIMER; before running the job next time.
READ the log to find the steps that take a lot of time.
You might find that a data step runs faster than this:
PROC SQL;
SYSECHO "Formatting 4/4 - ANLY_IND 1 - Step 2/4 - Sub Step &k./&COUNT_ANLY_1.";
CREATE TABLE ranked_out(drop=N Var_Rank) AS
SELECT *,
CASE WHEN N = . THEN 'Missing'
WHEN N <= CEIL(MIN(N) + (MAX(N) - MIN(N) + 1)* 1/100 - 1) THEN '0% to 1%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 5/100 - 1 THEN '1% to 5%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 10/100 - 1 THEN '5% to 10%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 25/100 - 1 THEN '10% to 25%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 50/100 - 1 THEN '25% to 50%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 75/100 - 1 THEN '50% to 75%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 90/100 - 1 THEN '75% to 90%'
WHEN N <= MIN(N) + (MAX(N) - MIN(N) + 1)* 95/100 - 1 THEN '90% to 95%'
WHEN N <= FLOOR(MIN(N) + (MAX(N) - MIN(N) + 1)* 99/100 - 1) THEN '95% to 99%'
ELSE '99% to 100%'
END AS VAR_SEG /* Percentile Segmentation variable */
FROM ranked
;
QUIT;
The labels in the above step put almost every range end point into 2 ranges. So when you have exactly 75%, which range should a value go into? (As coded, the first WHEN that matches wins, so the value lands in the lower range, but the labels do not tell the reader that.)
You are apparently rerunning Proc format code multiple times to create same-named formats with values that do not change:
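Going back to the data step suggestion, here is a rough sketch of what that could look like. It assumes N holds integer ranks and that RANKED contains the N and Var_Rank variables dropped in the SQL step; the macro variables min_n and max_n and the helper variable _span are names made up for this illustration. The idea is to get MIN and MAX once instead of having SQL remerge summary statistics back onto every row. Only FULLSTIMER on your data will tell you whether it is actually faster.

```sas
/* Sketch only: assumes N is an integer rank, so the macro variable */
/* values carry full precision.                                     */
proc sql noprint;
   select min(N), max(N)
      into :min_n trimmed, :max_n trimmed
      from ranked;
quit;

data ranked_out (drop=N Var_Rank _span);
   set ranked;
   length VAR_SEG $11;                 /* longest label: '99% to 100%' */
   _span = &max_n - &min_n + 1;        /* hypothetical helper variable */
   if missing(N) then VAR_SEG = 'Missing';
   else if N <= ceil(&min_n + _span*1/100 - 1)   then VAR_SEG = '0% to 1%';
   else if N <= &min_n + _span*5/100  - 1        then VAR_SEG = '1% to 5%';
   else if N <= &min_n + _span*10/100 - 1        then VAR_SEG = '5% to 10%';
   else if N <= &min_n + _span*25/100 - 1        then VAR_SEG = '10% to 25%';
   else if N <= &min_n + _span*50/100 - 1        then VAR_SEG = '25% to 50%';
   else if N <= &min_n + _span*75/100 - 1        then VAR_SEG = '50% to 75%';
   else if N <= &min_n + _span*90/100 - 1        then VAR_SEG = '75% to 90%';
   else if N <= &min_n + _span*95/100 - 1        then VAR_SEG = '90% to 95%';
   else if N <= floor(&min_n + _span*99/100 - 1) then VAR_SEG = '95% to 99%';
   else VAR_SEG = '99% to 100%';
run;
```

The cut points are copied from the SQL step unchanged, so the question about which range an end point belongs to applies here as well.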
%if (&type. in (1, 5)) %then %do;
proc format;
value POP_FMT (multilabel)
. = "Missing"
other = "Populated"
;
run;
%end;
%else %if (&type. = 4) %then %do;
proc format;
value $ POP_FMT (multilabel)
' ' = "Missing"
other = "Populated"
;
run;
%end;
Create the formats ONE time. If you are doing this at all often, then the formats likely belong in a permanent library that you add to your FMTSEARCH path, so you quit wasting CPU cycles on repetitive code.
I think that it might be time to actually look at what Proc summary can do for counting multiple variables and examining _TYPE_ values. I think you are going back through the same data repeatedly to get a summary for a single variable each time, when that is not needed.
What is this supposed to actually accomplish? I do not have the hours of time to parse the code, and without input data I cannot examine what the results could possibly be...
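A minimal sketch of the permanent-library approach, assuming a libref named MYFMTS and a placeholder path (both made up here; use whatever fits your site). The numeric and character formats are separate catalog entries, so both can be created in the same one-time step:

```sas
/* One-time setup. MYFMTS and the path are hypothetical. */
libname myfmts "/path/to/permanent/formats";

proc format library=myfmts;
   value POP_FMT (multilabel)
      .     = "Missing"
      other = "Populated"
   ;
   value $POP_FMT (multilabel)
      ' '   = "Missing"
      other = "Populated"
   ;
run;

/* In each later session: assign the libref and add it to the search path. */
/* WORK and LIBRARY are still searched first by default.                    */
options fmtsearch=(myfmts);
```

With that in place the %if/%else blocks that rebuild the format can go away, and the macro only needs to pick POP_FMT. or $POP_FMT. based on the variable type.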
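As a sketch of the one-pass idea: the data set name HAVE, the output name VAR_COUNTS, and the variable names below are placeholders, and the FORMAT statement assumes the POP_FMT formats shown earlier already exist. WAYS 1 asks for each CLASS variable on its own, and _TYPE_ tells you which variable each output row belongs to.

```sas
/* Sketch only: HAVE, VAR_COUNTS and the variable names are hypothetical. */
proc summary data=have;
   class num_var1 num_var2 char_var1 / missing;  /* keep missing levels    */
   format num_var1 num_var2 POP_FMT.             /* Missing vs Populated   */
          char_var1 $POP_FMT.;
   ways 1;                        /* one-way summaries only, no crossings  */
   output out=var_counts;         /* _FREQ_ holds the count for each level */
run;

/* With three CLASS variables, _TYPE_=4 rows summarize num_var1, */
/* _TYPE_=2 rows num_var2 and _TYPE_=1 rows char_var1.           */
proc print data=var_counts;
run;
```

That reads the input once for all of the variables instead of once per variable.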
How big is the input data set, in terms of records and variables?
SAS can easily run into IO issues. A slow network, slow disks, or code that reads and writes data to disk more often than needed can all cause performance problems.
I would say that you have considerable wasted cycles with your Iterm macro. It creates global macro variables that are invariant (they don't change values in any way; you may just use fewer of them at some point), so create the list ONE time in the session. You are calling that sucker in practically every single macro.
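One possible way to do that is a small wrapper with a guard flag. This is only a sketch: I do not know what parameters Iterm takes, so the call below is shown bare, and the names iterm_once and iterm_loaded are made up for this example.

```sas
/* Sketch only: iterm_once and iterm_loaded are hypothetical names. */
%macro iterm_once;
   %global iterm_loaded;
   %if &iterm_loaded ne 1 %then %do;
      %Iterm                      /* add whatever arguments Iterm needs */
      %let iterm_loaded = 1;      /* later calls skip the rebuild       */
   %end;
%mend iterm_once;
```

Replace the calls to Iterm inside the other macros with %iterm_once, or simply call Iterm once at the top of the program and remove the other calls.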