Interesting. This is the log when I ran your test on my server:
35   data _null_;   * 7s CPU/Elapse;
36     set TEST;
37     PAY_LAG1 = lag (PAYMENT);
38     PAY_LAG2 = lag2(PAYMENT);
39   run;

NOTE: There were 100000000 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
      real time           44.22 seconds
      cpu time            5.75 seconds

40
41   data _null_;   * 5s CPU/Elapse;

                              2    Das SAS System    08:26 Thursday, October 25, 2018

42     retain PAY_LAG1;
43     PAY_LAG2 = PAY_LAG1;
44     PAY_LAG1 = PAYMENT;
45     set TEST;
46   run;

NOTE: There were 100000000 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
      real time           46.69 seconds
      cpu time            4.88 seconds
SAS 9.4M5, AIX 7.1, 2 POWER8 cores, MEMSIZE 256M (that's why SASFILE won't work).
All this ran with a fairly constant run queue of around 8, and you can see that the potential gain in real time is negligible (in this particular test there was none), because the I/O factor is much bigger.
After reducing the observation count to 1e7, SASFILE became possible, and I got this:
35   data _null_;   * 7s CPU/Elapse;
36     set TEST;
37     PAY_LAG1 = lag (PAYMENT);
38     PAY_LAG2 = lag2(PAYMENT);
39   run;

NOTE: There were 10000000 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
      real time           3.97 seconds
      cpu time            0.54 seconds

40
41   data _null_;   * 5s CPU/Elapse;
42     retain PAY_LAG1;
43     PAY_LAG2 = PAY_LAG1;
44     PAY_LAG1 = PAYMENT;
45     set TEST;
46   run;

NOTE: There were 10000000 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
      real time           3.53 seconds
      cpu time            0.44 seconds
Once again, the difference in CPU seconds lies in the 10-15% range.
What environment do you use for SAS?
I am testing under Windows. I can post the log tomorrow, but I don't think there is anything to be learned from it.
>as the factor I/O is much bigger
The idea is to test whether retrieving previous values with LAG() is slower than saving values before they are overwritten by SET.
It may well be that, depending on other parameters, this difference in efficiency makes no difference in various cases.
That is beside the point. That's why I used data _null_ and SASFILE: to isolate the process we are measuring.
Once again, the difference in CPU seconds lies in the 10-15% range.
Your numbers show CPU-time increases of 17% and 22% when using LAG. Not as high as in my test (maybe due to the heavy load on your server?), but enough to conclude that LAG is less efficient, imho.
MEMSIZE 256M
Oh wow that's low! So much CPU power so little RAM space!
@ChrisNZ wrote:
MEMSIZE 256M
Oh wow that's low! So much CPU power so little RAM space!
That follows a recommendation from SAS themselves. If you allow SAS lots of memory in a heavily multi-user setup, SAS will (in most cases, with data steps and "simple" procedures that do not need much memory themselves) "waste" memory by caching SAS dataset data internally, even though the operating system already does that on its own. In the extreme, the OS would then have to page out program pages that actually contain data which is already present in the portion the OS reserves for caching disk data. Double whammy. Restricting SAS memory prevents that and makes the most of the operating system's capabilities in terms of read-ahead, sharing cached data, and the like.
So we work with the least memory that's necessary for our codes to work.
At the time I ran my tests, our monthly closure was running, so the accountants and actuaries were riding the server hardest, pushing up the run queue and the I/O load.
The batch programs I develop are also meant to run in parallel, so using memory economically is also a must.
Remember, in this application there are several variables that need to be lagged 5 times. This might be a more appropriate test of the savings:
data want1;
  set have;
  pay_lag1 = lag(payment);
  pay_lag2 = lag2(payment);
  pay_lag3 = lag3(payment);
  pay_lag4 = lag4(payment);
  pay_lag5 = lag5(payment);
run;
data want2;
  pay_lag5 = pay_lag4;
  pay_lag4 = pay_lag3;
  pay_lag3 = pay_lag2;
  pay_lag2 = pay_lag1;
  pay_lag1 = payment;
  set have;
  retain pay_lag1-pay_lag5;
run;
Yes, I know I don't need to retain PAY_LAG5. But that can only help the program run a hair faster.
With 5 lags vs. 5 retains, the difference becomes significant. With 3e7 observations I got
real time 3.83 seconds cpu time 2.13 seconds
for the lag() vs.
real time 2.32 seconds cpu time 1.29 seconds
for the retain version. All tests ran from a dataset loaded with sasfile and used in a data _null_ step, so I/O was mostly removed as a factor.
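For reference, the test harness sketched below shows how such a benchmark can be set up with SASFILE; the dataset name WORK.HAVE matches the steps above, but the exact setup I used may have differed in details:

```sas
/* Sketch: pin the data set in memory so the timed steps    */
/* read from RAM instead of disk (needs enough MEMSIZE).    */
sasfile work.have load;

data _null_;              /* timed step, no output dataset  */
  set have;
  pay_lag1 = lag(payment);
run;

sasfile work.have close;  /* release the buffered pages     */
```

With the data pinned by SASFILE and data _null_ suppressing output, the remaining CPU time is almost entirely the statement overhead being compared.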
I hereby revoke my former statement about SAS functions being optimized.
I then added a step with a temporary array:
data _null_;
  set have;
  array pay_lag{5} _temporary_;
  do i = 1 to 4;
    pay_lag{i} = pay_lag{i+1};
  end;
  pay_lag{5} = payment;
run;
and this turned out to be the slowest of all, more than double the CPU time of the retain version.
Who's the culprit here? It turns out to be the DO loop, because (after digging out the old C programmer in me, who uses #defines to speed up the compiled code)
data _null_;
  set have;
  array pay_lag{5} _temporary_;
  %macro mymac;
    %do i = 1 %to 4;
      pay_lag{&i} = pay_lag{%eval(&i+1)};
    %end;
  %mend;
  %mymac
  pay_lag{5} = payment;
run;
performed as well as the retain version.
Bottom line: if you engage in optimizing code in this way, creating lots of statements with a macro can save significantly in runtime.
1. if you engage in optimizing code in this way, creating lots of statements with a macro can save significantly in runtime.
Top notch optimisation research here Kurt. Well done! 🙂
2. SAS will "waste" memory by caching SAS dataset data internally, when the operating system already does that on its own
- I've never heard of this before. As far as I know, a step allocates virtual memory based on best guess upon starting, requests more as/if needed until the value of ~MEMSIZE is reached, and then uses utility files if that's a feature supported by that step. At the end of the step, all the virtual memory is released.
- The latest default MEMSIZE value seems to be a more reasonable 2G (it was 512M in version 9.3).
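For anyone who wants to check what their own installation is running with, a quick sketch:

```sas
/* Print the current MEMSIZE setting to the log. */
proc options option=memsize;
run;
```

MEMSIZE itself can only be set at invocation (command line or config file), not changed mid-session.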
I think the temporary array idea can work, and I do not think you have to initialize the array for each account. Just keep track of the number of records you have read from that account, something like
data want;
  array dpd_lag(100) 8 _temporary_;
  array pay_lag(100) 8 _temporary_;
  array prin_lag(100) 8 _temporary_;
  do _N_ = 1 by 1 until (last.account_number);
    set have;
    by account_number;
    /* assuming that you want to do the calculation for loan_count=1 as well,
       with no lag, I put the current variables in also */
    dpd_lag(_N_) = dpd;
    pay_lag(_N_) = payment_amount;
    prin_lag(_N_) = payment;
    if loan_count > _N_ then /* would be the same as getting the lag from a previous account */
      error 'Wrong loan count';
    else do;
      lag_index = _N_ - loan_count + 1;
      /* do the calculation here, using dpd_lag(lag_index) etc. as the lagged vars */
      /* I assume the calculation is basically the same for all loan counts */
      output;
    end;
  end;
run;
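To try the step out, a hypothetical toy input can be generated like this; the variable names match the step above, but the value ranges are made up:

```sas
/* Sketch: two accounts, five records each, loan_count    */
/* increasing within each account as the step expects.    */
data have;
  do account_number = 1 to 2;
    do loan_count = 1 to 5;
      dpd = int(ranuni(1) * 30);                  /* days past due   */
      payment_amount = round(ranuni(1) * 1000, .01);
      payment = payment_amount;                   /* principal part  */
      output;
    end;
  end;
run;
```

Because the DOW loop restarts _N_ at 1 for every BY group, the arrays never need re-initializing between accounts; stale slots from the previous account are simply overwritten before they can be read.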