08-09-2017 06:26 AM
PROC HPNLMOD and PROC NLMIXED: Which one is faster and better for model estimation?
Initially I thought the process had hung when I ran PROC HPNLMOD, but it had not. Rather, it was taking longer than the PROC NLMIXED estimation of the same model.
Why is PROC HPNLMOD taking longer than NLMIXED, in real time as well as in CPU time? Despite being a high-performance procedure and using 4 CPUs instead of 1, HPNLMOD is faster only if my dataset is smallish (about 100,000 obs and 13 variables). With a dataset of about 200,000 obs, its CPU time is much longer, and its real time only a little shorter, than NLMIXED's. And for 2 million obs, HPNLMOD is *slower* in real time (and also in CPU time).
This doesn't make sense to me, as HPNLMOD is supposed to be a high-performance procedure compared to NLMIXED. Is there something I am not understanding about system parameters or constraints that prevents me from obtaining the performance benefits of using HPNLMOD instead of NLMIXED?
I am running my estimation routine on SASGrid (9.4V) from EG 7.12.
Any help would be greatly appreciated...
08-09-2017 06:20 PM
Are both procedures running through Grid?
One pure guess is that the convergence criteria may differ, with HPNLMOD possibly using defaults that require more iterations.
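To rule this out, one option is to set the same criteria explicitly in both procedures and compare the iteration histories. A minimal sketch, assuming a hypothetical dataset work.mydata with response y and predictor x1 and a simple normal model; the GCONV= and MAXITER= values are arbitrary and none of these names come from your post:

```sas
/* Hypothetical data set, variables, and model; tolerance values are arbitrary. */
/* The point is only that both procedures get identical, explicit criteria.     */
proc nlmixed data=work.mydata gconv=1e-8 maxiter=200;
   parms b0=0 b1=0 s2=1;
   bounds s2 > 0;
   mu = b0 + b1*x1;
   model y ~ normal(mu, s2);
run;

proc hpnlmod data=work.mydata gconv=1e-8 maxiter=200;
   parms b0=0 b1=0 s2=1;
   bounds s2 > 0;
   mu = b0 + b1*x1;
   model y ~ normal(mu, s2);
run;
```

If the iteration counts still differ a lot with matched criteria, the difference is more likely in the optimization technique than in the stopping rule.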
08-14-2017 11:20 AM
Yes, both are running on Grid. They are each running with their default optimization method. I believe they are using the same convergence criterion, but I will check that. I’ve tried other optimization methods with each, but the default has proven to be faster than others.
08-15-2017 09:45 AM
We'd have to see your syntax to make a guess, but recall that the default solution method in PROC HPNLMOD is optimization of a least squares regression model, whereas PROC NLMIXED uses maximum likelihood estimation. Depending on your data and the model, you might be seeing differences due to the underlying solution technique rather than the procedures themselves. You can read about the advantages and disadvantages of the different optimization algorithms. PROC HPNLMOD uses the LEVMAR technique by default, and the doc says "for large problems, it consumes more memory and takes longer than the other techniques."
Both procedures support a TECHNIQUE= option where you can specify the solution technique. By default, PROC NLMIXED uses TECHNIQUE=QUANEW for quasi-Newton optimization. Try switching to TECHNIQUE=QUANEW in PROC HPNLMOD and see what happens.
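Since I haven't seen your syntax, here is only a rough sketch of what I mean. The dataset work.mydata, the variables y and x1, the parameters, and the normal log likelihood written out by hand are all hypothetical placeholders for your own model:

```sas
/* Hypothetical data set, variables, and likelihood; substitute your own model. */
proc hpnlmod data=work.mydata technique=quanew;
   parms b0=0 b1=0 s2=1;
   bounds s2 > 0;
   mu = b0 + b1*x1;
   /* Normal log density; 6.283185307 approximates 2*pi */
   ll = -0.5*log(6.283185307*s2) - 0.5*(y - mu)**2/s2;
   model y ~ general(ll);
   /* Request 4 threads and a timing breakdown of the procedure steps */
   performance nthreads=4 details;
run;

/* Same model and technique in PROC NLMIXED for a like-for-like timing comparison */
proc nlmixed data=work.mydata technique=quanew;
   parms b0=0 b1=0 s2=1;
   bounds s2 > 0;
   mu = b0 + b1*x1;
   ll = -0.5*log(6.283185307*s2) - 0.5*(y - mu)**2/s2;
   model y ~ general(ll);
run;
```

With the technique held fixed in both procedures, any remaining gap in real time and CPU time is easier to attribute to the procedures themselves.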
09-08-2017 04:17 AM
Yes, I have found that the choice of technique affects run time for both procedures.
The documentation for HPNLMOD says that the LEVMAR technique is the default, but in my experience, the procedure always uses the NRRIDG technique if I don’t specify one. I don’t know if it matters, but I am using the “general” distribution rather than a built-in distribution (i.e., using a user-defined likelihood function).
I have found that, for a dataset of 500,000 obs or so, the NRRIDG technique was faster than the other techniques. But once the dataset exceeded 1,000,000 obs, the QUANEW technique became significantly faster.