Hello,
I would add the following:
Include a weight decay (L2 regularization) value and tune it on your validation data.
You mentioned ASE (average squared error). If your target is interval, consider changing the error function and output activation function to match your target's assumed distribution.
I can typically achieve performance improvements in my neural networks when I divide and conquer the input space. You can do this in several ways. The easiest for me is to build several networks, give each network a different subset of the inputs, and average the networks' predictions. I hope this helps.
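To make the weight decay and input-subset suggestions concrete, here is a minimal sketch in Python with NumPy. It is not tied to any particular modeling tool. The tiny network, the synthetic data, the column subsets, and the names `train_mlp`, `train_subset_ensemble`, and `predict_ensemble` are all assumptions made up for illustration. Each network sees only some of the input columns, the predictions are averaged, and the weight decay value is picked by validation ASE.

```python
import numpy as np

def train_mlp(X, y, hidden=8, weight_decay=1e-3, lr=0.1, epochs=1000, seed=0):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent.

    Minimizes 0.5 * mean squared error
    plus 0.5 * weight_decay * (sum of squared weights), i.e. an L2 penalty.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden activations
        err = H @ W2 + b2 - y                    # prediction error per row
        dH = np.outer(err, W2) * (1.0 - H ** 2)  # backprop through tanh
        gW2 = H.T @ err / n + weight_decay * W2  # L2 penalty on weights only
        gW1 = X.T @ dH / n + weight_decay * W1
        W2 -= lr * gW2
        b2 -= lr * err.mean()
        W1 -= lr * gW1
        b1 -= lr * dH.mean(axis=0)
    return lambda X_new: np.tanh(X_new @ W1 + b1) @ W2 + b2

def train_subset_ensemble(X, y, subsets, **kwargs):
    """Train one network per input subset; return (columns, network) pairs."""
    return [(cols, train_mlp(X[:, cols], y, seed=i, **kwargs))
            for i, cols in enumerate(subsets)]

def predict_ensemble(members, X):
    """Average the member predictions, each using only its own columns."""
    return np.mean([net(X[:, cols]) for cols, net in members], axis=0)

# Synthetic data (an assumption for this sketch): 4 inputs, interval target.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = X[:, 0] + 0.5 * X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=300)
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

# Hypothetical input subsets, one per network.
subsets = [[0, 1], [2, 3], [0, 2], [1, 3]]

# Tune weight decay on the validation data using ASE.
results = {}
for wd in (0.0, 1e-3, 1e-2, 1e-1):
    members = train_subset_ensemble(X_train, y_train, subsets, weight_decay=wd)
    pred = predict_ensemble(members, X_val)
    results[wd] = float(np.mean((pred - y_val) ** 2))

best_wd = min(results, key=results.get)
print("validation ASE by weight decay:", results)
print("selected weight decay:", best_wd)
```

In practice you would swap `train_mlp` for the network your own tool builds. The parts that carry over are the per-network input subsets, the averaging step, and choosing the penalty on validation data rather than training data.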
Best,
Robert