Hello,

I have a dataset for binary prediction with 80K observations and 100 input variables. Methods like gradient boosting fit the data quite well, with a validation Gini of over 70%. When I fit a neural network on all 100 variables, I get a Gini of around 15% (on both training and validation). When I do variable selection and use 25-odd variables in the NN, the validation Gini increases to 30%, which is still materially worse than the other models.

I tried the default NN in EM 6.2 with the following changes (a rough reproduction sketch follows below):
1. Architecture: MLP
2. Number of hidden units: 2, 3, 5, 10, 20
3. Decay: 0, 0.05, 0.1, 0.5, 1, 5, 10, 25, 50. Decay seems to hardly affect model performance.
4. Standardization: standard deviation and range
5. Enough iterations to ensure model convergence. No other changes to the optimization properties.

(I have also tried playing with some other properties such as RBU, activation functions, combination functions, direct connections, etc., without any material change in the model.)

Clearly the model is converging to a local minimum, and the 25-variable model to a slightly better one. Am I missing some basic setting or feature that is leading to such poor NN models?

Thanks,
Nil
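For reference, here is a minimal sketch of the comparison I am describing, written in Python/scikit-learn rather than EM (an assumption on my part; my actual runs were in the EM 6.2 Neural Network node). The synthetic `make_classification` data is just a stand-in for my real table, and `alpha` plays roughly the role of EM's decay setting:

```python
# Illustrative only: synthetic data stands in for the real 80K x 100 table,
# and scikit-learn's MLP approximates the EM 6.2 Neural Network node.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

def gini(y_true, y_score):
    # Gini coefficient from the ROC AUC: Gini = 2 * AUC - 1
    return 2.0 * roc_auc_score(y_true, y_score) - 1.0

# Placeholder for the real data: 80K observations, 100 inputs
X, y = make_classification(n_samples=80_000, n_features=100,
                           n_informative=25, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

# Standardize inputs (analogous to EM's standard-deviation standardization)
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_va_s = scaler.transform(X_tr), scaler.transform(X_va)

# Gradient boosting benchmark
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("GBM valid Gini:", gini(y_va, gbm.predict_proba(X_va)[:, 1]))

# MLP restarted from several random initializations, since a single
# run can stall in a poor local minimum; alpha ~ EM's weight decay
best = -1.0
for seed in range(5):
    nn = MLPClassifier(hidden_layer_sizes=(10,), alpha=0.1,
                       max_iter=500, random_state=seed)
    nn.fit(X_tr_s, y_tr)
    best = max(best, gini(y_va, nn.predict_proba(X_va)[:, 1]))
print("Best NN valid Gini over restarts:", best)
```

If restarts from different random weights close most of the gap in a setup like this, that would point to initialization and local minima rather than a missing node setting.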