03-23-2017 10:44 AM - edited 03-27-2017 08:43 AM
I am trying to build a model on a dataset with 2 million observations. The response variable is ordinal with 53 levels, ranging from -26 to +26, and is roughly normally distributed.
Both Base SAS 9.4 and Enterprise Miner 13.2 are available for use.
I am looking for suggestions on modeling techniques that could be used, especially in EM.
I was also wondering whether there is a way to use the cumulative logit link function in EM.
Does it make sense to treat the response variable as interval and then use GLM?
03-23-2017 10:55 AM - edited 03-23-2017 11:38 AM
Unless your 53 levels are highly non-linear, I would treat the response as a continuous variable and fit a partial least squares regression model in PROC PLS, which in my opinion is probably the best way to model 900 independent variables. I do not know whether this is available in Enterprise Miner, as I don't use it. I would not use PROC GLM with 900 independent variables.
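As a rough sketch of what that PROC PLS call could look like: the data set name work.train, the response y, the predictors x1-x900, and the output names below are all hypothetical placeholders, and the factor limit and cross-validation split are arbitrary starting values rather than recommendations.

```sas
/* Minimal sketch, not a tuned model.                              */
/* Hypothetical names: work.train, y, x1-x900, work.pls_scored.    */
proc pls data=work.train
         method=pls        /* partial least squares extraction     */
         nfac=20           /* upper limit on extracted factors     */
         cv=split(10);     /* split-sample cross validation to     */
                           /* choose the number of factors         */
   model y = x1-x900;      /* y treated as a continuous response   */
   output out=work.pls_scored predicted=y_hat;
run;
```

With 2 million observations, cross validation may take a while, so it could be worth trying this on a random sample first.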
03-23-2017 11:36 AM
When you use the Regression node in Enterprise Miner with an ordinal target, it does use the cumulative logit link function.
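For reference, the cumulative logit (proportional odds) model can be written as follows in one common parameterization. The symbols here are my own notation, not taken from the Enterprise Miner documentation, and the sign convention on the slope term varies between texts.

```latex
% Y takes the ordered levels -26, ..., +26; x is the vector of predictors.
\[
  \operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}
  = \alpha_j + x^{\top}\beta,
  \qquad j = -26, \dots, 25,
\]
\[
  \alpha_{-26} \le \alpha_{-25} \le \dots \le \alpha_{25}.
\]
```

So with 53 levels there are 52 intercepts but only one slope vector shared by all cumulative logits. That shared slope is the proportional odds assumption, and it is worth checking with this many levels.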