Hello. I am trying to build a multinomial logistic model using LINK=GLOGIT to predict customer membership in 6 groups. The customers were assigned to the 6 groups based on survey responses about their spend in the industry overall and with one particular vendor in the industry.
I'm using behavioral data (retail purchase data) for these surveyed customers to try to develop a model that predicts membership in one of the 6 groups. Every solution I've found places nearly all modeled customers into the largest segment (assignment is based on max score) and places no customers at all into some of the smaller segments. Basically, 93% or more of the customers in each segment are scored into the largest segment rather than the segment they actually fall in. The largest segment is only 43% of all customers.
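To make the max-score assignment concrete, here is a minimal Python sketch of the rule I'm describing. It is not my SAS code, and the probabilities are made-up illustrative values rather than output from my model; the function names are hypothetical.

```python
from collections import Counter


def assign_by_max_score(probabilities):
    """Return the index of the segment with the highest predicted probability."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k])


def share_assigned_to(segment, rows):
    """Fraction of scored customers whose max-score segment is `segment`."""
    counts = Counter(assign_by_max_score(p) for p in rows)
    return counts[segment] / len(rows)


# Hypothetical predicted probabilities for 6 segments (segment 0 is the largest).
# Each row is one customer; these numbers are invented for illustration only.
scored = [
    [0.40, 0.20, 0.15, 0.10, 0.10, 0.05],
    [0.35, 0.30, 0.15, 0.10, 0.05, 0.05],
    [0.30, 0.25, 0.20, 0.10, 0.10, 0.05],
    [0.25, 0.35, 0.15, 0.10, 0.10, 0.05],
]

assigned = [assign_by_max_score(p) for p in scored]
print(assigned)
print(share_assigned_to(0, scored))
```

Even when the largest segment's probability is well below 50%, it can still be the maximum for most customers, which is the pattern I'm seeing.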
Any tips on how to alleviate this situation and reduce the very high misclassification rate? For example, different PROC options, or different data preparation and variable steps?