When I build a market response model using either logistic regression or decision trees, I often include nominal variables that have many levels. Typical examples are the 66 PRIZM segments or the more than 800 zip3 region segments. I understand that SAS, by default, sets the threshold for a class variable at 20 levels, which means that any class variable with more than 20 levels is excluded from further processing. I therefore usually raise this threshold to 80 so that the PRIZM codes are included. The problem is that a model built on variables with this many levels is very hard to explain to business and management. Is there a technique for reducing the number of levels without losing important information?