Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

How to reduce number of levels in an input variable in SAS EM

New Contributor
Posts: 3

Hey

 

I have a dataset where some of the input variables have a lot of levels, e.g., School_city with 513 levels and School_state with 49 levels. How can I reduce the number of levels in an input variable, or otherwise group levels together, in SAS Enterprise Miner?

 

I'm kind of new to SAS EM, so I need some help figuring this out.

 

Thanks.

 

/LB

Super User
Posts: 19,817

Re: How to reduce number of levels in an input variable in SAS EM

Usually I'd say use some sort of binning, but you have nominal variables, so there's no natural order to bin on.

Ideally you group them in some logical manner, cities into states for example, though that becomes redundant here since you already have School_state. Or use some sort of spatial relationship, such as regions, especially if you want to be able to interpret the results afterwards.
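To make the "logical grouping" idea concrete, here is a minimal sketch in Python rather than SAS EM syntax. The lookup table, the region names, and the helper `to_region` are all hypothetical and cover only a handful of states for illustration; a real mapping would need every level in School_state.

```python
# Hypothetical lookup: state code -> coarser region (illustrative subset only).
STATE_TO_REGION = {
    "CA": "West", "WA": "West",
    "NY": "Northeast", "MA": "Northeast",
    "TX": "South", "FL": "South",
    "IL": "Midwest", "OH": "Midwest",
}


def to_region(state, default="Other"):
    """Return the region for a state code, or `default` if it is unmapped."""
    return STATE_TO_REGION.get(state.strip().upper(), default)


# Example: unmapped or unexpected codes fall into a single catch-all level.
states = ["CA", "ny", "TX", "ZZ"]
regions = [to_region(s) for s in states]
print(regions)
```

The catch-all default keeps unmapped codes from creating extra levels, at the cost of lumping unrelated states together.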

 

If you're looking for rules to create these groups from the data, it becomes a data mining problem in itself. Decision trees and clustering are two ways to do it.
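As a rough illustration of data-driven grouping, the sketch below pools rare levels and then buckets the remaining levels by their mean target value. It is a simpler stand-in for what a decision tree would do on a single nominal input, not a decision tree and not a SAS EM node. The function name, the `RARE`/`G1`/`G2` labels, and the `n_groups` and `min_count` parameters are assumptions made for the example, and it assumes a numeric (e.g. 0/1) target.

```python
from collections import defaultdict


def group_levels_by_target_rate(levels, targets, n_groups=3, min_count=2):
    """Map each level to a group label based on its mean target value.

    Levels seen fewer than `min_count` times are pooled into "RARE".
    The remaining levels are sorted by mean target and split into
    `n_groups` buckets of roughly equal size (counted in levels).
    """
    counts = defaultdict(int)
    sums = defaultdict(float)
    for level, y in zip(levels, targets):
        counts[level] += 1
        sums[level] += y

    # Pool sparse levels first: their target rates are too noisy to rank.
    mapping = {lv: "RARE" for lv in counts if counts[lv] < min_count}

    # Rank the remaining levels by mean target (ties broken by name).
    kept = sorted(
        (lv for lv in counts if counts[lv] >= min_count),
        key=lambda lv: (sums[lv] / counts[lv], lv),
    )
    for rank, lv in enumerate(kept):
        mapping[lv] = f"G{rank * n_groups // len(kept) + 1}"
    return mapping


# Toy data: five city levels, binary target.
cities = ["A", "A", "B", "B", "C", "C", "D", "D", "E"]
target = [1, 1, 0, 0, 1, 0, 0, 0, 1]
mapping = group_levels_by_target_rate(cities, target, n_groups=2)
grouped = [mapping[c] for c in cities]
print(mapping)
```

Because the groups are derived from the target, they should be built on training data only and then applied unchanged to validation and scoring data.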

 

 

New Contributor
Posts: 3

Re: How to reduce number of levels in an input variable in SAS EM

Thanks for the input, Reeza

 

I tried using a decision tree to consolidate the levels. It worked when I ran it on each variable separately, and it grouped the levels. But I can't figure out how to get the output of the 5 separate trees back into the data.

Each of the separate decision trees outputs its new derived variable under the same name, _NODE_. Can I rename these variables so they won't all have the same name?
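The rename-and-merge step being asked about can be sketched generically as follows. This is Python, not SAS EM, and it rests on assumptions: each tree's output is a list of records in the same row order as the original data, each record carries a `_NODE_` field, and the `_grp` suffix and the function `merge_node_columns` are hypothetical names chosen for the example.

```python
def merge_node_columns(rows, tree_outputs, node_col="_NODE_"):
    """Attach each tree's node assignment to the rows under a distinct name.

    rows:         list of dicts holding the original data.
    tree_outputs: dict mapping an input variable name to that tree's
                  scored output (list of dicts, same order as `rows`).
    """
    merged = [dict(r) for r in rows]  # copy so the input is not modified
    for var, scored in tree_outputs.items():
        if len(scored) != len(merged):
            raise ValueError(f"row count mismatch for tree on {var!r}")
        new_name = f"{var}_grp"  # e.g. School_city -> School_city_grp
        for row, s in zip(merged, scored):
            row[new_name] = s[node_col]
    return merged


# Toy example with two of the consolidated inputs.
data = [
    {"School_city": "Austin", "School_state": "TX"},
    {"School_city": "Boston", "School_state": "MA"},
]
outputs = {
    "School_city": [{"_NODE_": 3}, {"_NODE_": 4}],
    "School_state": [{"_NODE_": 2}, {"_NODE_": 2}],
}
merged = merge_node_columns(data, outputs)
print(merged[0])
```

Giving each derived variable a name tied to its source input is what keeps the five results from overwriting one another when they are combined.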

 

 
