Hi there
I was wondering how SAS EM handles categorical variables. I am used to Python and one-hot encoding.
For instance, if my variable COUNTRY has Germany, France and Spain in it, does it create 2 columns of 0s and 1s (not 3, to avoid the dummy variable trap)? The reason I ask is that there is a Dummy Indicator option in the Transform Variables node, so it seems that this is not done by default by SAS EM. Many thanks
There are multiple ways to specify a categorical variable, and you can include your own, so the Dummy Indicator option is a way to include your own dummy variables.
SAS will create dummy variables behind the scenes, but they won't be in your dataset. Note that there are several ways to parameterize dummy variables, so make sure it's using the method you expect, e.g. reference vs. GLM parameterization.
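Since the question comes from a Python background, here is a rough plain-Python sketch of the difference between the two parameterizations. It is only an illustration, not what SAS EM runs internally: the function names and the `COUNTRY_` column prefix are made up for the example, and picking the last sorted level as the reference is an assumption about the usual default.

```python
def glm_coding(values, prefix="COUNTRY"):
    """GLM-style coding: one 0/1 column per level (k columns for k levels)."""
    levels = sorted(set(values))
    columns = [f"{prefix}_{lvl}" for lvl in levels]
    rows = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return columns, rows


def reference_coding(values, reference=None, prefix="COUNTRY"):
    """Reference coding: k-1 columns; the reference level is all zeros."""
    levels = sorted(set(values))
    # Assumption for this sketch: default reference = last level in sort order.
    ref = levels[-1] if reference is None else reference
    if ref not in levels:
        raise ValueError(f"reference level {ref!r} not found in data")
    kept = [lvl for lvl in levels if lvl != ref]
    columns = [f"{prefix}_{lvl}" for lvl in kept]
    rows = [[1 if v == lvl else 0 for lvl in kept] for v in values]
    return columns, rows


country = ["Germany", "France", "Spain"]

glm_cols, glm_rows = glm_coding(country)
ref_cols, ref_rows = reference_coding(country)

print(glm_cols, glm_rows)  # 3 columns, one per country
print(ref_cols, ref_rows)  # 2 columns, Spain is the all-zero reference row
```

With three countries, the GLM-style version gives three indicator columns, while the reference version gives two and represents the reference level as a row of zeros, which is the "2 columns, not 3" case from the question.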
Hi Reeza
Thanks for your reply. If I understand correctly, SAS EM will automatically create dummy indicators to handle categorical variables.
If so, why use the Transform Variables node to create dummy indicators? Isn't it redundant? Thanks
Nicolas
@NicolasC wrote:
Isn't it redundant? Thanks
Nicolas
Different procedures likely require different structures. Some may want these separated out. And then you can regroup into different categories if desired.
There are many ways to do the same thing in SAS... many, many, so yes, it may be redundant, but that's common in programming languages and data analysis tools 🙂
Reeza, you indicated, "SAS will create dummy variables behind the scene but it won't be in your dataset."
Is there a reference to a SAS EM manual that confirms this and the 'behind the scene' method?
I have looked through a number of documents and can't find this information. I would like to be able to evaluate when I need to create dummy variables and when I can just let the default SAS EM method do it.
We are working on HP SVM nodes currently.
Rod.
