b_smsha
Obsidian | Level 7

Hi Everyone,

 

I have a question about my project. I have an Excel file with 23 variables and around 9k records of crime and victim information. I am following the book "Predictive Modelling with SAS Enterprise Miner", and it mentions that we need to carry out certain preprocessing steps before actually importing the data into SAS Enterprise Miner.

I already know what I want to change, but how far am I supposed to go with these steps within Excel itself? How much of the data should be changed there, and which further steps can SAS Enterprise Miner help me with?

 

PaigeMiller
Diamond | Level 26

I believe the general idea is to do most of the pre-processing in SAS Enterprise Miner, and not in Excel, as the tools available in EM are far superior.

 

If you look at this web page, you will see a section entitled "Data preparation, summarization and exploration". Those are the likely tools you would use (you don't have to use all of them, only the ones that make sense for your data).

--
Paige Miller
b_smsha
Obsidian | Level 7

Yes, I agree with that. It's just that the book I'm referring to mentions I need to carry out some form of preprocessing before importing into Enterprise Miner, and I was wondering what that could be.

For example, my gender data has values such as M, F, unknown, Other, and blanks.

Moreover, certain columns have values with inconsistent capitalization, so would I need to fix this before importing, or can this be done within Enterprise Miner through the mentioned nodes?

PaigeMiller
Diamond | Level 26

E-Miner has data cleaning routines. You can also write a "code node" in E-Miner with simple SAS code to do specific cleaning. Or if you want, you can certainly do cleaning before importing. It's entirely up to you, whatever works.
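
As a rough illustration of the code-node route, here is a minimal sketch of the kind of cleaning described in this thread. It assumes the code runs in a SAS Code node, where the macro variables `&EM_IMPORT_DATA` and `&EM_EXPORT_TRAIN` refer to the incoming and outgoing training tables. The variable names `gender`, `gender_clean`, and `crime_type` are hypothetical placeholders (the actual column names are not given in this thread), both source columns are assumed to be character variables, and the choice to map blanks to "Unknown" is an assumption you may not want.

```sas
/* Sketch only: gender, gender_clean and crime_type are hypothetical names. */
data &EM_EXPORT_TRAIN;
    set &EM_IMPORT_DATA;

    length gender_clean $7;

    /* Trim whitespace and upper-case first, so that 'm', ' M ' and 'Male' */
    /* all compare the same way.                                           */
    select (upcase(strip(gender)));
        when ('M', 'MALE')   gender_clean = 'M';
        when ('F', 'FEMALE') gender_clean = 'F';
        when ('OTHER')       gender_clean = 'Other';
        /* Blanks, 'unknown' and anything unexpected fall through here */
        otherwise            gender_clean = 'Unknown';
    end;

    /* Normalize inconsistent capitalization in a free-text column */
    crime_type = propcase(strip(crime_type));
run;
```

The same DATA step logic would also work in Base SAS before importing, with ordinary dataset names in place of the two macro variables. Keeping the original `gender` column alongside `gender_clean` makes it easy to check the mapping with a quick cross-tabulation.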

--
Paige Miller
