b_smsha
Obsidian | Level 7

Hi Everyone,

 

I have a question about data preparation. I have an Excel file with 23 variables and around 9,000 records of crime and victim information for my project. I am following the book "Predictive Modelling with SAS Enterprise Miner", and it mentions that certain preprocessing steps are needed before actually importing the data into SAS Enterprise Miner.

 

I already know what I want to change, but how far am I supposed to go with these steps within Excel itself? How much of the data should be changed there, and which further steps can SAS Enterprise Miner help me with?

 

PaigeMiller
Diamond | Level 26

I believe the general idea is to do most of the pre-processing in SAS Enterprise Miner, and not in Excel, as the tools available in EM are far superior.

 

If you look at this web page, you will see a section entitled "Data preparation, summarization and exploration"; those are the likely tools you would use (you don't have to use all of them, only the ones that make sense for your data).

--
Paige Miller
b_smsha
Obsidian | Level 7

Yes, I agree with that. It's just that the book I'm referring to mentions that I need to carry out some form of preprocessing before importing to Enterprise Miner, and I was wondering what that could be.

 

For example, my gender data has values such as M, F, unknown, blanks, and Other.

Moreover, certain columns have values with different variations of capitalization, so would I need to fix this before importing, or can this be done within Enterprise Miner through the mentioned nodes?

PaigeMiller
Diamond | Level 26

E-Miner has data cleaning routines. You can also write a "code node" in E-Miner with simple SAS code to do specific cleaning. Or if you want, you can certainly do cleaning before importing. It's entirely up to you, whatever works.

--
Paige Miller
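
For the "clean before importing" route, here is a minimal sketch in Python of the two fixes raised in the question: collapsing gender codes into a fixed set of levels and removing capitalization variants. It is only an illustration, not something from the book or from Enterprise Miner. The names `GENDER_MAP`, `clean_gender`, and `normalize_text` are hypothetical, and the choices to treat blanks as "Unknown" and unrecognized codes as "Other" are assumptions; you may prefer to leave blanks missing so that a later imputation step can handle them.

```python
from typing import Optional

# Hypothetical mapping from raw spellings to the levels named in the question.
# Extend it with whatever variants actually occur in your column.
GENDER_MAP = {
    "m": "M",
    "male": "M",
    "f": "F",
    "female": "F",
    "other": "Other",
    "unknown": "Unknown",
}


def clean_gender(raw: Optional[str]) -> str:
    """Map a raw gender cell to one of: M, F, Other, Unknown."""
    if raw is None:
        return "Unknown"  # assumption: a missing cell is treated as Unknown
    key = raw.strip().lower()
    if key == "":
        return "Unknown"  # assumption: a blank cell is treated as Unknown
    # Assumption: any spelling not in the mapping is grouped under Other.
    return GENDER_MAP.get(key, "Other")


def normalize_text(raw: str) -> str:
    """Collapse repeated whitespace and apply one consistent capitalization."""
    return " ".join(raw.split()).title()


if __name__ == "__main__":
    for value in [" m ", "F", "", "OTHER", None]:
        print(repr(value), "->", clean_gender(value))
    print(normalize_text("  bURGLARY   theft "))
```

The same logic can be written as a few lines of SAS in a code node if you would rather keep everything inside Enterprise Miner; the point is only that each column ends up with one spelling per level before modeling.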
