Statistics required for Data Analytics / Data mini...

02-07-2011 07:18 AM

Hey guys,

I am basically a functional / business guy. I have access to the best tools, like Enterprise Miner, Enterprise Guide, etc.

My objective is to become a pro in Data Mining / Marketing Analytics / CRM Analytics.

I know my business problems and where the concepts can be applied.

I am looking for guidance on statistics.

How should I go about learning statistical concepts? Are there any books you can recommend?

Rgds

Accepted Solutions

02-08-2011 12:43 AM

Hi.

Say, you seem like a pretty bright guy... so why would you want to become a Data Mining / Marketing Analytics / CRM Analytics pro?

There may be a path to instant enlightenment, but I have 95% confidence (that's a little statistics joke) that it takes extensive effort to acquire the necessary foundational education for an analytics practitioner. But you haven't told us where you're starting from. Are you a total tyro? Have you had any basic probability and statistics? How much math have you studied?

If you have little or no math / stat background, and feel very weak on fundamentals, you might want to check out Khan Academy (http://www.khanacademy.org/). They have instructional videos and a self-testing regimen designed to take you from ABCs up through introductory college-level classes in many subjects, including math and statistics. If you can start right in at college level, you can get a nice introduction to applied statistics through MIT OpenCourseWare (http://ocw.mit.edu/index.htm). There are loads of other tutorials, books, etc., available, but the point is, you need at least the equivalent of a year-long introductory college statistics class. And that's just the start.

You need a bit of math to get to the next level. The equivalent of a one-semester linear algebra class is essential (available at both the Khan and MIT sites mentioned above, or from other sources). Technically, you could live without calculus, but a basic understanding of calculus really helps you to formulate and understand a wide range of concepts in statistics and data mining.

Then you'll want the kind of knowledge you'd get in a one-semester course with a title like "Regression and Multivariate Data Analysis." For example, at NYU they teach "data analysis and management, multiple linear and nonlinear regression, selection of variables, residual analysis, model building, autoregression, and multicollinearity. Topics in multivariate data analysis include principal components, analysis of variance, categorical data analysis, factor analysis, cluster analysis, discriminant analysis, and logistic regression." You'll also want some exposure to Time Series Forecasting and Machine Learning (e.g., decision trees, neural networks, genetic algorithms).

As for book recommendations, here's a piece of advice one of my professors gave me in graduate school: if you get just about any three books on a subject, they each seem much better than any of them would alone, because they reinforce each other. But if you're going to be using SAS products like Enterprise Guide and Enterprise Miner, you might want to get some books specifically geared towards those products. The Little SAS Book for Enterprise Guide by Slaughter and Delwiche is kind of a classic, but it's aimed mainly at the mechanics of using EG, not statistics. There is a book by Davis called Statistics Using Enterprise Guide, but I've never used it, so I can't vouch for the quality.

I've just been describing a bottom-up approach, which I think is crucial for really learning the subject, but you might want to try a top-down approach in parallel. There are some easy-to-read books about analytics that give you a feel for the methodologies and how to approach problems without miring you in technical details (though you won't truly learn the subject from them). Two such books I like a lot are Rud's Data Mining Cookbook (which has a lot of examples using "vanilla" SAS) and Berry and Linoff's Data Mining Techniques.

I hope this is helpful. Good luck!

All Replies

02-11-2011 08:44 PM

I ditto TOPKATZ. I also recommend the online text **The Elements of Statistical Learning: Data Mining, Inference, and Prediction**.

http://www-stat.stanford.edu/~tibs/ElemStatLearn/

If you have, as TOPKATZ mentions, some basic understanding of statistics including regression, as well as some knowledge of calculus and linear algebra, I would also recommend Dr. Andrew Ng's (Stanford) online Machine Learning lectures:

http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599

I think Leo Breiman's **Statistical Modeling: The Two Cultures** offers necessary perspective for someone traditionally trained in probability and statistics, but unfamiliar with the algorithmic or machine learning approach to data analysis:

http://www.stat.osu.edu/~bli/dmsl/papers/Breiman.pdf
