Hello,

I am a grad student and will be starting my master's thesis on data mining in the next 3-4 months. Where should I start? I am taking a course on data mining next semester, but I want to use this summer to get going.

So, can you guys suggest some areas of interest in data mining, starting pointers, etc.?

Thanks

UPDATE 1

One of my professors has recommended this book: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data by Bing Liu.

So how is that one for basic concepts?

asked Jul 03 '10 at 03:26

zengr

edited Dec 03 '10 at 07:13

Alexandre Passos ♦


6 Answers:

There are a number of good resources:

1- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. This book is also available online on the authors' website.

2- Pattern Recognition and Machine Learning. There are Google videos on statistical data mining that use this book. I highly recommend you watch them on YouTube.

3- Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. This book uses Weka, which may come in very handy.

answered Jul 03 '10 at 05:32

Mark Alen

edited Jul 09 '10 at 01:40

The third book is probably the easiest to get started with, although the second one will probably be more useful to you over the length of your course. (I haven't gone through the first myself yet.)

(Jul 03 '10 at 13:08) Aditya Mukherji

I think I will start with Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. By the way, the third edition is out now, so I should go with that one, right?

Thanks for your reply guys!

(Jul 03 '10 at 13:37) zengr

To the books listed by Mark, I would add Data Mining: Concepts and Techniques.

answered Jul 03 '10 at 05:45

Alex Ott

I would definitely check out Introduction to Machine Learning by Ethem Alpaydin. It is the book that we used in our machine learning class, and it provides a good introduction to a lot of ML techniques.

It combines statistics and algorithms pretty nicely.

The book alone is not sufficient to make you an expert in most of the topics it covers, but it is good enough to get you started.

answered Jul 03 '10 at 12:54

levesque

edited Jul 03 '10 at 12:54

Many answers here seem to confuse machine learning and data mining. While similar, these are different fields, with different focuses and different objectives.

The Bishop and Hastie et al. books are very good machine learning texts, and Alpaydin is also OK at a more introductory level, but I would NOT recommend them for getting into data mining.

answered Sep 05 '10 at 08:59

yoavg

“Introduction to Data Mining” by Tan, Steinbach and Kumar is also pretty good. It is similar to the Weka book but slightly more in-depth.

answered Aug 08 '10 at 06:42

mat kelcey

+1: I had never seen a comparison between IDM and the Weka book.

(Sep 05 '10 at 08:42) Lucian Sasu

I second this answer: I recently bought the book, and it's really well written.

(Dec 05 '10 at 02:35) Lucian Sasu

And if you want some software to help you get started on practical work with a minimum of coding, you could take a look at Weka:

http://www.cs.waikato.ac.nz/ml/weka/

answered Jul 12 '10 at 10:18

image_doctor

Not sure why this answer got a -1. Weka is an invaluable tool for data mining. I agree that to understand ML algorithms, one needs to read some more theoretical works and implement some of them. But Weka gives you a quick way of comparing the performance of numerous algorithms on various data sets.

(Dec 03 '10 at 10:27) Denzel

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.