Last quarter I took a NN/ML course outside of my department (I'm currently doing an MS in Applied Statistics, though I may end up in that department for PhD CS studies after completing my current program) because I am interested in the topic overall and had previous experience with genetic algorithms.

In the course we used the Mitchell text. I also used the opportunity to grab a few other texts to dive into some topics in greater depth and get exposure to others entirely: the Hastie, Tibshirani, and Friedman text; Bishop's NNPR text; Koller's Probabilistic Graphical Models text; and Kolaczyk's Statistical Analysis of Network Data text.

Now, based on the table of contents of PRML, I feel these other books cover much if not all of its topics. Would you agree that that is a fair assessment? I know, however, that PRML is a very popular text among those in computer science, and I've seen it mentioned a few times on this site. So, for those of you familiar with its content: do you feel I'd still be missing out on something without this other Bishop text, or can I get by without it?

Many of these texts have a lot of overlapping content, so it is interesting to hear essentially the same thing stated several different (and sometimes not-so-different) ways. But I would also be interested in recommendations of more advanced texts, since essentially all of these textbooks come with the standard "targeted at advanced undergraduates and beginning graduate students" label.

Don't get me wrong, I understand that the cutting edge will be found in journals, and I have been reading through many of them (COLT, AAAI, IEEE, etc.), but there must be at least a couple of textbooks more advanced than the ones already mentioned. In our class the Mitchell text was used to minimize the mathematical exposure, and PRML was suggested for heavier math content. I am, however, interested in texts with even more advanced mathematics still. Potential topics of interest in a more advanced text: more on ensemble methods, asymptotic properties, and stability. Other advanced topics are welcome as well. I know I can find out a lot about the latter two in actual math and stats texts, but I was hoping to see more about them in the context of machine learning.

I have a paper (PDF) I wrote as my final for that class. It is an overview and could use a lot of detail added (it more than satisfied the requirements of the final, though), but I was trying to present the basics to a wider audience, so I avoided going into the formulations and left those details to the references. Any thoughts on, or references for, the questions I put forward at the end would be very much appreciated. Even general feedback on the paper would be nice.

(Sorry for not being concise, but I figured one lengthy post might make more sense in this case than breaking it up into numerous more focused posts -- though I suppose I'm about to find out whether you all agree with that or not ;) )


asked Apr 26 '11 at 04:52

Chris Simokat

The cutting edge is actually easier to find in conferences than in journals. I think Bishop's text covers the more common applications of Bayesian methods (related to graphical models) for ML better than the Koller & Friedman text, which is far heavier on the graphical-model mechanics per se and doesn't dwell on many details of their application to machine learning.

No textbook is complete, and no perspective is complete, so I'm not sure you should be thinking about these textbooks in terms of "missing out on something" or "getting by without". PRML is an interesting resource for its broad overview of almost all of machine learning at the time it was written, but it also doesn't cover large parts of the literature on structured learning, online learning, or learning theory, for example.

(Apr 26 '11 at 05:50) Alexandre Passos ♦

As always, Alexandre, I appreciate reading your contribution. I get what you're saying in general, but part of my interest in reading the texts themselves is the historical aspect of how the techniques were developed in the first place. In statistics, for example, I tend not to be thrilled with books that skip over Guinness when talking about Gosset or introducing the t-distribution, neglect the lady tasting tea when discussing exact testing or Fisher in general, or omit the iris data in multivariate analysis. I suppose that's just personal preference, since I think such anecdotes help people connect with the material, but that is what I meant by missing out: that, and whether there is some excellent proof in one text that is not present in another (sort of like comparing Wackerly to Casella). As for skipping it: between the Hastie book (which is more recent than PRML) and the other Bishop book I own, there didn't seem to be any different topical coverage, at least at a glance. But based on your post and the others, when I do get a chance I will grab a copy for reference's sake, if for nothing else than the works it cites. I guess I just wanted to know whether not having it immediately was a three-alarm fire! ;)

(Apr 26 '11 at 21:00) Chris Simokat

2 Answers:

Hello Chris,

I'll try to answer as concisely and usefully as possible.

First, you'll need the basics of statistics, which I'm guessing you are getting from your master's. There are a number of books and papers that are useful for this, and most of the statistics you'll ever need is already well developed. Most modern models were described in the '70s and '80s; Blei actually said that if something was not figured out in the '80s, it is a really difficult problem.

Given the list of books you presented, I do recommend Bishop's PRML book; look at it as the updated version of Mitchell's. I do like Mitchell's book; the problem is that it is somewhat old and lacks many of the current and widely used algorithms. Mitchell does not go into variational inference or SVMs, for example.

That said, machine learning is a wide field, and you could delve into a single theme for your entire PhD, and perhaps your entire research life. It is hard to find someone who does research on more than a couple of topics.

You need a base book that lets you survey most of the algorithms you can choose from, and after that you'll need other, more specialized books. For example, if you are into nonparametric models, you might start learning Gibbs sampling, distributions, and mixture models from Bishop, and then you'll have to read Ghosh's "Bayesian Nonparametrics" book.

Pick a topic and then start thinking about the books; otherwise you might end up spending a lot of time reading a book that is not closely related to your research. (That isn't a bad thing at all, but if you are in a PhD program, the last thing you want is to read something that won't be as useful as it should be.)

Some conferences to look into:

  • For theory (really heavy math): NIPS, ICML, COLT
  • For applications (so-so math, cool apps): KDD (this one has a bit of theory as well), ICMLA, IROS, ICRA (the last two are on robotics)

If you wish to learn the latest on genetic algorithms, try GECCO; it is the top conference on the topic.

Hope this helps.


answered Apr 26 '11 at 07:08

Leon Palafox ♦

Leon thanks for the list of conferences. I really appreciate you taking the time to answer.

(Apr 26 '11 at 22:00) Chris Simokat

Leon's answer is pretty good; I'll just add a bit. The problem I have is not collecting reading material but actually getting the reading done. You can find many very good textbooks online these days; there is a post dedicated to them here on MetaOptimize. They will be more than sufficient to fill the gaps between the books you have already collected for the foundational topics. I would advise spending your time learning from the materials you already have and then, as Leon suggested, finding the more specialised texts to pursue your interests in greater depth.


answered Apr 26 '11 at 08:55

Noel Welsh

Noel I appreciate the link to the other post. I will check it out.

(Apr 26 '11 at 22:02) Chris Simokat
