I didn't do a PhD in machine learning (I was mostly focused on Signal
Processing and Software Engineering) so I get this question a lot. The
typical person that asks me this question is a software engineer with a
computer science background, so I will address it from that perspective.
If you are a Math major, for example, my answer might be less useful.
Take an online course
The first thing I tell someone who wants to get into machine learning is to take Andrew Ng's online course.
I think Ng's course is very much to-the-point and very well organized,
so it is a great introduction for someone wanting to get into ML. I am
surprised when people tell me the course is "too basic" or "too
superficial". When they do, I ask them to explain the difference
between Logistic Regression and Linear Kernel SVMs, or between PCA and Matrix
Factorization, or to explain regularization or gradient descent. I have interviewed
candidates who claimed years of ML experience but did not know the
answers to these questions. They are all clearly explained in Ng's
course. There are many other online courses you can take after
this one (see my answer to What is the best MOOC to get started in Machine Learning?) but at this point you are mostly ready to go to the next step.
Implement an algorithm
My recommended next step is the following. Get a good ML book (my list
below), read the first intro chapters, and then jump to whatever chapter
includes an algorithm you are interested in. Once you have found that
algo, dive into it, understand all the details, and, especially,
implement it. In the previous online course you would already have
implemented some algorithms in Octave. But here I am talking about
implementing an algorithm from scratch in a "real" programming language.
You can still start with an easy one such as L2-regularized Logistic
Regression, or k-means, but you should also push yourself to implement
more interesting ones such as LDA (Latent Dirichlet Allocation) or SVMs.
You can use a reference implementation in one of the many existing
libraries to make sure you are getting comparable results, but ideally
you don't want to look at the code but actually force yourself to
implement it directly from the mathematical formulation in the book.
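To give a rough idea of what "from scratch" can look like for the easy end of that range, here is a minimal sketch of L2-regularized Logistic Regression trained with batch gradient descent, using only the Python standard library. The function names, the toy data, and the hyperparameter values (lam, lr, n_iters) are assumptions made up for this illustration and are not taken from any of the books below. The objective it minimizes is the mean log-loss plus (lam / 2) * ||w||^2.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_l2(X, y, lam=0.1, lr=0.1, n_iters=1000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    X is a list of feature lists, y is a list of 0/1 labels.
    The bias b is left unregularized (an assumption; a common convention).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear score is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Data gradient averaged over samples, plus the L2 penalty gradient.
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return hard 0/1 predictions using a 0.5 probability threshold."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


def make_toy_data(n_per_class=50, seed=0):
    """Two Gaussian blobs in 2D, centered at (-2, -2) and (2, 2)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    w, b = fit_logistic_l2(X, y)
    acc = sum(p == t for p, t in zip(predict(X, w, b), y)) / len(y)
    print("weights:", w, "bias:", b, "train accuracy:", acc)
```

Once something like this runs, the comparison against a library implementation mentioned above becomes a concrete exercise: fit both on the same data and check that the learned weights and predictions are close, keeping in mind that libraries often parameterize the regularization strength differently.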
Some book recommendations
So, what are some good books to do this? Many have been mentioned before. Some of my favorites (see my answer to What are the best books about machine learning? for more details):
- Kevin Murphy's Machine Learning: A Probabilistic Perspective
- Hastie, Tibshirani, and Friedman's The Elements of Statistical Learning
- Bishop's Pattern Recognition and Machine Learning
- David Barber's Bayesian Reasoning and Machine Learning
- Larry Wasserman's All of Statistics: A Concise Course in Statistical Inference (more details on this book in my edit below)
You can also go directly to a research paper that introduces an algorithm or approach you are interested in and dive into it.
My main point is that machine learning is about both breadth and depth. You
are expected to know the basics of the most important algorithms (see
my answer to What are the top 10 data mining or machine learning algorithms?).
On the other hand, you are also expected to understand the complicated
low-level details of algorithms and their implementations. I
think the approach I am describing addresses both these dimensions and I
have seen it work.
Ready for a career in Machine Learning?
The next logical step some people ask about is whether they should now be
ready to start a career in machine learning. That is, of course, a
different question. Please refer to my answer to How should one start a career in machine learning? for that.
Edit 08/26/2015
In response to some comments and questions, I feel that I should add
another book recommendation. If you feel like you lack some background
in Statistics, I would totally recommend:
- Larry Wasserman's All of Statistics: A Concise Course in Statistical Inference