Answer on @Quora by @xamat to How do I learn machine learning?

http://qr.ae/79GFW0

I didn’t do a PhD on machine learning (was mostly focused on Signal Processing and Software Engineering) so I get this question a lot. The typical person that asks me this question is a software engineer with a computer science background, so I will address it from that perspective. If you are a Math major, for example, my answer might be less useful.

The first thing I tell someone who wants to get into machine learning is to take Andrew Ng’s online course. I think Ng’s course is very much to-the-point and very well organized, so it is a great introduction for someone wanting to get into ML. I am surprised when people tell me the course is “too basic” or “too superficial”. If they tell me that I ask them to explain the difference between Logistic Regression and Linear Kernel SVMs, PCA vs. Matrix Factorization, regularization, or gradient descent. I have interviewed candidates who claimed years of ML experience that did not know the answer to these questions. They are all clearly explained in Ng’s course. There are other online courses you can take after this one such as Mining Massive Datasets or Recommender Systems, but at this point you are mostly ready to go to the next step.

My recommended next step is the following. Get a good ML book (my list below), read the first intro chapters, and then jump to whatever chapter includes an algorithm you are interested. Once you have found that algo, dive into it, understand all the details, and, especially, implement it. In the previous online course you would already have implemented some algorithms in Octave. But, here I am talking about implementing an algorithm from scratch in a “real” programming language. You can still start with an easy one such as L2-regularized Logistic Regression, or k-means, but you should also push yourself to implement more interesting ones such as LDA (Latent Dirichlet Allocation) or SVMs. You can use a reference implementation in one of the many existing libraries to make sure you are getting comparable results, but ideally you don’t want to look at the code but actually force yourself to implement it directly from the mathematical formulation in the book.

So, what are some good books to do this? Many have been mentioned before. Some of my favorite:

You can also go directly to a research paper that introduces an algorithm or approach you are interested on and dive into it.

My main point is that machine learning is both about breadth as depth. You are expected to know the basics of the most important algorithms (see my answer to What are the top 10 data mining or machine learning algorithms?). On the other hand, you are also expected to understand low-level complicated details of algorithms and their implementation details. I think the approach I am describing addresses both these dimensions and I have seen it work.

Advertisements

One Response

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: