Swimming In The Sea Of Machine Learning

Artificial Intelligence / Data Science / Machine Learning

There are thousands of machine learning algorithms. How should one decide what to use, especially when starting out in data science? In all truth, there is probably no substitute for experience. However, there are many useful resources out there to get started. Scikit-learn provides an algorithm cheat-sheet, as does dlib. If you have £300 to spare and want a deep dive into the theory, The Encyclopedia of Machine Learning is an option. Then there are good libraries, like Weka, Scikit-learn, PyBrain and Mahout, and lots of repositories on GitHub. A neat exercise could be comparing three classifier models using the same datasets. Not only is it fairly straightforward to implement, it also gives you some understanding of how the models behave. It is also a nice little project to add to your data science portfolio.
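To make the classifier comparison concrete, here is a minimal sketch using Scikit-learn. The choice of models (logistic regression, k-nearest neighbours, a decision tree), the two bundled datasets (iris and wine) and the helper name `compare_classifiers` are my own assumptions for illustration; any three classifiers and any labelled datasets would do.

```python
from sklearn.datasets import load_iris, load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(cv=5):
    """Return mean cross-validated accuracy for each (dataset, model) pair.

    Hypothetical helper: the datasets and models are illustrative choices.
    """
    datasets = {
        "iris": load_iris(return_X_y=True),
        "wine": load_wine(return_X_y=True),
    }
    models = {
        # Scaling helps the distance-based and linear models.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "k_nearest_neighbours": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
        "decision_tree": DecisionTreeClassifier(random_state=0),
    }

    results = {}
    for dataset_name, (X, y) in datasets.items():
        for model_name, model in models.items():
            # Every model is scored on the same data with the same folds.
            scores = cross_val_score(model, X, y, cv=cv)
            results[(dataset_name, model_name)] = scores.mean()
    return results


if __name__ == "__main__":
    for (dataset_name, model_name), accuracy in compare_classifiers().items():
        print(f"{dataset_name:5s} {model_name:22s} {accuracy:.3f}")
```

Using cross-validation rather than a single train/test split keeps the comparison from hinging on one lucky split. Accuracy is only one yardstick, so it is worth swapping in other metrics through the `scoring` argument of `cross_val_score`.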

Machine learning has a rich and long history. I was quite surprised it started in 1642! It is still not perfect and a great deal of improvement is needed. To give you an example: go to Translation Party and enter an English phrase. After a few iterated translations using Google's automated translator (converting into Japanese, then English, then Japanese, etc.) you might reach an equilibrium, but even then the resulting sentence rarely resembles your original. Another interesting quirk of machine learning is discussed by Scott Locklin in his blog post on some neglected machine learning ideas.
