Last week I was lucky enough to attend a presentation on Apache Mahout by Grant Ingersoll to the RTP Semantic Web Group. Mahout is a set of technologies designed to tackle machine learning problems in various clever ways, one of which is by using Apache Hadoop (hence the name: Mahout apparently means “elephant driver”).
Grant did a great job explaining the use cases for Mahout, describing how Google News, Netflix recommendations, Amazon suggestions, and virtually every other feature of advanced web applications have come to rely on machine learning.
While HTML5 has been widely touted as “the future of the web”, I think it’s important to remember that the content has to come from somewhere, no matter how beautifully rendered, and that increasingly the content will be the result of inferences created by software like Mahout. This is yet more incentive for me to continue plowing through Algorithms of the Intelligent Web.
Grant also spoke briefly about his new book Taming Text, which is all about extracting inferences from blobs of text. That strikes me as a highly useful set of techniques to have on hand, given the amount of data-munging that typical software development jobs inevitably require.
Thanks to Phil Rhodes for putting together this excellent presentation.