Thorsten Joachims

Automated Innovation

by Toby Butterfield, student writer

Thorsten Joachims is a man dedicated to making your toaster smarter than you. That’s one way of characterizing his research, anyway. A less facetious way is to say that he’s interested in machine learning. Not in the teach-a-robot-to-love-a-puppy way, but in the engineering way. If software is capable of learning, then it can adapt to its situation in the same way any living thing would. Your applications could actually get better as you use them: personalized searches or spam filters, tailored to your specifications. This is only the beginning of Joachims’ research.

Much of Joachims’ work revolves around improving search engines. Google, the mold in which all modern search engines are cast, uses the interlinking of the Web to determine the relevancy of pages. This algorithm, known as PageRank, essentially grades pages based on their popularity with other pages that are themselves highly graded. The circular nature of the algorithm has a kind of insane brilliance and works wonderfully. But all good things must come to an end.

“One problem that’s at least rumored is that the link structure of the Web is actually degrading. In a nice, organic link structure, people write pages and the hyperlinks they put in can be interpreted as a recommendation. This may be becoming less frequent: people write pages differently, and spam links to cheat PageRank.”
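
For readers who want to see the idea in motion, here is a minimal sketch of link-based ranking in its simplified textbook form. It is not Google's production system. The page names, the tiny example web, the damping value of 0.85, and the function name `pagerank` are all assumptions made for illustration. The sketch scores a small web twice: once as written, and once after three made-up spam pages link to an otherwise ignored page.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its links.

    `links` maps a page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link acts as a recommendation: split this page's
                # score evenly among the pages it points to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A hypothetical four-page web; nobody links to "d".
organic = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

# The same web after three hypothetical spam pages link only to "d".
spammed = dict(organic, s1=["d"], s2=["d"], s3=["d"])

before = pagerank(organic)
after = pagerank(spammed)

print("score of d before spam:", round(before["d"], 4))
print("score of d after spam: ", round(after["d"], 4))
```

In this toy web, page "d" scores higher once the spam pages point at it, even though nothing about the page itself has changed. That is the kind of cheating the quote describes.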

As time goes on, we’re going to need new ways to rank the pages on the Web, a task that is really the heart of every search engine. On closer inspection, there are some other problems with PageRank as well. Only a very small percentage of Internet users ever make Web pages, and thus the relevancy of pages is determined, in effect, by a small minority. “For example, my Mom. Never written a Web page, probably never will. But she’s on Google every day.”

Joachims is testing a way to rank pages that is simultaneously more democratic than PageRank and independent of the link structure of the pages.