Matt Hurst wrote a post at Data Mining summarizing some of the arguments from the future of search conference. Here is a summary of his contribution:
Currently, the main stream search engines are doing a poor job of integrating social media in their results. Conversely, blog search engines take no advantage of the main stream web to analyze the influence and content of blogs…. Main stream web can be used to rank social media content, and social media (blog) content could be used to rank the main stream web.
Again, I’m really surprised that no regular web search engine has yet made use of the real-time efforts of bloggers, Wikipedia editors, social bookmarkers, members of aggregating sites, etc. That said, contrary to the quote above, I can’t see the mainstream web helping to rank blog content. I also imagine that social media results would be most useful as a side offering alongside a mainstream web search (which recent, highly visited blog posts use this phrase or link to these pages?), rather than as an input for ranking the mainstream web results.
After reading around the future of search posts, I found Greg Linden’s blog, which has lots of interesting commentary on personalization and recommendation strategies for search. This is from a recent post explaining a paper on Google News personalization:
MinHash and PLSI are both clustering methods; a user is matched to a cluster of similar users, then they look at the aggregate behavior of users in that cluster to find recommendations. Covisitation is an item-based method that computes which articles people tend to look at if they looked at a given article (i.e. “Customers who visited X also visited…”).
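To make the clustering half of that quote concrete, here is a minimal sketch of MinHash-style user clustering. It is my own toy illustration, not the paper's implementation: the function names, the use of SHA-256 as the hash family, and the choice of three hashes per signature are all assumptions. The underlying fact is that two users' click sets share the same minimum hash with probability equal to their Jaccard similarity, so users with overlapping histories tend to land in the same bucket.

```python
import hashlib
from collections import defaultdict


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed (assumed hash family)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(clicked, num_hashes=3):
    """Return one minimum hash value per seed over the user's clicked items."""
    if not clicked:
        raise ValueError("need at least one clicked item")
    return tuple(
        min(_hash(seed, item) for item in clicked)
        for seed in range(num_hashes)
    )


def cluster_users(histories, num_hashes=3):
    """Group users whose click histories produce the same signature."""
    clusters = defaultdict(list)
    for user, clicked in histories.items():
        clusters[minhash_signature(clicked, num_hashes)].append(user)
    return dict(clusters)


if __name__ == "__main__":
    # Hypothetical click histories, keyed by user.
    histories = {
        "alice": {"story-1", "story-2"},
        "bob": {"story-2", "story-1"},
        "carol": {"story-9"},
    }
    for signature, users in cluster_users(histories).items():
        print(users)
```

More hashes per signature make the clusters tighter (users must agree on every minimum), while fewer hashes make them looser. Recommendations would then come from the aggregate clicks of whoever shares your cluster.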
The Google paper he’s talking about explains why delivering recommendations on recent content is still relatively hard: you have to continually update your model based on recent clicks and still deliver results within a tenth of a second.
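The covisitation idea, combined with the need to keep updating on every click, can be sketched roughly as below. This is a simplified illustration under my own assumptions: the class and method names are hypothetical, everything lives in memory, and it leaves out any time decay or windowing that a production system would likely need. The point is only that each new click can update the counts immediately, so lookups never wait on a batch rebuild.

```python
from collections import defaultdict


class CovisitationCounts:
    """Toy in-memory covisitation table, updated one click at a time."""

    def __init__(self):
        # counts[a][b] = number of users who visited both a and b
        self.counts = defaultdict(lambda: defaultdict(int))
        # history[user] = set of articles that user has already visited
        self.history = defaultdict(set)

    def record_click(self, user, article):
        """Update pair counts incrementally for a single new click."""
        seen = self.history[user]
        if article in seen:
            return  # repeat visits do not inflate the counts
        for earlier in seen:
            self.counts[earlier][article] += 1
            self.counts[article][earlier] += 1
        seen.add(article)

    def also_visited(self, article, k=3):
        """Return up to k articles most often covisited with `article`."""
        neighbours = self.counts.get(article, {})
        # Sort by descending count, breaking ties by article name.
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    cv = CovisitationCounts()
    # Hypothetical click stream of (user, article) pairs.
    clicks = [
        ("alice", "a"), ("alice", "b"),
        ("bob", "a"), ("bob", "b"),
        ("carol", "a"), ("carol", "c"),
    ]
    for user, article in clicks:
        cv.record_click(user, article)
    print(cv.also_visited("a"))
```

Each click touches only the pairs involving that user's own history, which is what keeps the update cheap enough to run continuously.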
I’m worried about personalization roping me into an area defined by my old interests. I would definitely want a button: ‘Stop personalizing and give general results’. I’d rather have search become more responsive to my direct input than have it try to construct a model of me based on my aggregate data, determine what is true, or act on my behalf.