Research award for microblog search
Posted: July 23, 2010
When it rains it pours.
After the exciting news that Google funded my application to their digital humanities program, I found out this week that they will also fund another project of mine (full list): Defining and Solving Key Challenges in Microblog Search. The research will focus largely on helping people find and make sense of information that comes across Twitter.
Over the next year the project will support me and two Ph.D. students as we address (and propose some responses to) questions such as:
- What are meaningful units of retrieval for IR over microblog data?
- What types of information needs do people bring to microblogging environments, and how can we support them? Is there a place for ad hoc IR in this space? If not (or even if so), what might constitute a ‘query’ in microblog IR?
- What criteria should we pursue to help people find useful information in microblog collections? Surely time plays a role here. Topical relevance is a likely suspect, as are various types of reputation factors such as TunkRank (and here).
- How does microblog IR relate to more established IR problems such as blog search, expert finding, and other entity search issues?
For me, one of the most interesting issues at work in microblog IR is: how can we aggregate (and then retrieve) data in order to create information that is useful once collected but that might be uninteresting on its own?
Is it useful to retrieve an individual tweet that shares keywords with an ad hoc query? Maybe. But it seems more likely that people might seek debates, consensus, emerging sub-topics, or communities of experts with respect to a given topic. These are just a few of the aggregates that leap to mind. I’m sure readers can think of others. And I’m sure readers can think of other tasks that can help move microblog IR forward.
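To make the aggregation idea a bit more concrete, here is a minimal Python sketch of retrieving an aggregate rather than a single tweet. Everything in it is an assumption made for illustration: the toy tweets, the choice of hashtags as the unit of aggregation, and the ranking by number of distinct authors are all hypothetical and are not the project's method.

```python
from collections import defaultdict

# Hypothetical toy data: (author, text) pairs standing in for tweets.
TWEETS = [
    ("alice", "New #irsearch paper on temporal ranking"),
    ("bob", "Is ad hoc retrieval dead? #irsearch #debate"),
    ("carol", "Lunch was great"),
    ("alice", "More thoughts on #debate about ranking"),
]


def aggregate_by_hashtag(tweets):
    """Pool tweets under each hashtag they mention (an assumed aggregation unit)."""
    groups = defaultdict(lambda: {"authors": set(), "texts": []})
    for author, text in tweets:
        for token in text.split():
            if token.startswith("#") and len(token) > 1:
                tag = token.lower().rstrip(".,!?")
                groups[tag]["authors"].add(author)
                groups[tag]["texts"].append(text)
    return groups


def search_aggregates(groups, query):
    """Return (hashtag, distinct_author_count) pairs whose pooled text
    contains the query term, ranked by author count, then by tag name."""
    q = query.lower()
    hits = [
        (tag, len(group["authors"]))
        for tag, group in groups.items()
        if any(q in text.lower() for text in group["texts"])
    ]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    groups = aggregate_by_hashtag(TWEETS)
    print(search_aggregates(groups, "ranking"))
```

Note that the second tweet never mentions "ranking", yet it is returned as part of both matching aggregates. That is the point of the sketch: a tweet that is uninteresting on its own can become useful once it is collected with others.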
In case anyone wonders how this project relates to the other work of mine that Google funded (which treats retrieval over historically diverse texts in Google Books data), the short answer is that both projects concern IR in situations where change over time is a critical factor, a topic similar to what I addressed in a recent JASIST paper.