Having returned from CHI and the CHI2010 microblog research workshop, I’m jazzed: new problems to tackle, studies to run. In other words, the conference did just what it should; it gave me ideas for new research projects.
One of these projects is time-sensitive. (I can’t go into detail because doing so would bias the results. More on that later.) As they put it on the Twitter search page, it’s what’s happening right now. More seriously, the study needs to run within a few days of CHI’s end, and it will involve asking real people a few questions. For a researcher at a university, this means I must get human subjects approval from my local institutional review board (IRB).
It’s easy to kvetch about IRBs; see the Chronicle of Higher Ed’s piece, Do IRBs Go Overboard? In fact, I’ve found the IRB officers at my institution to be extremely helpful, so I’m not going to kvetch. (Thinking of strategic ways to frame IRB applications recently led me to the very interesting IRB Review Blog, which offers nuanced, substantive reflections on the subject.)
As anyone who has sat through a university’s research ethics training knows, IRBs were created in the wake of several odious and damaging studies. This motivation is clear and impeccable.
But for those of us working in research related to information use, especially in domains such as IR, HCI, and social informatics broadly construed, the risk of harm to or exploitation of subjects is often minimal (though not always; privacy issues can be problematic).
But more interestingly, I think our work challenges the basic model that underpins contemporary research practice in the university.
My point in writing this post is not to argue that we should occupy a rarefied, unsupervised domain. But recently I’ve dealt with several situations suggesting that research on information behavior (mostly HCIR work) pushes to the fore some matters that I think will soon be more general. The following is a brief list. I invite elaborations or arguments.
- crowd-sourced studies. Services like Amazon’s Mechanical Turk offer a huge opening for IR research, as an upcoming SIGIR workshop makes clear. What is the status of turkers with respect to human subjects approval? In a future post I’ll describe in detail my own experience shepherding an MTurk-based study through university approval channels.
- search log analysis. This isn’t a new problem with respect to IRB, and it definitely raises issues of privacy. But I wonder where more broadly informed studies of user behavior fit into this picture. As an example, I was recently given permission to use a set of query logs without human subjects approval. These logs already existed; I got them from a third party. However, in a new study I want to collect logs from my own system. Initial interaction with the IRB led to the decision that this work must go through the application process. Likewise, clickthrough data raised red flags.
- real-time user studies. As I mentioned above, I’m in a situation where I need to collect information (essentially survey data) from Twitter users now. Until very recently the subject of this “survey” didn’t exist, and it won’t exist in any meaningful sense for long. I anticipate that this issue will be common for me, and perhaps for others.
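On the privacy concerns in log analysis, one common mitigation worth mentioning is pseudonymizing user identifiers before anyone analyzes the logs. A minimal sketch, assuming hypothetical field names and a made-up salt (this is an illustration, not my actual study protocol):

```python
import hashlib

# Hypothetical salt; in practice it should be secret and, ideally,
# destroyed after hashing so the mapping cannot be reversed by re-hashing.
SALT = "replace-with-a-secret-value"

def pseudonymize(user_id: str) -> str:
    """Replace a raw user identifier with a salted one-way hash."""
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()[:16]

# Toy log records: (user_id, query) pairs; the real log schema would differ.
raw_log = [
    ("alice", "flu symptoms"),
    ("bob", "chi 2010 program"),
    ("alice", "irb forms"),
]

# Queries stay analyzable per-user, but raw identities are gone.
anon_log = [(pseudonymize(uid), query) for uid, query in raw_log]
```

The same pseudonym recurs for the same user, so session- and user-level analysis still works; whether this alone satisfies an IRB is, of course, exactly the kind of judgment call the post is about.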
Again, my point in writing this is not to say that I should have carte blanche to do research outside of normal channels. What I am saying is twofold:
- Research on information interactions is pushing the limits of the current human subjects/IRB model used by most universities. This is evidenced by unpredictable judgments on the status of projects.
- I think the community of researchers in “our” areas would do well to consider strategies for approaching IRB and other institutional hurdles. We don’t want to game the system. But I think the way we describe the work we do has an impact on the status of that work. If current models are going to change, it would be great if we could (by our interactions with relevant officers) influence those changes in a positive way.
Yesterday’s microblogging workshop at CHI2010 was great, as those of you following #CHImb on Twitter already know. All of the participants brought interesting ideas, too many to list here. So I’m just going to focus on a few themes and results that relate most closely to IR. I highly recommend browsing the list of accepted papers to see for yourself the many, many interesting contributions.
First, I’ll mention that Gene Golovchinsky did a wonderful job presenting our paper on making sense of twitter search. Gene has posted his slides and some discussion of the workshop. The questions we posed in the paper and the presentation were:
- What information needs do people actually bring to microblog search?
- What should a test collection for conducting research on microblog search look like?
Instead of dwelling on our own contribution, though, I want to offer a recap of some of the work of other people…
I was especially interested in work by several researchers from Xerox PARC.
Michael Bernstein showed a system, Eddi, that helps readers who follow many people manage their Twitter experience, avoiding information overload via intelligent filtering on several levels. Ed Chi introduced FeedWinnower, another ambitious system for managing Twitter information. I was especially interested in Bongwon Suh‘s talk. He focused on the role that serendipity plays (or should play) in Twitter search. He suggested that search over microblog data (I know, microblog is not equal to Twitter) benefits from serendipity. Of course only certain types of serendipity are valuable in this context (he said something to the effect of courting previously unknown relevance).
Another really interesting paper (and an interesting conversation over lunch) came from Alice Oh. The paper focused on using people’s list memberships to induce models of their interests and expertise. I think Alice’s paper speaks to the challenge of finding sources of evidence for information management in microblog environments.
With respect to IR and microblogging, I came away from the workshop with new questions and with a keener edge on questions I already had. Here’s a very abbreviated list of some challenges that researchers in this area face.
information needs: What types of information needs are most germane in this space? Are users interested in known-item search, ad hoc retrieval, recommendations, browsing, something completely new?
unit of retrieval: Of course this goes back to the matter of information needs (as do all of the following points). Certainly the task at hand will sway exactly what systems should show users. But my sense is that some sort of entity search is almost always likely to be of more value than treating an individual tweet as a ‘document.’ That is, search over people, conversations, communities, hashtags, etc. will, I think, lend more value than tweets taken out of context.
data acquisition and evaluation: It’s easy to get lots of Twitter data; just latch onto the garden hose and go. In some cases, data from the hose may be perfectly useful for research and development. Do we need or want formal test collections of this type of data? If so, what should they look like? How does obsolescence figure into creating a test collection of de facto ephemeral data? And of course, there’s probably more to ground truth than the Mechanical Turk.
objective functions: In the arena of microblog search, what criteria should we use to rank (if we ARE ranking) entities? Certainly Twitter’s own search engine sees temporality as paramount. As always, relevance is dicey here: a murky mixture of topicality, usefulness, trustworthiness, timeliness, etc.
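To make the objective-function question concrete, one naive way to blend two of those criteria, topicality and timeliness, is a weighted sum with exponential recency decay. This is purely a sketch of the design space (the function, weights, and half-life are all hypothetical), not how Twitter or anyone else actually ranks:

```python
import math

def score(topical: float, age_seconds: float,
          half_life_seconds: float = 3600.0, weight: float = 0.5) -> float:
    """Blend a topical match score in [0, 1] with exponential time decay.

    weight controls the topicality/recency trade-off; at age == half_life,
    the recency component has fallen to 0.5.
    """
    recency = math.exp(-math.log(2) * age_seconds / half_life_seconds)
    return weight * topical + (1 - weight) * recency

# Toy ranking: (label, topical score, age in seconds).
results = [
    ("older but on-topic", 0.9, 7200),
    ("fresh but looser match", 0.7, 60),
]
ranked = sorted(results, key=lambda r: score(r[1], r[2]), reverse=True)
```

Even this toy version surfaces the murkiness the post describes: with a one-hour half-life the fresh tweet wins, while a longer half-life would flip the ordering, and trustworthiness and usefulness aren’t modeled at all.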