Title: The Search for Truth in Objective and Subjective Crowdsourcing
Speaker: Matthew Lease (School of Information, University of Texas at Austin)
Date: Tuesday, March 3rd
Room: NSH 1507
One of the most studied problems to date in human computation is how to effectively aggregate responses from different people to infer a "correct", consensus answer. In the first part of my talk, I will discuss why we know so little about progress on this problem despite the enormous research investment that has been made in it. To better gauge the field's progress, my lab has developed an open source benchmark called SQUARE (ir.ischool.utexas.edu/square) for evaluating the relative performance of alternative statistical techniques across diverse use cases. Our findings suggest a surprising lack of generality in existing techniques and of progress toward a general solution. I argue that if we truly wish to “solve” this problem, current evaluation practices must change.
A more fundamental limitation of consensus-based approaches is the underlying assumption of task objectivity: that a correct answer exists to be found. With subjective tasks, response diversity is valid and desirable, suggesting we cannot merely test for agreement with ground truth or peer responses. As a specific, motivating example, I will discuss the thorny phenomenon of search relevance: what makes search results more or less relevant to different users? Borrowing from psychology, I will describe how psychometrics methodology provides a useful avenue toward both ensuring data quality with subjective tasks in general and gaining new traction on the specific, long-entrenched problem of understanding the latent factors underlying search relevance.
Matthew Lease is an Assistant Professor in the School of Information at the University of Texas at Austin, with promotion to Associate Professor effective in Fall 2015. Since receiving his Ph.D. in Computer Science from Brown University in 2010, his research in information retrieval, human computation, and crowdsourcing has been recognized by early career awards from NSF, IMLS, and DARPA. Lease has presented crowdsourcing tutorials at ACM SIGIR, ACM WSDM, CrowdConf, and SIAM Data Mining. From 2011 to 2013, he co-organized the Crowdsourcing Track for the U.S. National Institute of Standards & Technology (NIST) Text REtrieval Conference (TREC). In 2012, Lease spent a summer sabbatical at CrowdFlower tackling crowdsourcing challenge problems at industry scale. His research has also been featured in WIRED magazine's "Danger Room".