Exploring Crowdsourcing to Personalize Web Experiences

Author: Deepti Aggarwal
Date: 2013-12-03
Report no: IIIT/TH/2013/58
Advisor: Vasudeva Varma, Venkatesh Choppella

Abstract

With rapid advances in communication and computational technology, the World Wide Web has witnessed rapid growth in user-generated content, with more and more users actively creating, publishing, and sharing content over the web. As a result, the web is now overloaded with information on varied topics contributed by a diverse set of contributors. This phenomenon has given rise to “big data”, wherein lies a key problem: intelligently extracting the most relevant and accurate information specific to a user. It is therefore essential to provide more personalized web experiences, where every user query is answered according to her preferences. However, doing so requires extracting and understanding the context and semantics of the content, which are currently not readily available. Moreover, automated systems show limited capabilities in performing this task. This thesis is an attempt to utilize collective human intelligence to support extraction and understanding of content over the web, which will in turn help to create personalized web experiences. In particular, we propose crowdsourcing-based systems for the following tasks: 1) extracting user preferences, 2) extracting named entities, and 3) renarration of web documents.

First, we propose a friendsourcing-based approach called Crowd Consensus, in which we extract a user's preferences from opinions collected from her friends, and we tested it with an online game called Power of Friends. The current method of eliciting such information is to pose direct questions to friends and expect a truthful response in return. Power of Friends, on the other hand, involves a novel way of identifying the unanimous opinion of all the friends about a question related to an individual. Next, we describe a system called uPick, which extracts named entities and their relations from a given text and crowdsources these extracted named entities for validation. Existing systems for identifying relations among named entities within a text document lack human precision and struggle to handle erroneous documents. uPick helps to improve the accuracy of the generated relations by gathering judgments from interested users and validating the relations based on the majority response. Finally, we worked on a renarration framework for the web called Alipi, which makes multilingual documents accessible to users. This framework supports alternative descriptions of a web page, or parts of it, rewritten by volunteers for a given target audience. We developed a browser plugin that enables users to renarrate any page and renders the requested page dynamically based on user preferences.

Our prototypes, along with the accompanying studies, show that leveraging human energy and skills has the potential to solve problems that machines cannot solve alone. We hope our work will inspire system designers to consider crowdsourcing-based systems for creating personalized web experiences, and to think beyond system efficiency and accuracy by focusing on the task experience and the effort invested by users.
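As an illustration of the majority-based validation mentioned for uPick, the following is a minimal sketch only, not the thesis implementation. The function name `validate_relation`, the labels "valid" and "invalid", the strict-majority rule, and the `min_votes` threshold are all assumptions introduced here for the example; the abstract does not specify them.

```python
from collections import Counter


def validate_relation(judgments, min_votes=3):
    """Return the majority label for one extracted relation.

    Hypothetical helper: `judgments` is a list of labels collected from
    users (e.g. "valid" / "invalid"). Returns None when there are fewer
    than `min_votes` judgments or when no label has a strict majority.
    """
    if len(judgments) < min_votes:
        return None  # too few responses to decide (assumed threshold)
    label, count = Counter(judgments).most_common(1)[0]
    # Require more than half of all responses to agree.
    if count * 2 > len(judgments):
        return label
    return None  # tie or no strict majority


if __name__ == "__main__":
    print(validate_relation(["valid", "valid", "invalid"]))
```

Returning `None` on ties or sparse responses leaves the relation undecided rather than forcing a label, which is one possible design choice among several.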

Full thesis: pdf

Centre for Software Engineering Research Lab

IIIT Hyderabad Publications
