Towards Understanding the Credibility of Textual Claims

Author: Rajat Singh
Date: 2018-09-27
Report no: IIIT/TH/2018/93
Advisor: Manish Shrivastava

Abstract

Text articles that propagate false news have lately become a nuisance for Internet users. These articles circulate quickly on social media platforms and make it difficult for users to separate fact from fiction. Social media platforms and instant messaging applications have become the main sources of such articles: because a large audience is active on these platforms, media content spreads easily among users. The articles range from international news to hyperlocal journalism and are often propagated for the financial or political gain of some organization or community. Internet users readily believe news shared on such platforms. Very few try to verify a news article before sharing it further; many accept it uncritically and share it across social media and instant messaging platforms, because false news, with its misleading facts, often tends to be more interesting. Lately, the scientific community has shown keen interest in finding better computational methods to separate such misleading content from the large volume of news articles. Past work on computational credibility analysis has concentrated on fact checking and linguistic features. The primary focus of the credibility analysis task is to learn features that differentiate false claims from true claims by finding conflicting viewpoints. News verification and scoring has been attempted in various ways, ranging from traditional NLP methods to machine learning approaches.

This work describes a novel architecture and methodology, Credibility Outcome (CREDO), which processes a given text article and produces a trustworthiness score for a textual claim in an open-domain setting. CREDO consists of several modules that capture different characteristics responsible for the credibility of a textual claim. These characteristics include the reliability of the article's source and writer, the semantic similarity between the article and related reliable articles retrieved from a knowledge base, and the opinion expressed in the textual claim. Each module produces a score, and the scores are fed into a neural network architecture that learns from them the features pertaining to the overall credibility of the textual claim. In case the system receives a noisy text article as input, the text is normalized during preprocessing with a novel approach that uses distributed semantics and ranking to find the normalized form of each word. Experiments conducted on a popular dataset show that CREDO outperforms state-of-the-art approaches based on linguistic features.
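The idea of feeding per-module scores into a neural network can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the class name `ScoreFusionNet`, the four example scores and their names, the single hidden layer, and the random, untrained weights. It shows only the forward pass that maps module scores to one credibility value in (0, 1); a real system would learn the weights from labelled claims.

```python
import math
import random


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class ScoreFusionNet:
    """Hypothetical one-hidden-layer network that fuses module scores.

    Weights are random and untrained; this is an illustration of the
    data flow, not the architecture used in the thesis.
    """

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        self.n_inputs = n_inputs
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, scores):
        """Return a credibility value in (0, 1) for a list of module scores."""
        if len(scores) != self.n_inputs:
            raise ValueError("expected %d scores, got %d"
                             % (self.n_inputs, len(scores)))
        # Hidden layer: weighted sum of the module scores, then tanh.
        hidden = [math.tanh(sum(w * s for w, s in zip(row, scores)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Output layer: weighted sum of hidden units, then sigmoid.
        return sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)


# Assumed example scores, one per module named in the abstract.
module_scores = {
    "source_reliability": 0.8,
    "writer_reliability": 0.6,
    "semantic_similarity": 0.7,
    "opinion": 0.4,
}

net = ScoreFusionNet(n_inputs=len(module_scores), n_hidden=3)
credibility = net.forward(list(module_scores.values()))
print("credibility score: %.3f" % credibility)
```

Keeping each characteristic as a separate input lets the network weigh, for example, source reliability against semantic similarity, instead of averaging the scores with fixed weights.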

Full thesis: pdf

Centre: Language Technologies Research Centre

IIIT Hyderabad Publications
