MODELLING LEXICAL PSYCHOLOGY OF AN INDIVIDUAL

Author: Shivani Poddar
Date: 2016-08-27
Report no: IIIT/TH/2016/52
Advisor: Navjyoti Singh

Abstract

In today’s era, social media is our window into creating contextual systems of advanced technology. These systems span a breadth of industries and domains. Whether it is contextual phone assistants like Siri and Cortana or search-based advertisements, all of them rely deeply on user data to teach themselves human-like responsiveness. A major chunk of the data at our disposal is lexical data (primarily from social media platforms). Thus, psychology inferred via lexical markers online is an indispensable part of the future of our technology. Lexical psychology and personality are paradigms that have been given immense attention for over a decade. Most state-of-the-art techniques use pre-existing features from social media functions (for instance, number of followers, likes on Facebook, etc.) to predict various personality models (such as the Big Five, MBTI, etc.) that have been around for over 70 years. Parts of our work attempt to tackle this approach of using pre-existing features (available for direct reference from social media). In doing so, we traced the conception of the Big Five model of personality to extensive lexical analysis (both empirical and theoretical). Based on this knowledge, we illustrate the importance of additional features (informed by the fundamentals of psychology) in training robust personality prediction models. Training one such model ourselves, we achieved results that clearly indicate the effectiveness of lexical features over stand-alone technological features in predicting user personality. We then move on to tackle the drawbacks of the continued usage of the Big Five model to understand user personality over the last 70 years. At the time this model was proposed by Goldberg et al., the only mode of obtaining relevant user data was by means of psychometric tests.
While the databases available to us have advanced significantly, the models used to capture lexical personality from these data reserves have remained unchanged. They make available to us an overall persona of an individual. In contrast, abundant literature in the domain of psychology suggests that the persona of an individual is ever evolving. It is dynamic and changes temporally with the individual’s cognition, actions and experiences. Nevertheless, our technology today is still dependent on such overall personality models (static personality traits from the Big Five and others like it). By means of this study, we have attempted to discover, formulate and compute a dynamic model for representing the persona of an individual to overcome these limitations. In each of the domains described above, an iterative model will increase the contextual information manifold compared with an approximate static summarization of an individual’s persona. Thus, inspired by the Abhidhamma scholarship of Buddhism, we propose PACMAN - the psycho computational model of an individual. The model theoretically represents a stochastic finite state machine that encapsulates the moment-by-moment mental states of an individual. It is empowered by its genesis, which lies in the foundations of Brentano’s western psychological theories, phenomenological thinking and strong parallels with the characteristic biomedical processes that define the psychology of an individual. Another feature that helps this model carve its own niche is its close resemblance to a formal system. Much like a formal model is defined by a set of states and the transitions between them, this model is defined by its moments and the transitional rules between these moments. It establishes a definite schema that can be populated from any given data reserve and helps in visualising the persona depicted.
Thus, we first categorize the temporal (time-to-time) social media data into pre-defined mental states (conceived via Abhidhamma). These mental states are then populated into the respective temporal moments of our automaton. Finally, they are used to draw various inferences about the lexical psychology and/or personalities of the respective social media users. We bring closure to our work by illustrating the functionality of our model in three different use cases. We first use our model to formalise the abstract idea of an individual in the concept of social machines. We also exemplify how our model tackles sparsity and non-uniformity of data in social media. The second use case is to model the psychological phenomenon of anxiety, its conception, continuation, etc., by means of lexical data available online. We finally introduce the lexico-psychological mental engagement factor to capture the maximum engagement of each individual with any given mental state. As opposed to the psychometric methods used to capture such a factor, our model uses lexical features to compute it. We discuss the importance of each of these in contributing towards a deeper understanding of the lexical persona of an individual (primarily a social media user).
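
As an illustration only, the idea of populating a stochastic finite state machine from time-ordered mental states can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: the state labels ("calm", "anxious"), the function name estimate_transitions and the first-order counting estimate are all hypothetical choices made for the example, and the actual Abhidhamma-derived states and transitional rules are defined in the full thesis.

```python
from collections import Counter, defaultdict


def estimate_transitions(states):
    """Estimate first-order transition probabilities from a time-ordered
    sequence of state labels by counting consecutive pairs."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    # Normalise each row of counts into a probability distribution.
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Hypothetical labels for one user's posts, in time order (assumed data).
timeline = ["calm", "anxious", "anxious", "calm", "anxious", "calm", "calm"]
transitions = estimate_transitions(timeline)

for state, row in transitions.items():
    print(state, row)
```

Each row of the resulting table gives the estimated probability of moving from one labelled moment to the next, which is one simple way to read "transitional rules between moments" from data.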

Full thesis: pdf

Centre for Exact Humanities

IIIT Hyderabad Publications
