IIIT Hyderabad Publications
Semantic Textual Similarity For Hindi

Authors: darshan.agarwal, vandan.mujadia, Radhika Mamidi, Dipti Misra Sharma
Conference: 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing-2017)
Location: Budapest, Hungary
Date: 2017-04-17
Report no: IIIT/TR/2017/27

Abstract

Semantic textual similarity is the degree of semantic equivalence between two sentences. Equivalently, it is the ability to substitute one text for the other without changing the meaning. In this paper, we propose rule-based and supervised systems that measure the semantic relatedness between two Hindi sentences on a scale of 0 (least similar) to 5 (most similar). Both systems make use of several syntactico-semantic features, such as language-specific linguistic characteristics, distributional semantics and dependency clusters. With several constraints on these features, our rule-based system achieves around 75.23% accuracy on a Hindi news similarity corpus. In the supervised approach, we use a support vector machine (SVM) with the above-mentioned features and the Euclidean distance between dependency clusters to derive word-level alignments. We then use these alignments to assign a similarity score to each sentence pair. With this approach we are able to achieve considerable accuracy on a small subset of our corpus.

Full paper: pdf

Centre for Language Technologies Research Centre
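
The abstract describes deriving word-level alignments from Euclidean distances and then turning those alignments into a sentence score between 0 and 5. The following is only a minimal illustrative sketch of that general idea, not the authors' system: the toy two-dimensional vectors, the transliterated example words, the `1 / (1 + distance)` similarity, and the function names `word_similarity`, `align` and `sentence_score` are all assumptions made for illustration, and the SVM and linguistic features mentioned in the abstract are left out entirely.

```python
import math

# Hypothetical toy embeddings for transliterated Hindi words.
# These values are made up for illustration, not taken from the paper.
TOY_VECTORS = {
    "ladka": (0.90, 0.10),
    "baalak": (0.85, 0.15),
    "khelta": (0.10, 0.90),
    "daudta": (0.30, 0.80),
    "hai": (0.50, 0.50),
}


def word_similarity(w1, w2, vectors):
    """Map the Euclidean distance between two word vectors into (0, 1]."""
    if w1 == w2:
        return 1.0
    if w1 not in vectors or w2 not in vectors:
        return 0.0  # assumption: unknown, non-identical words do not align
    return 1.0 / (1.0 + math.dist(vectors[w1], vectors[w2]))


def align(source, target, vectors):
    """Greedily align each source word to its most similar target word."""
    alignments = []
    for s in source:
        best = max(target, key=lambda t: word_similarity(s, t, vectors))
        alignments.append((s, best, word_similarity(s, best, vectors)))
    return alignments


def sentence_score(sent_a, sent_b, vectors):
    """Average the alignment similarities in both directions, scaled to 0-5."""
    a, b = sent_a.split(), sent_b.split()
    if not a or not b:
        return 0.0
    sims = [sim for _, _, sim in align(a, b, vectors)]
    sims += [sim for _, _, sim in align(b, a, vectors)]
    return 5.0 * sum(sims) / len(sims)


if __name__ == "__main__":
    print(sentence_score("ladka khelta hai", "ladka khelta hai", TOY_VECTORS))
    print(sentence_score("ladka khelta hai", "baalak daudta hai", TOY_VECTORS))
```

Averaging in both directions keeps the score symmetric in the two sentences; in a supervised setting such as the one the abstract mentions, the alignment similarities would more plausibly serve as inputs to a learned model than be averaged directly.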