Introducing a Weighted Ontology to Improve the Graph-Based Semantic Similarity Measures

Roberto Saia, Ludovico Boratto, and Salvatore Carta
Dipartimento di Matematica e Informatica, Università di Cagliari, Cagliari, Italy
Abstract—Semantic similarity measures are designed to compare terms that belong to the same ontology. Many of them rely on a graph structure such as WordNet, a well-known lexical database that groups words into sets of synonyms called synsets. The literature offers several ways to determine the similarity between words or sentences through WordNet, but almost none of them takes into account the specific characteristics of the dataset in use. In some contexts this can lead to poor results, because such measures consider only the relationships between the vertices of the WordNet semantic graph, without weighting them by synset frequency (i.e., common and rare synsets are valued equally). This is a problem in applications such as recommender systems, where WordNet is exploited to evaluate the semantic similarity between the textual descriptions of the items a user has evaluated positively and those of the items not yet evaluated. In this context, user preferences must be identified as accurately as possible; if synset frequency is ignored, certain items risk not being recommended, because the semantic similarity generated by the most common synsets in the descriptions of other items can prevail. We address this problem by introducing a novel criterion for evaluating the similarity between terms that exploits WordNet and adds to it the weight information of the synsets. The effectiveness of the proposed strategy is verified in the recommender systems context.
Index Terms—semantic graph, semantic analysis, ontology, graph theory, metrics
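
To make the idea of frequency-weighted synsets concrete, the following is a minimal illustrative sketch, not the formulation proposed in the paper, which this abstract does not spell out. It assumes a hypothetical toy taxonomy in place of WordNet, a simple path-based similarity, and an IDF-style weight computed from how many item descriptions contain each synset; all names (`PARENT`, `synset_weights`, `weighted_similarity`) are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); a stand-in for WordNet hypernym links.
PARENT = {
    "dog": "canine", "wolf": "canine", "canine": "animal",
    "cat": "feline", "feline": "animal", "animal": None,
}

def ancestors(node):
    """Return the chain from node up to the root, node included."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain

def path_similarity(a, b):
    """Unweighted measure: 1 / (1 + edges on the shortest path through a common ancestor)."""
    depth_in_b = {n: i for i, n in enumerate(ancestors(b))}
    for i, n in enumerate(ancestors(a)):
        if n in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[n])
    return 0.0

def synset_weights(descriptions):
    """Assumed IDF-style weight in [0, 1]: rare synsets approach 1, ubiquitous ones 0."""
    n = len(descriptions)
    df = Counter(s for d in descriptions for s in set(d))
    if n < 2:
        return {s: 1.0 for s in df}
    return {s: math.log(n / c) / math.log(n) for s, c in df.items()}

def weighted_similarity(a, b, weights):
    """Scale the graph similarity by the mean weight of the two synsets (an assumption)."""
    return path_similarity(a, b) * (weights.get(a, 0.0) + weights.get(b, 0.0)) / 2

# Each item description is represented by the synsets it contains (toy data).
descriptions = [["dog", "cat"], ["dog", "wolf"], ["dog", "cat"], ["dog"]]
weights = synset_weights(descriptions)
print(path_similarity("dog", "wolf"), weighted_similarity("dog", "wolf", weights))
print(path_similarity("cat", "wolf"), weighted_similarity("cat", "wolf", weights))
```

In this toy data "dog" occurs in every description, so its weight is zero and it contributes nothing to the weighted score, whereas the rarer "cat" and "wolf" retain most of their graph similarity. Averaging the two weights is only one possible design choice; a product or a minimum would penalize common synsets more strongly.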

Cite: Roberto Saia, Ludovico Boratto, and Salvatore Carta, "Introducing a Weighted Ontology to Improve the Graph-Based Semantic Similarity Measures," International Journal of Signal Processing Systems, Vol. 4, No. 5, pp. 375-381, October 2016. doi: 10.18178/ijsps.4.5.375-381
