by Sentimenti Team | May 3, 2021 | Scientific publications
Place of publication:
Information Processing & Management, 2021
Title:
Mapping WordNet onto human brain connectome in emotion processing and semantic similarity recognition
Authors:
Jan Kocoń, Marek Maziarz
Abstract:
In this article we extend a WordNet structure with relations linking synsets to Desikan’s brain regions. Based on lexicographer files and WordNet Domains the mapping goes from synset semantic categories to behavioural and cognitive functions and then directly to brain lobes. A human brain connectome (HBC) adjacency matrix was utilised to capture transition probabilities between brain regions. We evaluated the new structure in several tasks related to semantic similarity and emotion processing using brain-expanded Princeton WordNet (207k LUs) and Polish WordNet (285k LUs, 30k annotated with valence, arousal and 8 basic emotions). A novel HBC vector representation turned out to be significantly better than proposed baselines.
Link: ScienceDirect, https://www.sciencedirect.com/science/article/pii/S0306457321000388
Citation BibTeX:
@article{kocon2021mapping,
title={Mapping WordNet onto human brain connectome in emotion processing and semantic similarity recognition},
author={Koco{\'n}, Jan and Maziarz, Marek},
journal={Information Processing \& Management},
volume={58},
number={3},
pages={102530},
year={2021},
publisher={Elsevier}
}
by Agnieszka Czoska | Apr 9, 2020 | Sentimenti research
Dr Jan Kocoń is a natural language engineer and the person behind the machine learning process within SentiTool, our solution for analyzing emotions in text. Dr Kocoń coordinates the work of the linguistics team, integrates the individual elements of the tool, and works closely with the IT team.
If you had to describe Sentimenti and its tools to somebody, what would you say first?
Sentimenti is a project for analyzing the emotions hidden in text. Unlike competing solutions that recognize only the overall tone of a text (positive, neutral or negative), our tools aim to understand the text: they assign specific meanings to its words and name the specific emotions people feel about them. These emotions, in turn, form the knowledge base for a machine learning mechanism that automatically recognizes emotions at the level of sentences and whole texts.
What does it mean that we analyse emotions in the text?
In the research carried out in our project we adapted Plutchik's model. It includes eight basic emotions: joy, sadness, trust, disgust, anticipation, fear, surprise and anger. We are able to estimate to what extent each of these emotions is expressed in a text.
How do we know what emotions people feel?
The knowledge base behind our project includes more than 30,000 word meanings, which 20,000 unique respondents rate for overtone and emotions. We talk about “meanings” rather than “words” on purpose, because words are ambiguous: for example, “dark” means something different in “dark blue” and in “dark people”, and only in the latter case does it carry emotions. Each meaning will ultimately receive 50 ratings from different people. This tells us what feelings particular meanings evoke. However, the emotion of a text is not a simple sum of the emotions assigned to the meanings in it...
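As a rough illustration of how many individual ratings per meaning can be turned into a single emotion profile, here is a minimal Python sketch. The 0 to 4 rating scale, the sense identifiers and the `aggregate_ratings` function are assumptions made for the example, not details of the Sentimenti knowledge base.

```python
from collections import defaultdict
from statistics import mean


def aggregate_ratings(ratings):
    """Average per-respondent ratings into one profile per word meaning.

    `ratings` is an iterable of (sense_id, {dimension: score}) pairs,
    one pair per respondent answer.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for sense_id, scores in ratings:
        for dimension, score in scores.items():
            collected[sense_id][dimension].append(score)
    # Reduce every list of scores to its arithmetic mean.
    return {
        sense_id: {dim: mean(values) for dim, values in dims.items()}
        for sense_id, dims in collected.items()
    }


# Hypothetical answers: two senses of "dark", rated on an assumed 0-4 scale.
sample = [
    ("dark#colour", {"fear": 0, "joy": 1}),
    ("dark#colour", {"fear": 0, "joy": 0}),
    ("dark#sinister", {"fear": 3, "joy": 0}),
    ("dark#sinister", {"fear": 4, "joy": 0}),
]
profiles = aggregate_ratings(sample)
print(profiles["dark#sinister"]["fear"])
```

Keeping the profiles keyed by meaning rather than by word preserves the distinction between the two senses of “dark”. A plain average is only one possible way to summarize the ratings.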
What else makes the emotion analysis tools in the text work?
Two things help us here. The first is our huge database of opinions from different areas (travel, medicine, products, services and more), each with an associated overtone. We have over 10 million such texts in our database, which is an excellent source of information about the general feeling of the author. However, in order to find out what emotions a given text evokes in the reader, we also conduct our own research, analogous to the research on single meanings.
This time the subject of the studies is whole texts. The respondents attribute basic emotions to them in exactly the same way as they do with word meanings.
The second pillar of our Sentimenti tool is a combination of various machine learning methods. Experts in natural language processing provide us with tools for text analysis at the syntactic and semantic level. They also create rules for analyzing meanings in context, covering negation, conjecture, and weakening or strengthening of the overtone. These rules support automatic methods, such as deep neural networks, which draw the final conclusions about the emotions in the analyzed text.
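To make the idea of context rules more concrete, the following Python sketch shows one simple way that negation and weakening or strengthening words could adjust a word-level overtone score. The word lists, the multipliers, the toy lexicon and the `adjust_polarity` function are illustrative assumptions, not the rules used in SentiTool.

```python
# Assumed, deliberately tiny rule tables for illustration only.
NEGATIONS = {"not", "never", "no"}
MODIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5, "somewhat": 0.7}


def adjust_polarity(tokens, lexicon):
    """Return (word, score) pairs for words found in the lexicon,
    after applying the negations and modifiers directly preceding them."""
    results = []
    factor = 1.0
    for token in tokens:
        word = token.lower()
        if word in NEGATIONS:
            factor *= -1.0  # negation flips the sign
        elif word in MODIFIERS:
            factor *= MODIFIERS[word]  # weakening or strengthening
        elif word in lexicon:
            results.append((word, lexicon[word] * factor))
            factor = 1.0
        else:
            factor = 1.0  # an unrelated word ends the modifier's scope
    return results


lexicon = {"good": 1.0, "bad": -1.0}  # hypothetical overtone scores
scored = adjust_polarity("the trip was not very good".split(), lexicon)
print(scored)
```

Here “not very good” ends up with a negative score even though “good” alone is positive, which is the kind of correction such hand-written rules are meant to provide before the text reaches a statistical model.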
What do you think automatic emotion analysis can be useful for?
Ultimately, I see many applications for our tools. The first area that comes to my mind is marketing or, more precisely, display advertising. This area covers advertisements displayed alongside web articles, and the idea is to match them with the emotions that the text of the publication evokes in readers. For example, a sad text could be accompanied by an advertisement for an insurance company, and a merry, joyful text by an advertisement for a trip.
Another area that we could cover is brand monitoring, i.e. analyzing how customers write on the Internet about a given company and its products, and what emotions accompany their comments. Other interesting areas include sorting customers’ email complaints by the emotions they contain, detecting conflicts arising in employee correspondence, detecting upcoming crises in social media, and even the possibility of diagnosing mental illnesses. The potential of Sentimenti tools is really huge!
What else do you plan to do in Sentimenti?
So far, we have a prototype that performs simple text analysis at the level of meanings, together with overtone analysis based on our huge opinion resources. Currently, in the Sentimenti team in Wroclaw, I am leading the construction of a machine learning mechanism. It will make it possible to combine information from the meaning knowledge base with information from the natural language processing pipeline. We are constantly receiving new data about the feelings of people reading particular texts, which forms our training set. The more data we gather, the better the quality of the tool.
by Sentimenti Team | Jul 10, 2019 | Conferences, Scientific publications
Place of publication:
Proceedings of the 10th Global Wordnet Conference
Title:
Propagation of emotions, arousal and polarity in WordNet using Heterogeneous Structured Synset Embeddings
Authors:
Jan Kocoń, Arkadiusz Janz
Abstract:
In this paper we present a novel method for emotive propagation in a wordnet based on a large emotive seed. We introduce a sense-level emotive lexicon annotated with polarity, arousal and emotions. The data were annotated as a part of a large study involving over 20,000 participants. A total of 30,000 lexical units in Polish WordNet were described with metadata, each unit received about 50 annotations concerning polarity, arousal and 8 basic emotions, marked on a multilevel scale. We present a preliminary approach to propagating emotive metadata to unlabeled lexical units based on the distribution of manual annotations using logistic regression and description of mixed synset embeddings based on our Heterogeneous Structured Synset Embeddings.
Link: ACL Anthology
Citation BibTeX:
@inproceedings{kocon-janz-2019-propagation,
title = "Propagation of emotions, arousal and polarity in {W}ord{N}et using Heterogeneous Structured Synset Embeddings",
author = "Koco{\'n}, Jan and
Janz, Arkadiusz",
booktitle = "Proceedings of the 10th Global Wordnet Conference",
month = jul,
year = "2019",
address = "Wroclaw, Poland",
publisher = "Global Wordnet Association",
url = "https://www.aclweb.org/anthology/2019.gwc-1.43",
pages = "336--341",
abstract = "In this paper we present a novel method for emotive propagation in a wordnet based on a large emotive seed. We introduce a sense-level emotive lexicon annotated with polarity, arousal and emotions. The data were annotated as a part of a large study involving over 20,000 participants. A total of 30,000 lexical units in Polish WordNet were described with metadata, each unit received about 50 annotations concerning polarity, arousal and 8 basic emotions, marked on a multilevel scale. We present a preliminary approach to propagating emotive metadata to unlabeled lexical units based on the distribution of manual annotations using logistic regression and description of mixed synset embeddings based on our Heterogeneous Structured Synset Embeddings.",
}