semantic analysis of text

Only the input and forget gates are present in this bidirectional recurrent neural network. We placed the most weight on core and advanced features, since sentiment analysis tools should offer robust capabilities that ensure the accuracy and granularity of the data. We then assessed each tool's cost and ease of use, followed by customization, integrations, and customer support. Sentiment analysis is a subset of AI that employs NLP and machine learning to automatically categorize text and build models that capture the nuances of sentiment expression.

Another potential challenge in translating foreign-language text for sentiment analysis is irony or sarcasm, which can be difficult to identify and interpret, even for native speakers. Irony and sarcasm involve using language to express the opposite of its literal meaning, often for humorous purposes47,48. For instance, a French review may use irony or sarcasm to convey a negative sentiment; however, individuals lacking fluency in French may struggle to comprehend this intended tone. Similarly, a social media post in German may employ irony or sarcasm to express a positive sentiment, but this could be arduous to discern for those unfamiliar with the language and culture. To accurately identify sentiment within a text containing irony or sarcasm, specialized techniques tailored to such linguistic phenomena become indispensable.

Original Imbalanced Data

In terms of semantic subsumption, the Wu-Palmer similarity and Lin similarity results in Table 2 indicate that verbs in CT are less similar to their root hypernyms than those in ES. They therefore appear to have a greater average semantic depth and a higher level of explicitness than verbs in ES. Mann-Whitney U tests show that these differences are statistically significant, implying that verbs in CT exhibit a pronounced characteristic of explicitation in terms of semantic subsumption. Prior to these comparisons, Levene's tests were conducted on each index to check for homogeneity of variance; the results in Table 1 indicate unequal variances between ES and CT for all indices.
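
As an illustration of the rank-based test mentioned above, here is a minimal pure-Python sketch of the Mann-Whitney U statistic (statistic only, no p-value; the function name and data are invented for illustration):

```python
def mann_whitney_u(xs, ys):
    """Mann-Whitney U statistic for two independent samples.

    Pooled observations are ranked (tied values share their average
    rank), then U_x = R_x - n_x(n_x+1)/2, where R_x is the rank sum
    of sample xs; the smaller of U_x and U_y is returned.
    """
    pooled = sorted(xs + ys)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # tied block occupies 1-based ranks i+1 .. j; use their average
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    r_x = sum(rank_of[v] for v in xs)
    u_x = r_x - len(xs) * (len(xs) + 1) / 2
    return min(u_x, len(xs) * len(ys) - u_x)

# invented semantic-depth scores for two groups of verbs
u = mann_whitney_u([2.1, 2.4, 2.9], [3.3, 3.6, 3.8])
```

Complete separation of the two samples yields U = 0; in practice a p-value would come from the normal approximation or exact tables.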

  • To tokenize Urdu text, spaces between words must be removed/inserted because the boundary between words is not visibly apparent.
  • CNN models use convolutional layers and pooling layers to extract features, whereas Bidirectional-LSTM models preserve long-term dependencies between word sequences22.
  • On the other hand, all the syntactic subsumption features (ANPV, ANPS, and ARL) for A1 and A2 in CT are significantly lower in value than those in ES.
  • The rising popularity of social media has led to a surge in trolling and in hostile and insulting comments, which is a significant problem given the positive and negative effects a message can have on a person or group of people.
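
As a sketch of the convolution-and-pooling feature extraction described in the second bullet, here is a minimal 1-D example on plain numbers rather than word-embedding matrices (all names and data are illustrative):

```python
def conv1d(seq, kernel):
    """Slide the kernel over the sequence and take dot products,
    producing a feature map (valid padding, stride 1)."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool(feature_map):
    """Global max pooling: keep only the strongest activation."""
    return max(feature_map)

# toy signal and a kernel that responds to rising neighbours
signal = [0.1, 0.3, 0.9, 0.2, 0.0]
fmap = conv1d(signal, [-1.0, 1.0])   # neighbour differences
feature = max_pool(fmap)             # strongest rise in the sequence
```

A real CNN text classifier does the same thing with learned multi-channel kernels over embedding rows, while a Bidirectional-LSTM instead carries state across the whole sequence in both directions.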

Noteworthy studies include Shen et al.13 for IMDB movie reviews, Zhou et al.14 for Chinese product reviews, Alharbi15 for Arabic datasets, and Ref.16 for Afaan Oromo datasets. Meena et al.17 propose an effective sentiment analysis model using deep learning, particularly a CNN strategy, to evaluate customer sentiment in online product reviews. The findings suggest the potential for using online reviews to inform future product selections.

Text classification techniques

Words with different meanings but the same spelling share a single representation, while synonyms with different spellings have completely different representations28,29. Representing documents by raw term frequency also ignores the fact that common words occur more often than others, so their dimensions receive much higher values than rare but discriminating words. Term weighting techniques are applied to assign appropriate weights to the relevant terms to handle such problems.
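
A common weighting scheme for the problem just described is TF-IDF, which down-weights terms that occur in many documents. A minimal sketch, assuming pre-tokenized documents and the standard tf × log(N/df) formula (the function name and toy corpus are invented):

```python
import math
from collections import Counter

def tfidf(docs):
    """Weight raw term counts by inverse document frequency so that
    words common to many documents get small weights while rare,
    discriminating words get large ones."""
    n = len(docs)
    df = Counter()                       # document frequency per term
    for doc in docs:
        df.update(set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)                # raw term frequency
        weighted.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weighted

docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
w = tfidf(docs)
# "movie" occurs in 2 of 3 documents, so it is weighted lower than
# "plot", which occurs in only 1 of 3
```

Library implementations (e.g. smoothed or sublinear variants) adjust this formula, but the rare-term-boosting idea is the same.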


Favored by experienced NLP developers and beginners alike, this toolkit provides a simple introduction to programming applications designed for language processing purposes. This functionality has put NLP at the forefront of deep learning environments, allowing important information to be extracted with minimal user input. This allows technology such as chatbots to be greatly improved, while also helping to develop a range of other tools, from image content queries to voice recognition.

The newspaper defined stability as the CPC’s instrument of dominance in a direct manner. In addition, the report criticized the CPC’s imprisonment of dissidents and highlighted international criticism of this practice in the non-restrictive attributive clause introduced by “which”. In Extract (3), the newspaper describes how strengthening China-US relations has contributed to peace and stability in Asia. The use of the superlative adjective “longest” to emphasize the positive effects of the Shanghai communique between China and the US is noteworthy.

Created by Facebook’s AI research team, the library enables you to carry out many different applications, including sentiment analysis, where it can detect if a sentence is positive or negative. An open-source NLP library, spaCy is another top option for sentiment analysis. The library enables developers to create applications that can process and understand massive volumes of text, and it is used to construct natural language understanding systems and information extraction systems. Many sentiment analysis tools use a combined hybrid approach of these two techniques to mix tools and create a more nuanced sentiment analysis portrait of the given subject.


GloVe is a Stanford-developed unsupervised learning method for producing word embeddings from a corpus's global word co-occurrence matrix. The essential objective behind GloVe embeddings is to use these co-occurrence statistics to derive the relationships between words. BERT can take one or two sentences as input and differentiates them using the special token [SEP]. The [CLS] token, which is used for classification tasks, always appears at the beginning of the input17.
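
The [CLS]/[SEP] layout described here can be sketched without any model library; the helper below only assembles the token sequence and segment ids (the function name and inputs are illustrative, not a real tokenizer API):

```python
def format_bert_input(tokens_a, tokens_b=None):
    """Assemble a BERT-style input: [CLS] always comes first (its
    final hidden state is what classification heads use), [SEP]
    separates and terminates the sentence(s), and segment ids mark
    which sentence each position belongs to."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    segment_ids = [0] * len(tokens)
    if tokens_b is not None:
        tokens += tokens_b + ["[SEP]"]
        segment_ids += [1] * (len(tokens_b) + 1)
    return tokens, segment_ids

# single-sentence classification input
tokens, segs = format_bert_input(["the", "plot", "was", "great"])
# sentence-pair input (e.g. entailment or similarity tasks)
pair_tokens, pair_segs = format_bert_input(["a"], ["b"])
```

A real tokenizer additionally maps tokens to vocabulary ids and pads to a fixed length, but the special-token layout is exactly this.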

Latent Semantic Analysis & Sentiment Classification with Python – Towards Data Science. Posted: Tue, 11 Sep 2018.

As shown above, the quantum modeling approach has a unique advantage in addressing this challenge. Volumes of textual data, piling up beyond the capacity of human cognition, motivate the development of automated methods for extracting relevant information from corpora of unstructured text. Since ensuring relevance requires anticipating the user's judgment, effective algorithms are bound, in some form, to simulate human linguistic practice. This is an unsolved challenge whose complexity was recognized long before the computer age1,2,3,4. Now, with the reading and writing of texts turned into a massive and influential part of creative human behavior, the problem has been brought to the forefront of information technology.

Innovative approaches to sentiment analysis leveraging attention mechanisms

Its deep learning capabilities are also robust, making it a powerful option for businesses needing to analyze sentiment in niche datasets or integrate this data into a larger AI solution. Substantial evidence for syntactic-semantic explicitation, simplification, and levelling out is found in CT, validating that translation universals appear not only at the lexical and grammatical levels but also at the syntactic-semantic level. On the other hand, explicitations are also found consistently as both S-universal and T-universal for certain specific semantic roles (A0 and DIS), which reflects the influence of socio-cultural factors in addition to the impact of language systems. These findings further prove that translation is a complex system formed by the interplay of multiple factors (Han & Jiang, 2017; Sang, 2023), resulting in the diversity and uniqueness of translated language. With the development of social media and video websites, user comments are increasing rapidly in both quantity and diversity of form. Danmaku is a new type of user-generated comment that scrolls across different positions on the video screen1. Users communicate with video producers and other users by posting danmakus containing emotions such as praise, sarcasm, ridicule, criticism, and compliments2,3.


Consequently, these two roles are found to be shorter and less frequent in both argument structures and sentences in CT, which is in line with the above-assumed “unpacking” process. For semantic subsumption, verbs that serve as the roots of argument structures are evaluated based on their semantic depth, which is assessed through a textual entailment analysis based on WordNet. The identification of semantic similarity or distance between two words mainly relies on WordNet’s subsumption hierarchy (hyponymy and hypernymy) (Budanitsky & Hirst, 2006; Reshmi & Shreelekshmi, 2019). Therefore, each verb is compared with its root hypernym and the semantic distance between them can be interpreted as the explicitness of the verb.
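
The Wu-Palmer measure used for this kind of comparison can be illustrated on a toy hypernym taxonomy: similarity = 2 · depth(LCS) / (depth(a) + depth(b)), with depth counted from the root and LCS the least common subsumer. A minimal sketch (the verb hierarchy and function name are invented; WordNet-based tools compute this over the full hierarchy):

```python
def wup_similarity(taxonomy, a, b):
    """Wu-Palmer similarity on a child -> hypernym mapping.

    Returns 2 * depth(LCS) / (depth(a) + depth(b)), so values near 1
    mean the two words share a deep, specific common ancestor, and
    values near 0 mean they meet only close to the root.
    """
    def path_to_root(node):
        path = [node]
        while node in taxonomy:
            node = taxonomy[node]
            path.append(node)
        return path

    pa, pb = path_to_root(a), path_to_root(b)
    ancestors_a = set(pa)
    lcs = next(n for n in pb if n in ancestors_a)  # least common subsumer
    depth = lambda n: len(path_to_root(n))         # root has depth 1
    return 2 * depth(lcs) / (depth(a) + depth(b))

# hypothetical verb hierarchy: child -> hypernym
tax = {"sprint": "run", "jog": "run", "run": "move", "walk": "move"}
```

Here "sprint" and "jog" meet at "run" (depth 2) and score higher than "sprint" and "walk", which meet only at the root "move"; a verb far from its root hypernym thus reads as more semantically specific, i.e. more explicit.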

The findings of this investigation suggest that the successful transfer of sentiment through machine translation can be accomplished by utilizing Google and Google Neural Network in conjunction with Geofluent. This achievement marks a pivotal milestone in establishing a multilingual sentiment platform within the financial domain. Future endeavours will further integrate language-specific processing rules to enhance machine translation performance, thus advancing the project’s overarching objectives. HyperGlue is a US-based startup that develops an analytics solution to generate insights from unstructured text data.

As a result, using a bidirectional RNN with a CNN classifier is more appropriate and is recommended for the classification of the YouTube comments used in this paper. Search engines use semantic analysis to better understand and analyze user intent as people search for information on the web. Moreover, with the ability to capture the context of user searches, the engine can provide accurate and relevant results. All in all, semantic analysis enables chatbots to focus on user needs and address their queries in less time and at lower cost. Relationship extraction is a procedure used to determine the semantic relationship between words in a text.


The study found that the strongest biases were observed in areas related to immigration, the environment, and specific types of regulation. By mining the comments that customers post about a brand, a sentiment analytics tool can surface social media sentiment for natural language processing, yielding insights. The sentence-level perception and semantic analysis described above can be scaled to paragraphs, chapters, whole texts, and even larger structures, addressing the problem of computational scalability95,148,149. For example, perception of a text as a bag of paragraphs can be accounted for by exactly the same model that works with words and sentences. In that way, the hierarchical semantic structure of information representation, typical of human cognition9,150, can be accessed. Quantitative models of natural language are applied in the information retrieval industry as methods for meaning-based processing of textual data.

  • Each element is designated a grammatical role, and the whole structure is processed to cut down on any confusion caused by ambiguous words having multiple meanings.
  • The findings of this research can be valuable in various domains, such as multilingual marketing campaigns, cross-cultural analysis, and international customer service, where understanding sentiment in foreign languages is of utmost importance.
  • The top two entries are original data, and the one on the bottom is synthetic data.

The data that support the findings of this study are available from the author Barbara Guardabascio upon reasonable request. This refers to the numerical data resulting from the analysis of the news articles and the trained BERT models. However, the authors are not allowed to share the raw news data provided by Telpress International B.V. These data are the property of the company, and the authors have deleted them after the analysis.
