The chess domain is well-suited for creating an artificial intelligence (AI) system that mimics real-world challenges, including decision-making. Over the years, minimal attention has been paid to investigating insights derived from unstructured chess data sources. In this study, we examine the complex relationships between multiple referenced moves in a chess-teaching textbook and propose a novel method designed to encapsulate chess knowledge derived from move-action phrases. We investigate the feasibility of using a modified sentiment analysis method as a means of evaluating chess moves based on text. Our proposed Aspect-Based Sentiment Analysis (ABSA) method represents an advancement in evaluating the sentiment associated with referenced chess moves. By extracting insights from move-action phrases, our approach aims to provide a more fine-grained and contextually aware `chess move'-based sentiment classification. Through empirical experiments and analysis, we evaluate the performance of our fine-tuned ABSA model, presenting results that confirm the effectiveness of our approach in advancing aspect-based sentiment classification within the chess domain. This research contributes to the area of game-playing by machines and shows the practical applicability of leveraging NLP techniques to understand the context of strategic games.
https://arxiv.org/abs/2405.06499
Inspired by the 'Bias Considerations in Bilingual Natural Language Processing' report by Statistics Canada, this study delves into potential biases in multilingual sentiment analysis between English and French. Given a 50-50 dataset of French and English, we aim to determine whether a language bias exists and explore how the incorporation of more diverse datasets in the future might affect the equity of multilingual Natural Language Processing (NLP) systems. By employing Support Vector Machine (SVM) and Naive Bayes models on three balanced datasets, we reveal potential biases in multilingual sentiment classification. Utilizing Fairlearn, a tool for assessing bias in machine learning models, our findings indicate nuanced outcomes: French data outperforms English across accuracy, recall, and F1 score in both models, hinting at a language bias favoring French. However, Fairlearn's metrics suggest that the SVM approaches equitable levels, with demographic parity ratios of 0.963, 0.989, and 0.985 for the three separate datasets, indicating near-equitable treatment across languages. In contrast, Naive Bayes demonstrates greater disparities, evidenced by demographic parity ratios of 0.813, 0.908, and 0.961. These findings underscore the importance of developing equitable multilingual NLP systems, particularly as we anticipate the inclusion of more datasets in various languages in the future.
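For readers unfamiliar with the fairness metric reported above, here is a minimal sketch of how a demographic parity ratio can be computed by hand (Fairlearn provides this metric directly; the toy predictions below are illustrative, not the paper's data):

```python
def demographic_parity_ratio(y_pred, groups):
    """Ratio of the smallest to the largest per-group selection rate."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gg in zip(y_pred, groups) if gg == g]
        rates[g] = sum(preds) / len(preds)  # fraction predicted positive
    return min(rates.values()) / max(rates.values())

# toy predictions for English vs. French samples (1 = positive sentiment)
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["en", "en", "en", "en", "fr", "fr", "fr", "fr"]
print(demographic_parity_ratio(y_pred, groups))  # 0.5 / 0.75 ≈ 0.667
```

A ratio near 1.0 indicates near-equitable treatment across the two language groups, which is how the 0.963-0.989 SVM figures above are read.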
https://arxiv.org/abs/2405.06692
Although pre-trained language models (PLMs) have exhibited great flexibility and versatility with prompt-based few-shot learning, they suffer from extensive parameter sizes and limited applicability for inference. Recent studies have suggested that PLMs be used as dataset generators and that a tiny task-specific model be trained to achieve efficient inference. However, their applicability to various domains is limited because they tend to generate domain-specific datasets. In this work, we propose a novel approach to universal domain generalization that generates a dataset regardless of the target domain. This allows for generalization of the tiny task model to any domain that shares the label space, thus enhancing the real-world applicability of the dataset generation paradigm. Our experiments indicate that the proposed method accomplishes generalizability across various domains while using a parameter set that is orders of magnitude smaller than PLMs.
https://arxiv.org/abs/2405.01022
In this work we investigate the capability of the Graph Attention Network for extracting aspect and opinion terms. Aspect and opinion term extraction is posed as a token-level classification task akin to named entity recognition. We use the dependency tree of the input query as an additional feature in a Graph Attention Network, along with the token and part-of-speech features. We show that the dependency structure is a powerful feature that, in the presence of a CRF layer, substantially improves performance and yields the best results on the commonly used datasets from SemEval 2014, 2015, and 2016. We experiment with additional layers, such as BiLSTM and Transformer, on top of the CRF layer. We also show that our approach works well in the presence of multiple aspects or sentiments in the same query, and that it is not necessary to modify the dependency tree based on a single aspect, as was done in its original application to sentiment classification.
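The dependency tree can be handed to a Graph Attention Network as an adjacency structure over tokens. A minimal sketch of building that adjacency matrix from a head list (the head indices and the self-loop convention below are illustrative assumptions, not the paper's exact preprocessing):

```python
import numpy as np

def dependency_adjacency(heads):
    # heads[i] = index of token i's syntactic head, -1 for the root
    n = len(heads)
    adj = np.eye(n, dtype=int)  # self-loops, as is common for GNN inputs
    for i, h in enumerate(heads):
        if h >= 0:
            adj[i, h] = adj[h, i] = 1  # undirected dependency edge
    return adj

# "service was great": "service" and "was" both attach to "great" (the root)
print(dependency_adjacency([2, 2, -1]))
```

The resulting matrix restricts each token's attention neighbourhood to its syntactic neighbours, which is how the dependency feature enters the GAT.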
https://arxiv.org/abs/2404.19260
This paper presents KazSAnDRA, a dataset developed for Kazakh sentiment analysis that is the first and largest publicly available dataset of its kind. KazSAnDRA comprises an extensive collection of 180,064 reviews obtained from various sources and includes numerical ratings ranging from 1 to 5, providing a quantitative representation of customer attitudes. The study also pursued the automation of Kazakh sentiment classification through the development and evaluation of four machine learning models trained for both polarity classification and score classification. Experimental analysis included evaluation of the results considering both balanced and imbalanced scenarios. The most successful model attained an F1-score of 0.81 for polarity classification and 0.39 for score classification on the test sets. The dataset and fine-tuned models are open access and available for download under the Creative Commons Attribution 4.0 International License (CC BY 4.0) through our GitHub repository.
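The polarity classification task described above collapses the 1-5 ratings into two classes. One plausible mapping (the exact scheme used for KazSAnDRA is not specified here, so treat this as an assumption) looks like:

```python
def to_polarity(score):
    # hypothetical mapping: 1-2 stars -> negative (0), 4-5 -> positive (1);
    # 3-star reviews are treated as ambiguous and dropped
    if score <= 2:
        return 0
    if score >= 4:
        return 1
    return None

ratings = [5, 1, 3, 4, 2]
labels = [to_polarity(r) for r in ratings if to_polarity(r) is not None]
print(labels)  # [1, 0, 1, 0]
```

Score classification, by contrast, keeps all five rating levels as classes, which helps explain the large F1 gap reported above (0.81 binary vs. 0.39 five-way).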
https://arxiv.org/abs/2403.19335
Aspect-Based Sentiment Analysis (ABSA) aims to identify terms or multiword expressions (MWEs) on which sentiments are expressed and the sentiment polarities associated with them. The development of supervised models has been at the forefront of research in this area. However, training these models requires the availability of manually annotated datasets, which is both expensive and time-consuming. Furthermore, the available annotated datasets are tailored to a specific domain, language, and text type. In this work, we address this notable challenge in current state-of-the-art ABSA research. We propose a hybrid approach for Aspect-Based Sentiment Analysis using transfer learning. The approach focuses on generating weakly-supervised annotations by exploiting the strengths of both large language models (LLMs) and traditional syntactic dependencies. We utilise the syntactic dependency structures of sentences to complement the annotations generated by LLMs, as LLMs may overlook domain-specific aspect terms. Extensive experimentation on multiple datasets is performed to demonstrate the efficacy of our hybrid method for the tasks of aspect term extraction and aspect sentiment classification. Keywords: Aspect-Based Sentiment Analysis, Syntactic Parsing, large language model (LLM)
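The hybrid weak-labeling idea can be sketched as a union of the two candidate sources. Both sets below are toy stand-ins: in practice the first would come from prompting an LLM and the second from dependency relations (e.g. nouns in subject/object positions), neither of which is reproduced here:

```python
# aspect terms proposed by an LLM for a product review (illustrative)
llm_aspects = {"battery", "screen"}
# candidates recovered from syntactic dependencies; the parser surfaces a
# domain-specific term ("price") that the LLM overlooked
dependency_aspects = {"battery", "price"}

# the syntactic pass complements, rather than replaces, the LLM annotations
weak_labels = llm_aspects | dependency_aspects
print(sorted(weak_labels))  # ['battery', 'price', 'screen']
```

The combined set then serves as weak supervision for training the aspect term extractor.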
https://arxiv.org/abs/2403.17254
A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community, allowing them to allocate their resources better. Such approaches are transitioning towards leveraging social media conversations to analyze the needs of communities and the assets already present within them. However, manual analysis of exponentially increasing social media conversations is challenging. There is a gap in the present literature in computationally analyzing how community members discuss the strengths and needs of the community. To address this gap, we introduce the task of identifying, extracting, and categorizing community needs and assets from conversational data using sophisticated natural language processing methods. To facilitate this task, we introduce the first dataset about community needs and assets, consisting of 3,511 conversations from Reddit annotated by crowdsourced workers. Using this dataset, we evaluate an utterance-level classification model against a sentiment classification baseline and a popular large language model (in a zero-shot setting), finding that our model outperforms both baselines with an F1 score of 94%, compared to 49% and 61%, respectively. Furthermore, we observe through our study that conversations about needs carry negative sentiments and emotions, while conversations about assets focus on location and entities. The dataset is available at this https URL.
https://arxiv.org/abs/2403.13272
There are multiple sources of financial news online which influence market movements and traders' decisions. This highlights the need for accurate sentiment analysis, in addition to having appropriate algorithmic trading techniques, to arrive at better informed trading decisions. Standard lexicon based sentiment approaches have demonstrated their power in aiding financial decisions. However, they are known to suffer from issues related to context sensitivity and word ordering. Large Language Models (LLMs) can also be used in this context, but they are not finance-specific and tend to require significant computational resources. To facilitate a finance specific LLM framework, we introduce a novel approach based on the Llama 2 7B foundational model, in order to benefit from its generative nature and comprehensive language manipulation. This is achieved by fine-tuning the Llama 2 7B model on a small portion of supervised financial sentiment analysis data, so as to jointly handle the complexities of financial lexicon and context, and further equipping it with a neural network based decision mechanism. Such a generator-classifier scheme, referred to as FinLlama, is trained not only to classify the sentiment valence but also to quantify its strength, thus offering traders a nuanced insight into financial news articles. Complementing this, the implementation of parameter-efficient fine-tuning through LoRA optimises the trainable parameters, thus minimising computational and memory requirements without sacrificing accuracy. Simulation results demonstrate the ability of the proposed FinLlama to provide a framework for enhanced portfolio management decisions and increased market returns. These results underpin the ability of FinLlama to construct high-return portfolios which exhibit enhanced resilience, even during volatile periods and unpredictable market events.
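The LoRA mechanic behind the parameter-efficient fine-tuning can be sketched in a few lines: a frozen weight W is augmented with a low-rank update (alpha/r)·B·A, with B initialized to zero so the adapter starts as a no-op. The sizes below are toy values, not the Llama 2 7B dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16          # hidden size, LoRA rank, scaling factor

W = rng.normal(size=(d, d))     # frozen pretrained weight
A = rng.normal(size=(r, d))     # trainable down-projection
B = np.zeros((d, r))            # trainable up-projection, zero-initialized

def lora_forward(x):
    # base path plus scaled low-rank update; only A and B are trained
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, d))
# with B = 0 the adapter output equals the frozen model's output
print(np.allclose(lora_forward(x), x @ W.T))  # True
# trainable parameters: 2 * d * r = 32, versus d * d = 64 in the full matrix
```

The parameter saving shown in the last comment is what keeps the computational and memory requirements low at the 7B scale.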
https://arxiv.org/abs/2403.12285
This manuscript presents a methodical examination of the utilization of Artificial Intelligence in the assessment of emotions in texts related to healthcare, with a particular focus on the incorporation of Natural Language Processing and deep learning technologies. We scrutinize numerous research studies that employ AI to augment sentiment analysis, categorize emotions, and forecast patient outcomes based on textual information derived from clinical narratives, patient feedback on medications, and online health discussions. The review demonstrates noteworthy progress in the precision of algorithms used for sentiment classification, the prognostic capabilities of AI models for neurodegenerative diseases, and the creation of AI-powered systems that offer support in clinical decision-making. Remarkably, the utilization of AI applications has exhibited an enhancement in personalized therapy plans by integrating patient sentiment and contributing to the early identification of mental health disorders. There persist challenges, which encompass ensuring the ethical application of AI, safeguarding patient confidentiality, and addressing potential biases in algorithmic procedures. Nevertheless, the potential of AI to revolutionize healthcare practices is unmistakable, offering a future where healthcare is not only more knowledgeable and efficient but also more empathetic and centered around the needs of patients. This investigation underscores the transformative influence of AI on healthcare, delivering a comprehensive comprehension of its role in examining emotional content in healthcare texts and highlighting the trajectory towards a more compassionate approach to patient care. The findings advocate for a harmonious synergy between AI's analytical capabilities and the human aspects of healthcare.
https://arxiv.org/abs/2403.09762
As more than 70% of the reviews in the existing opinion summary dataset are positive, current opinion summarization approaches are reluctant to generate negative summaries given inputs of negative texts. To address this sentiment bias, a direct approach that does not over-rely on a specific framework is to generate additional data based on large language models to balance the emotional distribution of the dataset. However, data augmentation based on large language models faces two disadvantages: 1) potential issues or toxicity in the augmented data; and 2) expensive costs. Therefore, in this paper, we propose a novel data augmentation framework based on both large and small language models for debiasing opinion summarization. Specifically, a small set of synthesized negative reviews is obtained by rewriting the positive text via a large language model. Then, a disentangled reconstruction model is trained on the generated data. After training, a large amount of synthetic data can be obtained by decoding the new representations obtained from combinations of different sample representations, and filtering based on confusion degree and sentiment classification. Experiments show that our framework can alleviate emotional bias as effectively as using only large models, but more economically.
https://arxiv.org/abs/2403.07693
This article presents a comprehensive sentiment analysis (SA) of comments on YouTube videos related to Sidewalk Delivery Robots (SDRs). We manually annotated the collected YouTube comments with three sentiment labels: negative (0), positive (1), and neutral (2). We then constructed models for text sentiment classification and tested the models' performance on both binary and ternary classification tasks in terms of accuracy, precision, recall, and F1 score. Our results indicate that, in binary classification tasks, the Support Vector Machine (SVM) model using Term Frequency-Inverse Document Frequency (TF-IDF) and N-grams achieves the highest accuracy. In ternary classification tasks, the model using Bidirectional Encoder Representations from Transformers (BERT), Long Short-Term Memory networks (LSTM), and Gated Recurrent Units (GRU) significantly outperforms other machine learning models, achieving an accuracy, precision, recall, and F1 score of 0.78. Additionally, we employ the Latent Dirichlet Allocation model to generate 10 topics from the comments to explore the public's underlying views on SDRs. Drawing from these findings, we propose targeted recommendations for shaping future policies concerning SDRs. This work provides valuable insights for stakeholders in the SDR sector regarding social perception, interaction, and safety.
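The TF-IDF + N-gram + SVM pipeline for the binary task can be sketched with scikit-learn; the four comments below are invented stand-ins for the annotated YouTube data, not examples from the paper:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# toy stand-ins for annotated YouTube comments (0 = negative, 1 = positive)
texts = ["love these delivery robots", "great idea and very useful",
         "these robots block the sidewalk", "annoying and dangerous machines"]
labels = [1, 1, 0, 0]

# word unigrams + bigrams feeding a linear SVM, mirroring the TF-IDF + N-gram setup
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["great and useful robots"]))
```

Swapping the `ngram_range` or the classifier makes this a convenient baseline harness for the accuracy/precision/recall/F1 comparison described above.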
https://arxiv.org/abs/2405.00688
The lack of a suitable tool for analyzing conversational texts in the Persian language has made various analyses of these texts, including sentiment analysis, difficult. In this research, we try to make these texts easier for machines to understand by providing PSC, the Persian Slang Converter, a tool for converting conversational texts into formal ones, and by using the most up-to-date deep learning methods together with PSC to improve machine sentiment learning on short Persian texts. More than 10 million unlabeled texts from various social networks and movie subtitles (as conversational texts) and about 10 million news texts (as formal texts) have been used to train unsupervised models and to implement the tool. 60,000 texts from comments of Instagram users, with positive, negative, and neutral labels, serve as supervised data for training the short-text sentiment classification model. Using the converter, 57% of the words in the conversational corpus were converted. Finally, using the formalizer, a FastText model, and a deep LSTM network, an accuracy of 81.91% was obtained on the test data.
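The slang-to-formal conversion step can be sketched with a tiny substitution table; the entries below are English stand-ins for the Persian slang entries, purely for illustration, and the same logic yields a corpus conversion rate like the 57% figure above:

```python
# hypothetical slang-to-formal mapping (English stand-ins for Persian slang)
slang_map = {"gonna": "going to", "u": "you", "thx": "thanks"}

def formalize(text):
    tokens = text.split()
    converted = [slang_map.get(t, t) for t in tokens]  # leave unknowns as-is
    changed = sum(1 for a, b in zip(tokens, converted) if a != b)
    return " ".join(converted), changed / len(tokens)  # text, conversion rate

print(formalize("thx u gonna love it"))  # ('thanks you going to love it', 0.6)
```

The real PSC is trained on the unlabeled corpora rather than hand-built, but the interface (conversational text in, formalized text plus coverage out) is the same idea.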
https://arxiv.org/abs/2403.06023
Introduction: Microblogging websites have amassed rich data sources for sentiment analysis and opinion mining. In this regard, sentiment classification has frequently proven inefficient because microblog posts typically lack syntactically consistent and representative terms, since users on these social networks do not like to write lengthy statements. There are also limitations specific to low-resource languages. The Persian language has exceptional characteristics and demands unique annotated data and models for the sentiment analysis task, which are distinct from the text features of English. Method: This paper first constructs a user opinion dataset called ITRC-Opinion, built collaboratively and in-source. Our dataset contains 60,000 informal and colloquial Persian texts from social microblogs such as Twitter and Instagram. Second, this study proposes a new architecture based on the convolutional neural network (CNN) model for more effective sentiment analysis of colloquial text in social microblog posts. The constructed dataset is used to evaluate the presented architecture. Furthermore, models such as LSTM, CNN-RNN, BiLSTM, and BiGRU with different word embeddings, including FastText, GloVe, and Word2vec, were evaluated on our dataset. Results: The results demonstrate the benefit of our dataset and the proposed model (72% accuracy), showing a meaningful improvement in sentiment classification performance.
https://arxiv.org/abs/2306.12679
In this study, ChatGPT is utilized to create streamlined models that generate easily interpretable features. These features are then used to evaluate financial outcomes from earnings calls. We detail a training approach that merges knowledge distillation and transfer learning, resulting in lightweight topic and sentiment classification models without significant loss in accuracy. These models are assessed through a dataset annotated by experts. The paper also delves into two practical case studies, highlighting how the generated features can be effectively utilized in quantitative investing scenarios.
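The knowledge-distillation component can be illustrated with the classic soft-target loss: the lightweight model is trained against temperature-softened teacher probabilities. The logits below are invented, and the paper's exact loss formulation is not reproduced here:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T   # temperature softening
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # cross-entropy of the student against the teacher's soft labels
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(-(p * np.log(q)).sum())

t = [4.0, 1.0, 0.5]  # e.g. topic scores from the large teacher model
# the loss is minimized when the student matches the teacher exactly
print(distillation_loss(t, t) < distillation_loss(t, [0.0, 4.0, 0.5]))  # True
```

Transfer learning then initializes the student from a pretrained encoder before distilling, which is how the accuracy loss stays small despite the much smaller model.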
https://arxiv.org/abs/2403.02185
In this paper we explore the challenges of measuring sentiment in relation to Environmental, Social and Governance (ESG) social media. ESG has grown in importance in recent years, with a surge in interest from the financial sector, and the performance of many businesses has become based in part on their ESG-related reputations. The use of sentiment analysis to measure ESG-related reputation has developed, and with it interest in the use of machines to do so. The era of digital media has created an explosion of new media sources, driven by the growth of social media platforms. This growing data environment has become an excellent source for behavioural insight studies across many disciplines, including politics, healthcare, and market research. Our study seeks to compare human performance with the cutting edge in machine performance in the measurement of ESG-related sentiment. To this end, researchers classified the sentiment of 150 tweets and a reliability measure was computed. A gold standard dataset was then established based on the consensus of 3 researchers, and this dataset was used to measure the performance of different machine approaches: one based on the VADER dictionary approach to sentiment classification, and then multiple language model approaches, including Llama2, T5, Mistral, Mixtral, FINBERT, GPT3.5 and GPT4.
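The dictionary approach can be illustrated with a toy lexicon-based scorer. Real VADER uses a large curated lexicon plus rules for negation, punctuation, and intensifiers, so the scores and thresholds below are only a sketch of the idea:

```python
# toy valence lexicon; VADER's actual lexicon has thousands of rated entries
lexicon = {"good": 1.9, "great": 3.1, "bad": -2.5, "terrible": -2.1}

def compound(text):
    # average the valence of known words; unknown words score 0
    vals = [lexicon.get(w, 0.0) for w in text.lower().split()]
    return sum(vals) / len(vals) if vals else 0.0

def label(text, pos=0.05, neg=-0.05):  # VADER-style decision thresholds
    s = compound(text)
    return "positive" if s >= pos else "negative" if s <= neg else "neutral"

print(label("great ESG progress"))          # positive
print(label("terrible governance record"))  # negative
```

Dictionary scorers like this are fast and interpretable, which is why they remain a baseline against the language-model approaches compared above.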
https://arxiv.org/abs/2402.16650
This paper explores the challenges posed by aspect-based sentiment classification (ABSC) within pretrained language models (PLMs), with a particular focus on contextualization and hallucination issues. In order to tackle these challenges, we introduce CARBD-Ko (a Contextually Annotated Review Benchmark Dataset for Aspect-Based Sentiment Classification in Korean), a benchmark dataset that incorporates aspects and dual-tagged polarities to distinguish between aspect-specific and aspect-agnostic sentiment classification. The dataset consists of sentences annotated with specific aspects, aspect polarity, aspect-agnostic polarity, and the intensity of aspects. To address the issue of dual-tagged aspect polarities, we propose a novel approach employing a Siamese Network. Our experimental findings highlight the inherent difficulties in accurately predicting dual-polarities and underscore the significance of contextualized sentiment analysis models. The CARBD-Ko dataset serves as a valuable resource for future research endeavors in aspect-level sentiment classification.
https://arxiv.org/abs/2402.15046
In the rapidly evolving landscape of social media, the introduction of new emojis in Unicode release versions presents a structured opportunity to explore digital language evolution. Analyzing a large dataset of sampled English tweets, we examine how newly released emojis gain traction and evolve in meaning. We find that community size of early adopters and emoji semantics are crucial in determining their popularity. Certain emojis experienced notable shifts in the meanings and sentiment associations during the diffusion process. Additionally, we propose a novel framework utilizing language models to extract words and pre-existing emojis with semantically similar contexts, which enhances interpretation of new emojis. The framework demonstrates its effectiveness in improving sentiment classification performance by substituting unknown new emojis with familiar ones. This study offers a new perspective in understanding how new language units are adopted, adapted, and integrated into the fabric of online communication.
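The substitution step of the proposed framework can be sketched with cosine similarity over embeddings. The vectors below are toy values (in practice they would come from a language model), and 🫠 stands in for a hypothetical newly released emoji:

```python
import numpy as np

# toy 2-D embeddings; real ones would come from a language model
emb = {
    "🙂": np.array([0.9, 0.1]),
    "😢": np.array([-0.8, 0.2]),
    "🫠": np.array([-0.7, 0.3]),  # the "unknown" new emoji
}

def nearest_known(new, known):
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(known, key=lambda k: cos(emb[new], emb[k]))

# replace the unfamiliar emoji with its closest familiar one before classifying
print(nearest_known("🫠", ["🙂", "😢"]))  # 😢
```

Substituting the unknown symbol with a semantically close, familiar one lets an off-the-shelf sentiment classifier handle text containing brand-new emojis, which is the performance gain reported above.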
https://arxiv.org/abs/2402.14187
Aspect-Based Sentiment Analysis (ABSA) is a fine-grained linguistics problem that entails the extraction of multifaceted aspects, opinions, and sentiments from the given text. Both standalone and compound ABSA tasks have been extensively used in the literature to examine the nuanced information present in online reviews and social media posts. Current ABSA methods often rely on static hyperparameters for attention-masking mechanisms, which can struggle with context adaptation and may overlook the unique relevance of words in varied situations. This leads to challenges in accurately analyzing complex sentences containing multiple aspects with differing sentiments. In this work, we present adaptive masking methods that remove irrelevant tokens based on context to assist in Aspect Term Extraction and Aspect Sentiment Classification subtasks of ABSA. We show with our experiments that the proposed methods outperform the baseline methods in terms of accuracy and F1 scores on four benchmark online review datasets. Further, we show that the proposed methods can be extended with multiple adaptations and demonstrate a qualitative analysis of the proposed approach using sample text for aspect term extraction.
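One way to realize a context-adaptive mask is to derive the cut-off from the score distribution of the current sentence rather than from a fixed hyperparameter. The rule below is an illustrative assumption, not the paper's exact mechanism:

```python
import numpy as np

def adaptive_mask(scores, k=1.0):
    # keep tokens scoring within k standard deviations below the mean;
    # the threshold adapts to each context instead of being fixed
    scores = np.asarray(scores, dtype=float)
    threshold = scores.mean() - k * scores.std()
    return scores >= threshold

attn = [0.30, 0.05, 0.28, 0.02, 0.35]  # toy attention over five tokens
print(adaptive_mask(attn))  # [ True False  True False  True]
```

Tokens masked out (the two low-attention ones here) are treated as irrelevant for the current aspect, which helps when a sentence contains several aspects with different sentiments.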
https://arxiv.org/abs/2402.13722
The ability to generate sentiment-controlled feedback in response to multimodal inputs, comprising both text and images, addresses a critical gap in human-computer interaction by enabling systems to provide empathetic, accurate, and engaging responses. This capability has profound applications in healthcare, marketing, and education. To this end, we construct a large-scale Controllable Multimodal Feedback Synthesis (CMFeed) dataset and propose a controllable feedback synthesis system. The proposed system includes an encoder, decoder, and controllability block for textual and visual inputs. It extracts textual and visual features using a transformer and Faster R-CNN networks and combines them to generate feedback. The CMFeed dataset encompasses images, text, reactions to the post, human comments with relevance scores, and reactions to the comments. The reactions to the post and comments are utilized to train the proposed model to produce feedback with a particular (positive or negative) sentiment. A sentiment classification accuracy of 77.23% has been achieved, 18.82% higher than the accuracy without using the controllability. Moreover, the system incorporates a similarity module for assessing feedback relevance through rank-based metrics. It implements an interpretability technique to analyze the contribution of textual and visual features during the generation of uncontrolled and controlled feedback.
https://arxiv.org/abs/2402.07640
Recent work has shown the defense of 01 loss sign activation neural networks against image classification adversarial attacks. A public challenge to attack the models on the CIFAR10 dataset remains undefeated. We ask the following question in this study: are 01 loss sign activation neural networks hard to deceive with a popular black box text adversarial attack program called TextFooler? We study this question on four popular text classification datasets: IMDB reviews, Yelp reviews, MR sentiment classification, and AG news classification. We find that our 01 loss sign activation network is much harder to attack with TextFooler than sigmoid activation cross entropy and binary neural networks. We also study a 01 loss sign activation convolutional neural network with a novel global pooling step specific to sign activation networks. With this new variation we see a significant gain in adversarial accuracy, rendering TextFooler practically useless against it. We make our code freely available at \url{this https URL} and \url{this https URL}. Our work here suggests that 01 loss sign activation networks could be further developed to create foolproof models against text adversarial attacks.
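The two ingredients named above can be sketched in a few lines: a forward pass whose hidden layer emits only signs, and the non-differentiable 01 loss. The shapes and random data are toy values, not the paper's architecture:

```python
import numpy as np

def sign_net_predict(x, W1, W2):
    # hidden layer with sign activation: outputs in {-1, 0, +1}, no gradients
    h = np.sign(x @ W1)
    return (h @ W2 >= 0).astype(int)  # binary class decision

def zero_one_loss(y_true, y_pred):
    # 01 loss: fraction of misclassified samples (what the training optimizes)
    return float(np.mean(y_true != y_pred))

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))                      # four 3-feature samples
W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=5)
y = np.array([0, 1, 1, 0])
print(zero_one_loss(y, sign_net_predict(x, W1, W2)))
```

Because the sign activation and 01 loss are piecewise constant, gradient-based attacks get no useful signal, which is the intuition behind the robustness to TextFooler reported above.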
https://arxiv.org/abs/2402.07347