EMOTION DETECTION AND CLASSIFICATION ON TIGRIGNA SOCIAL MEDIA TEXTS USING TRANSFORMER MODELS

dc.contributor.author: Rahel Gebru
dc.date.accessioned: 2025-09-26T07:15:37Z
dc.date.issued: 2025-09-23
dc.description.abstract: The rapid growth of social media has reshaped emotional expression, producing large-scale digital data for social, cultural, and political analysis, thereby highlighting the importance of reliable automated emotion detection tools. Despite advances in Natural Language Processing (NLP), Tigrigna remains underrepresented, with existing multilingual models often underperforming due to limited annotated data, a lack of tailored resources, and linguistic complexity. To address this gap, this study introduces transformer-based models tailored for emotion detection and classification in Tigrigna social media texts, focusing on four emotion categories: happiness, sadness, neutral, and disgust. A total of 4,000 Tigrigna sentences were collected from Facebook and YouTube and manually annotated with a high Inter-Annotator Agreement. To expand and balance the corpus, 6,000 additional sentences were generated using data augmentation techniques, including back-translation and synonym replacement, resulting in a final dataset of 10,000 sentences. Following preprocessing, including normalization, tokenization, and cleaning, the data were split into training (8,000), validation (1,000), and testing (1,000) subsets. Three transformer-based models, namely XLM-RoBERTa (XLM-R), tiBERT, and the Tigrigna-specific tiRoBERTa, were fine-tuned and evaluated using Macro-F1, precision, and recall metrics to address class imbalance. The results demonstrated progressive improvements across models: XLM-R achieved a Macro-F1 of 81%, tiBERT 84.4%, and tiRoBERTa 88%, with tiRoBERTa outperforming the others across all emotion categories, particularly in capturing subtle distinctions between sadness and happiness. Misclassifications between neutral and disgust persisted, reflecting data-related issues, model-specific challenges, and the low-resource nature of Tigrigna.
Data augmentation improved Macro-F1 scores by 2–10% across models, underscoring its crucial role in enhancing performance in low-resource NLP tasks. The study concludes that transformer models, when culturally and linguistically adapted, are highly effective for Tigrigna emotion detection. Future research should expand Tigrigna-specific pretraining corpora, explore advanced augmentation, investigate hybrid architectures, and integrate multimodal data (e.g., combining text with images or videos). Applying these findings via APIs and dashboards can support researchers, policymakers, and organizations in leveraging Tigrigna social media for informed decision-making.
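The abstract's evaluation protocol rests on macro-averaged F1, which averages per-class F1 so that minority emotion classes weigh as much as frequent ones under class imbalance. A minimal sketch of that metric follows; the four label indices mirror the study's emotion categories, but the toy predictions are illustrative placeholders, not the thesis data.

```python
LABELS = ["happiness", "sadness", "neutral", "disgust"]  # classes from the study

def macro_f1(y_true, y_pred, n_classes=4):
    """Average per-class F1 so each emotion class contributes equally."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / n_classes

# Toy gold labels and predictions for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 3, 3, 3]
print(round(macro_f1(y_true, y_pred), 4))  # prints 0.7333
```

Unlike accuracy or micro-averaged F1, this score does not reward a model for simply favoring the majority class, which is why the study reports it alongside precision and recall.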
dc.identifier.uri: https://repository.mu.edu.et/handle/123456789/931
dc.identifier.uri: https://doi.org/10.82589/muir-829
dc.language.iso: en
dc.publisher: Mekelle University
dc.subject: Data augmentation
dc.subject: Emotion detection
dc.subject: Low-resource NLP
dc.subject: Tigrigna
dc.subject: Social media
dc.subject: Transformer models
dc.subject: tiRoBERTa
dc.title: EMOTION DETECTION AND CLASSIFICATION ON TIGRIGNA SOCIAL MEDIA TEXTS USING TRANSFORMER MODELS
dc.type: Thesis

Files

Original bundle
Name: Rahel Gebru.pdf
Size: 5.33 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed to upon submission