EMOTION DISCOVERY IN HINDI-ENGLISH CODE-MIXED CONVERSATIONS
This thesis investigates emotion recognition in Hindi-English code-mixed dialogues, focusing on romanized text, which is essential for understanding multilingual communication. Using a dataset drawn from bilingual television shows, the study applies machine learning and natural language processing techniques, with Support Vector Machine, Logistic Regression, and XLM-RoBERTa models adapted to handle code-switching and transliteration in romanized Hindi-English. To address class imbalance, SMOTE (Synthetic Minority Over-sampling Technique) is used to improve model training and generalization. The research also explores ensemble learning with methods such as VotingClassifier to improve emotion classification accuracy. Logistic Regression stands out for its high accuracy and robustness under cross-validation. The findings underscore the potential of machine learning models for this task and motivate further work on deep learning and multimodal data to improve emotion detection in diverse linguistic settings.
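As an illustration of the oversampling step mentioned above, the core idea of SMOTE is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The following is only a minimal sketch in pure Python, assuming utterances have already been converted to numeric feature vectors; the function name `smote_oversample`, its parameters, and the toy data are hypothetical and are not taken from the thesis, which may well have used a library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the thesis's implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Place the new point at a random position on the segment base -> other.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 2-D feature vectors for a rare emotion class.
rare = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(rare, n_new=6, k=2)
```

Because every synthetic point lies on a line segment between two real minority samples, the new data stays within the region the minority class already occupies. In practice, oversampling of this kind is normally applied only to the training folds, so that evaluation data contains no synthetic samples.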
Degree Type
- Master of Science
Department
- Computer Science
Campus location
- Fort Wayne