Detection of Confirmation Bias in Horoscope Texts Using Support Vector Machine

Authors

  • Arun Padmanabhan, Research Scholar, Computer Science, Karpagam Academy of Higher Education, Coimbatore, Tamil Nadu, India.
  • Dr. K. Devasenapathy, Associate Professor, Computer Science, Karpagam Academy of Higher Education, Coimbatore, Tamil Nadu, India.

DOI:

https://doi.org/10.47392/IRJASH.2025.127

Keywords:

Case Law Analytics, Legal Confirmation Bias, Support Vector Machine, Text Classification, Natural Language Processing, Cognitive Bias Detection, TF-IDF Feature Extraction, Machine Learning, SMOTE Oversampling

Abstract

This study addresses the challenging task of identifying confirmation bias in horoscope texts using machine learning techniques. Confirmation bias, a cognitive bias in which information that confirms one's prior beliefs is preferred, is widespread in personalized media such as horoscopes and has implications for both mental health and digital content analysis. For this research, a carefully selected set of horoscope responses was compiled and annotated for the presence or absence of confirmation bias. Data pre-processing included methodical text cleaning (removal of unnecessary columns, normalization, and whitespace trimming) and class imbalance analysis, to support robust predictive models. Feature extraction used high-dimensional TF-IDF (Term Frequency-Inverse Document Frequency) vectorization to capture relevant linguistic patterns, including n-gram structures that are highly indicative of biased content. To address class imbalance, oversampling methods such as SMOTE were combined with class weighting in the Support Vector Machine (SVM) learning framework. The SVM model was tuned for kernel parameters and probabilistic output calibration, and stratified train-test splitting was used to ensure representative evaluation across bias classes. Naïve Bayes and Logistic Regression baselines were also set up for comparative analysis; SVM's margin-based classification delivered competitive performance, especially for minority-class bias detection. Particular emphasis was placed on evaluation with metrics appropriate for imbalanced classification, namely accuracy, precision, recall, F1-score, and ROC-AUC, together with a detailed examination of confusion matrices and threshold tuning curves.
The study also included an interpretability layer based on feature importance visualization, which indicates the textual elements that most influenced bias predictions.

Published

2025-12-26