Reducing the Effect of Imbalance in Text Classification Using SVD and GloVe with Ensemble and Deep Learning

Tajbia Hossain; Humaira Zahin Mauni; Raqeebir Rab

doi:10.31577/cai_2022_1_98

Authors

Tajbia Hossain, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Humaira Zahin Mauni, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Raqeebir Rab, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh

DOI:

https://doi.org/10.31577/cai_2022_1_98

Keywords:

Deep learning, ensemble learning, machine learning, text classification, imbalanced data, singular value decomposition, global vectors

Abstract

Due to the recent growth in the amount of text data available and used online, text classification has become a staple for data analysts extracting relevant information. Yet machine learning algorithms are susceptible to bias when applied to any large-scale automated task, especially in text analysis. With the popularization of newer branches of machine learning, such as ensemble and deep learning, we must analyze the potential pitfalls in the common experimental setup centered around learning algorithms. Imbalance in text data is one such pitfall: when data is not equally distributed across all categories in a dataset, it can influence and undermine the classification of underrepresented categories. In our research, we propose several techniques and unique approaches to tackle this obstacle. We prepared four datasets with varying degrees of imbalance for our experiments. Our experiments showed that the feature extraction techniques singular value decomposition (SVD) and GloVe are key to reducing the effect of imbalance in text classification, especially in ensemble and deep learning. Using the results of our research, we also propose a modified ensemble classifier that can classify imbalanced and balanced data alike.
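The pitfall described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical 95:5 label split and a degenerate classifier that always predicts the most frequent label; neither comes from the paper's datasets or models, and the helper names `accuracy` and `per_class_recall` are illustrative only. It shows how overall accuracy can look strong while the underrepresented category is never recognized.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced label set (assumption, not the paper's data).
y_true = ["majority"] * 95 + ["minority"] * 5

# Degenerate baseline: always predict the most frequent label.
most_common_label = Counter(y_true).most_common(1)[0][0]
y_pred = [most_common_label] * len(y_true)

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
# Macro-averaged recall weights every category equally.
macro_recall = sum(recall.values()) / len(recall)

print(f"accuracy={acc:.2f} recall={recall} macro_recall={macro_recall:.2f}")
```

Because the macro average weights each category equally, it exposes the failure on the minority category that plain accuracy hides. This is one reason per-class or macro-averaged metrics are commonly reported when evaluating classifiers on imbalanced data.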
