Learning an Efficient Text Augmentation Strategy: A Case Study in Sentiment Analysis

Publication year: 1402 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 9-page PDF.

National scientific document ID:

JR_IJWR-6-2_006

Indexing date: 27 Farvardin 1403

Abstract:

Contemporary machine learning models, such as deep neural networks, require substantial labeled datasets for proper training. In areas such as natural language processing, however, a shortage of labeled data can lead to overfitting. Data augmentation, which transforms data points while preserving their class labels so that they contribute additional useful information, has become an effective strategy for addressing this challenge. This paper introduces a deep reinforcement learning-based text augmentation method for sentiment analysis that combines reinforcement learning with deep learning. The technique uses a Deep Q-Network (DQN) as the reinforcement learning method to search for an efficient augmentation strategy, employing four text augmentation transformations: random deletion, synonym replacement, random swapping, and random insertion. Various deep learning networks, including CNN, Bi-LSTM, Transformer, BERT, and XLNet, were evaluated for the training phase. Experimental findings show that the proposed technique achieves an accuracy of 65.1% with only 20% of the dataset and 69.3% with 40% of the dataset. Furthermore, with just 10% of the dataset, the method yields an F1-score of 62.1%, rising to 69.1% with 40% of the dataset, outperforming previous approaches. Evaluation on the SemEval dataset demonstrates that reinforcement learning can efficiently augment text datasets for improved sentiment analysis results.
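
The four transformations named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the toy `SYNONYMS` table, the function names, the default parameters, and the mapping from a discrete action index to a transformation (`ACTIONS`, `apply_action`) are all assumptions made for illustration. It only shows how an agent with four discrete actions, such as a DQN, could select one transformation per step.

```python
import random

# Hypothetical toy synonym table; a real system would likely use a thesaurus.
SYNONYMS = {
    "good": ["great", "fine"],
    "bad": ["poor", "awful"],
    "movie": ["film"],
}

def random_deletion(words, rng, p=0.2):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

def synonym_replacement(words, rng, n=1):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def random_swap(words, rng, n=1):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_insertion(words, rng, n=1):
    """Insert a synonym of a random known word at a random position, n times."""
    out = list(words)
    for _ in range(n):
        known = [w for w in out if w in SYNONYMS]
        if not known:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out

# Assumed encoding of the discrete action space: one index per transformation.
ACTIONS = [random_deletion, synonym_replacement, random_swap, random_insertion]

def apply_action(sentence, action, seed=0):
    """Apply the transformation selected by an action index to a sentence."""
    rng = random.Random(seed)
    return " ".join(ACTIONS[action](sentence.split(), rng))
```

For example, `apply_action("a good movie", 2)` returns the same three words with two positions exchanged, while action index 3 returns a four-word sentence containing one inserted synonym. In a learned strategy, the action index would come from the agent's policy rather than being fixed by hand.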

Authors

Mehdy Roayaei

Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
