CLC number: TP311
On-line Access: 2025-07-02
Received: 2024-03-31
Revision Accepted: 2025-07-02
Crosschecked: 2024-09-22
Citation formats: GB/T 7714, BibTeX, EndNote, RIS
Shufeng XIONG, Guipei ZHANG, Xiaobo FAN, Wenjie TIAN, Lei XI, Hebing LIU, Haiping SI. MAL: multilevel active learning with BERT for Chinese textual affective structure analysis[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(6): 833-846.
@article{Xiong2025MAL,
title="MAL: multilevel active learning with BERT for Chinese textual affective structure analysis",
author="Shufeng XIONG, Guipei ZHANG, Xiaobo FAN, Wenjie TIAN, Lei XI, Hebing LIU, Haiping SI",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="26",
number="6",
pages="833-846",
year="2025",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2400242"
}
%0 Journal Article
%T MAL: multilevel active learning with BERT for Chinese textual affective structure analysis
%A Shufeng XIONG
%A Guipei ZHANG
%A Xiaobo FAN
%A Wenjie TIAN
%A Lei XI
%A Hebing LIU
%A Haiping SI
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 6
%P 833-846
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2400242
TY - JOUR
T1 - MAL: multilevel active learning with BERT for Chinese textual affective structure analysis
A1 - Shufeng XIONG
A1 - Guipei ZHANG
A1 - Xiaobo FAN
A1 - Wenjie TIAN
A1 - Lei XI
A1 - Hebing LIU
A1 - Haiping SI
JO - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 6
SP - 833
EP - 846
SN - 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2400242
ER -
Abstract: Chinese textual affective structure analysis (CTASA) is a sequence labeling task that typically relies on supervised deep learning methods. However, acquiring a large annotated dataset for training is costly and time-consuming. Active learning offers a remedy by selecting only the most valuable samples for annotation, thereby reducing labeling costs. Previous approaches have focused on either uncertainty or diversity, but suffer from drawbacks such as biased models or the selection of uninformative samples. To address these issues, multilevel active learning (MAL) is introduced, which leverages deep textual information at both the sentence and word levels, taking into account the complex structure of the Chinese language. By integrating sentence-level features extracted from Bidirectional Encoder Representations from Transformers (BERT) embeddings with word-level probability distributions obtained from a conditional random field (CRF) model, MAL comprehensively captures the Chinese textual affective structure (CTAS). Experimental results demonstrate that MAL reduces annotation costs by approximately 70% and achieves more consistent performance than baseline methods.
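To make the acquisition step described in the abstract concrete, the minimal Python sketch below shows one plausible way to combine word-level uncertainty with sentence-level diversity. It is a reconstruction under stated assumptions, not the authors' implementation: bert_embed and crf_marginals are hypothetical stand-ins (random placeholders here) for a fine-tuned BERT encoder and CRF marginal inference, and the length-normalized token-entropy score with greedy farthest-point selection are standard choices in the active-learning literature rather than the exact MAL formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def bert_embed(sentences):
    # Hypothetical stand-in for a fine-tuned BERT encoder that would
    # return one sentence-level ([CLS]) vector per input sentence.
    return rng.normal(size=(len(sentences), 768))

def crf_marginals(sentence, num_labels=5):
    # Hypothetical stand-in for CRF forward-backward inference, which
    # would yield a marginal label distribution for every token.
    p = rng.random((len(sentence), num_labels))
    return p / p.sum(axis=1, keepdims=True)

def uncertainty(sentence):
    # Word-level score: mean per-token entropy of the CRF marginals,
    # length-normalized so longer sentences are not favored by default.
    m = crf_marginals(sentence)
    ent = -(m * np.log(m + 1e-12)).sum(axis=1)
    return float(ent.mean())

def select_batch(pool, batch_size=8, candidate_factor=4):
    # Stage 1: shortlist the most uncertain sentences (word level).
    scores = np.array([uncertainty(s) for s in pool])
    shortlist = np.argsort(-scores)[: batch_size * candidate_factor]
    # Stage 2: greedy farthest-point selection in BERT embedding space
    # (sentence level) so the batch sent for labeling stays diverse.
    emb = bert_embed([pool[i] for i in shortlist])
    chosen = [0]  # seed with the single most uncertain candidate
    while len(chosen) < min(batch_size, len(shortlist)):
        dists = np.linalg.norm(
            emb[:, None, :] - emb[None, chosen, :], axis=-1
        ).min(axis=1)
        dists[chosen] = -np.inf  # never re-pick an already chosen point
        chosen.append(int(np.argmax(dists)))
    return [int(shortlist[i]) for i in chosen]

if __name__ == "__main__":
    # Toy pool of Chinese character sequences of varying length.
    pool = [list("这是一个测试句子")[: 4 + i % 5] for i in range(40)]
    print(select_batch(pool, batch_size=4))
```

The two-stage design mirrors the abstract's motivation: uncertainty alone can repeatedly select redundant or outlier samples, so a diversity pass over sentence embeddings hedges against both failure modes. In practice the placeholders would be replaced with real encoder and CRF calls, and the scoring could be tuned to the task.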