CLC number: TP391.4
On-line Access: 2025-04-03
Received: 2024-05-17
Revision Accepted: 2024-09-18
Crosschecked: 2025-04-07
Yuxi HAN, Dequan LI, Yang YANG. Significance extraction based on data augmentation for reinforcement learning[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2400406
School of Artificial Intelligence, Anhui University of Science and Technology, Huainan 232000, China
Abstract: Deep reinforcement learning has demonstrated remarkable capabilities in visual tasks, but its generalization is weak when the input images are corrupted by distracting signals, which makes it difficult to transfer a well-trained agent to new environments. To enable the agent to distinguish noise signals from important pixels in an image, data augmentation techniques and the construction of auxiliary networks are effective solutions. We propose a new algorithm, saliency-extraction Q-value with data augmentation (SEQA), which encourages the agent to explore unknown states comprehensively and to focus its attention on important information. Specifically, SEQA masks out distracting features and extracts salient features, and it updates the mask decoder network with the critic loss, prompting the agent to attend to important features and make correct decisions. We evaluate the algorithm on the DeepMind Control generalization benchmark (DMControl-GB). Experimental results show that SEQA greatly improves training efficiency and stability, and that it outperforms state-of-the-art reinforcement learning methods in sample efficiency and generalization on most DMControl-GB tasks.
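The abstract describes the mechanism only at a high level. The sketch below shows one plausible way to wire a mask decoder into the critic update so that the critic loss also trains the mask, as described above. It is a minimal illustration, not the authors' implementation: all module architectures, sizes, and names (Encoder, MaskDecoder, Critic, seqa_critic_update, augment) are assumptions made for the example.

```python
# Hedged sketch of the SEQA idea: a mask decoder suppresses distracting pixels
# in an augmented observation, and is updated through the critic loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Small convolutional encoder for image observations (illustrative)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
        )
    def forward(self, x):
        return self.conv(x)

class MaskDecoder(nn.Module):
    """Predicts a per-pixel soft mask in [0, 1] that screens out distractions."""
    def __init__(self):
        super().__init__()
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 3, stride=2), nn.Sigmoid(),
        )
    def forward(self, z, size):
        m = self.deconv(z)
        return F.interpolate(m, size=size, mode="bilinear", align_corners=False)

class Critic(nn.Module):
    """Q-network over flattened encoder features and the action (illustrative)."""
    def __init__(self, feat_dim, action_dim):
        super().__init__()
        self.q = nn.Sequential(nn.Linear(feat_dim + action_dim, 256), nn.ReLU(),
                               nn.Linear(256, 1))
    def forward(self, feat, action):
        return self.q(torch.cat([feat, action], dim=-1))

def seqa_critic_update(obs, action, target_q, encoder, mask_decoder, critic,
                       optimizer, augment):
    """One critic step: augment, mask, then regress Q toward the target.
    Because the mask decoder sits inside the critic's computation graph,
    the critic loss also updates it, so masks that hide decision-relevant
    pixels increase the loss and are discouraged."""
    aug_obs = augment(obs)                           # any image augmentation
    z = encoder(aug_obs)
    mask = mask_decoder(z, size=aug_obs.shape[-2:])  # soft saliency mask
    masked_obs = aug_obs * mask                      # suppress distracting pixels
    feat = encoder(masked_obs).flatten(start_dim=1)
    q = critic(feat, action)
    loss = F.mse_loss(q, target_q)
    optimizer.zero_grad()
    loss.backward()                                  # gradients reach the mask decoder
    optimizer.step()
    return loss.item()
```

In this reading, the optimizer is built over the encoder, mask decoder, and critic jointly, and the mask decoder receives no separate reconstruction objective: it is shaped entirely by the Q-regression gradients flowing through the masked observation, which matches the abstract's statement that the critic loss updates the mask decoder network.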