Thesis Information

Chinese Title:

基于强化学习的干扰环境下无人机集群通信研究 (Research on UAV Swarm Communication in Jamming Environments Based on Reinforcement Learning)

Name:

冯博豪 (Feng Bohao)

Student ID:

1049722002491

Confidentiality Level:

Public

Thesis Language:

Chinese (chi)

Discipline Code:

081202

Discipline Name:

Engineering - Computer Science and Technology (degrees conferrable in engineering or science) - Computer Software and Theory

Student Type:

Master's

University:

武汉理工大学 (Wuhan University of Technology)

Department:

School of Computer Science and Artificial Intelligence

Major:

Computer Science and Technology

Research Direction:

Reinforcement Learning

First Supervisor:

章阳 (Zhang Yang)

First Supervisor's Department:

School of Computer Science and Artificial Intelligence

Completion Date:

2023-03-29

Defense Date:

2023-05-11

Chinese Keywords:

UAV swarm; unmanned systems; reinforcement learning; multi-agent systems

Chinese Abstract:

In recent years, UAVs have developed rapidly thanks to their low cost and flexible control. In the military domain in particular, UAVs have become an important component of national defense for many countries, with cooperative UAV swarm operations a key focus. Because the battlefield is complex and volatile, its environment contains heavy interference and communication between UAVs is easily disrupted; yet, constrained by UAV size, adding anti-jamming equipment both consumes resources needed for the UAV's other functions and raises its cost. In such communication-constrained environments, how to use the swarm's own computing power and perception to resist the effects of communication interference and achieve cooperative swarm operations has therefore become an urgent problem. From the perspective of multi-agent reinforcement learning, this thesis investigates how a combat UAV swarm system can communicate effectively for cooperation in communication-constrained battlefield environments.

First, to address how a combat UAV swarm system can accurately receive cooperative communication messages under complex battlefield jamming, this thesis proposes a fault-tolerant method for UAV cooperative communication based on imagination and attention mechanisms. An information-filtering attention mechanism is added to a message-sharing network designed on the Actor-Critic framework, strengthening each UAV's ability to autonomously filter out interference and extract important information from a noisy environment; an imagination mechanism is also added to the message-sharing network, improving the swarm control model's ability to exploit information when communication data are distorted. Experiments show that the proposed fault-tolerant method can accurately receive cooperative communication messages in jamming environments.

Second, to address how the swarm can express information concisely and effectively when communication resources are limited in a complex jamming environment, and to reduce the transmission of useless information and thus relieve communication pressure, this thesis proposes an adaptive message-regulation method based on an efficient-expression network. Building on the fault-tolerant method above and grounded in information-theoretic optimization, a communication-data regularizer is added, optimizing the communication resources shared among the swarm and reducing the volume of transmitted data. Experiments show that this adaptive regulation method conveys cooperative messages efficiently in jamming environments with limited communication resources.

Finally, for a UAV that loses contact with the swarm due to enemy communication-silencing interference, this thesis proposes an intelligent following method under communication loss, based on semantic communication and an imagination mechanism. A semantic communication model and a peer-behavior learning model are added to the UAV's original model, so that under communication interference the UAV can resist its effects without extra anti-jamming equipment and keep following the swarm without interruption, raising the probability of completing its mission in communication-constrained environments. Experiments show that this method effectively handles UAV communication loss in jamming environments.
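The attention-based message filtering in the first contribution can be illustrated with a minimal toy sketch. This is an assumed NumPy illustration of scaled dot-product attention over received peer messages, not the thesis's actual Actor-Critic message-sharing network: each UAV scores incoming messages against a query derived from its own state and fuses them by attention weight, so a heavily jammed message that correlates poorly with the agent's context can receive a small weight.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_filter(query, messages):
    """Fuse peer messages by relevance to the agent's own state.

    query:    (d,) vector derived from the UAV's own observation
    messages: (n, d) matrix, one row per received peer message
    Returns a single fused message vector of shape (d,).
    """
    d = query.shape[0]
    scores = messages @ query / np.sqrt(d)   # scaled dot-product scores
    weights = softmax(scores)                # attention weights over peers
    return weights @ messages                # weighted fusion of messages

rng = np.random.default_rng(0)
own_state = rng.normal(size=8)               # assumed 8-dim state embedding
peer_msgs = rng.normal(size=(4, 8))          # messages from 4 peer UAVs
peer_msgs[2] += rng.normal(scale=10.0, size=8)  # one message hit by strong jamming noise
fused = attention_filter(own_state, peer_msgs)
```

In the thesis's full model these queries and messages would be learned embeddings trained end-to-end inside the Actor-Critic framework; the sketch only shows the fusion step.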

References:

[1] Qin B, Zhang D, Tang S, et al. Distributed Grouping Cooperative Dynamic Task Assignment Method of UAV Swarm[J]. Applied Sciences, 2022, 12(6): 2865.

[2] Harikumar K, Senthilnath J, Sundaram S. Multi-UAV Oxyrrhis Marina-inspired Search and Dynamic Formation Control for Forest Firefighting[J]. IEEE Transactions on Automation Science and Engineering, 2018, 16(2): 863-873.

[3] Zhou W, Li J, Zhang Q. Joint Communication and Action Learning in Multi-target Tracking of UAV Swarms with Deep Reinforcement Learning[J]. Drones, 2022, 6(11): 339.

[4] Xu S, Li L, Zhou Z, et al. A Task Allocation Strategy of the UAV Swarm based on Multi-discrete Wolf Pack Algorithm[J]. Applied Sciences, 2022, 12(3): 1331.

[5] Zhang A, Yang M, Bi W, et al. Heterogeneous Multi-UAV Task Assignment in Uncertain Environments Based on Multi-strategy GWO Algorithm[J/OL]. Acta Aeronautica et Astronautica Sinica: 1-16.

[6] Zhao J, Sun J M, Cai Z H, et al. Distributed Coordinated Control Scheme of UAV Swarm based on Heterogeneous Roles[J]. Chinese Journal of Aeronautics, 2022, 35(1): 81-97.

[7] Rothmann M, Porrmann M. A Survey of Domain-specific Architectures for Reinforcement Learning[J]. IEEE Access, 2022, 10: 13753-13767.

[8] Zhou W, Li J, Liu Z, et al. Improving Multi-target Cooperative Tracking Guidance for UAV Swarms using Multi-agent Reinforcement Learning[J]. Chinese Journal of Aeronautics, 2022, 35(7): 100-112.

[9] Ge H, Ge Z, Sun L, et al. Enhancing Cooperation by Cognition Differences and Consistent Representation in Multi-agent Reinforcement Learning[J]. Applied Intelligence, 2022: 1-16.

[10] Liang W, Wang J, Bao W, et al. Qauxi: Cooperative Multi-agent Reinforcement Learning with Knowledge Transferred from Auxiliary Task[J]. Neurocomputing, 2022, 504: 163-173.

[11] Bai Y, Gong C, Zhang B, et al. Cooperative Multi-Agent Reinforcement Learning with Hypergraph Convolution[C]. 2022 International Joint Conference on Neural Networks(IJCNN), IEEE, 2022: 1-8.

[12] Yang G, Chen H, Zhang J, et al. Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning[C]. 2022 International Joint Conference on Neural Networks(IJCNN), IEEE, 2022: 1-8.

[13] Tong W, Li G. Nine Challenges in Artificial Intelligence and Wireless Communications for 6G[J]. IEEE Wireless Communications, 2022: 1-10.

[14] Shannon C E, Weaver W. The Mathematical Theory of Communication[M]. The University of Illinois Press, 1949.

[15] Liu C, Guo C, Yang Y, et al. Semantic Communications for Intelligent Tasks: Theory, Techniques and Challenges[J]. Journal on Communications, 2022, 43(06): 41-57.

[16] Luo X, Chen H, Guo Q. Semantic Communications: Overview, Open Issues, and Future Research Directions[J]. IEEE Wireless Communications, 2022, 29(1): 210-219.

[17] Qin Z, Ye H, Li G, et al. Deep Learning in Physical Layer Communications[J]. IEEE Wireless Communications, 2019, 26(2): 93-99.

[18] Chattopadhyay A, Haeffele B D, Geman D, et al. Quantifying Task Complexity through Generalized Information Measures[J]. International Conference on Learning Representations, 2020.

[19] Farsad N, Rao M, Goldsmith A. Deep Learning for Joint Source-channel Coding of Text[C]. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing(ICASSP), IEEE, 2018: 2326-2330.

[20] Bourtsoulatze E, Kurka D B, Gündüz D. Deep Joint Source-channel Coding for Wireless Image Transmission[J]. IEEE Transactions on Cognitive Communications and Networking, 2019, 5(3): 567-579.

[21] Xie H, Qin Z, Li G, et al. Deep Learning Enabled Semantic Communication Systems[J]. IEEE Transactions on Signal Processing, 2021, 69: 2663-2675.

[22] Weng Z, Qin Z. Semantic Communication Systems for Speech Transmission[J]. IEEE Journal on Selected Areas in Communications, 2021, 39(8): 2434-2444.

[23] Xie H, Qin Z. A Lite Distributed Semantic Communication System for Internet of Things[J]. IEEE Journal on Selected Areas in Communications, 2022, 39(1): 142-153.

[24] Xie H, Qin Z, Li G Y. Task-oriented Multi-user Semantic Communications for VQA[J]. IEEE Wireless Communications Letters, 2022, 11(3): 553-557.

[25] Kalfa M, Gok M, Atalik A, et al. Towards Goal-oriented Semantic Signal Processing: Applications and Future Challenges[J]. Digital Signal Processing, 2021, 119: 103134.

[26] Strinati E C, Barbarossa S. 6G Networks: Beyond Shannon towards Semantic and Goal-oriented Communications[J]. Computer Networks, 2021, 190: 107930.

[27] Lan Q, Wen D, Zhang Z, et al. What is Semantic Communication? A View on Conveying Meaning in the Era of Machine Intelligence[J]. Journal of Communications and Information Networks, 2021, 6(4): 336-371.

[28] Shi G, Xiao Y, Li Y, et al. From Semantic Communication to Semantic-aware Networking: Model, Architecture, and Open Problems[J]. IEEE Communications Magazine, 2021, 59(8): 44-50.

[29] Zhang P, Xu W, Gao H, et al. Toward Wisdom-evolutionary and Primitive-concise 6G: A New Paradigm of Semantic Communication Networks[J]. Engineering, 2022, 8: 60-73.

[30] Kountouris M, Pappas N. Semantics-empowered Communication for Networked Intelligent Systems[J]. IEEE Communications Magazine, 2021, 59(6): 96-102.

[31] Uysal E, Kaya O, Ephremides A, et al. Semantic Communications in Networked Systems: A Data Significance Perspective[J]. IEEE Network, 2022, 36(4): 233-240.

[32] Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. MIT Press, 2018.

[33] Gronauer S, Diepold K. Multi-agent Deep Reinforcement Learning: A Survey[J]. Artificial Intelligence Review, 2022: 1-49.

[34] Matsuo Y, LeCun Y, Sahani M, et al. Deep Learning, Reinforcement Learning, and World Models[J]. Neural Networks, 2022.

[35] Puterman M L. Markov Decision Processes[J]. Handbooks in Operations Research and Management Science, 1990, 2: 331-434.

[36] Mao W, Yang L, Zhang K, et al. On Improving Model-free Algorithms for Decentralized Multi-agent Reinforcement Learning[C]. International Conference on Machine Learning, PMLR, 2022: 15007-15049.

[37] Jeon J, Kim W, Jung W, et al. Maser: Multi-agent Reinforcement Learning with Sub-goals Generated from Experience Replay Buffer[C]. International Conference on Machine Learning. PMLR, 2022: 10041-10052.

[38] Alibabaei K, Gaspar P D, Assunção E, et al. Comparison of On-policy Deep Reinforcement Learning A2C with Off-policy DQN in Irrigation Optimization: A Case Study at A Site in Portugal[J]. Computers, 2022, 11(7): 104.

[39] Cheng C A, Xie T, Jiang N, et al. Adversarially Trained Actor Critic for Offline Reinforcement Learning[C]. International Conference on Machine Learning, PMLR, 2022: 3852-3878.

[40] Lapso J, Peterson G L. Factored Beliefs for Machine Agents in Decentralized Partially Observable Markov Decision Processes[J]. The International FLAIRS Conference Proceedings, 2022, 35.

[41] Chen F H, Huang S C, Lu Y C, et al. Reducing NEXP-complete Problems to DQBF[C]. Conference on Formal Methods in Computer-aided Design-FMCAD, 2022: 199.

[42] Sadhukhan P, Selmic R R. Proximal Policy Optimization for Formation Navigation and Obstacle Avoidance[J]. International Journal of Intelligent Robotics and Applications, 2022, 6(4): 746-759.

[43] Guo M, Xu T, Liu J, et al. Attention Mechanisms in Computer Vision: A survey[J]. Computational Visual Media, 2022, 8(3): 331-368.

[44] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need [C]. NIPS 2017: Proceedings of the 30th Conference and Workshop on Neural Information Processing Systems, Long Beach: MIT Press, 2017: 5998-6008.

[45] Dey R, Salem F M. Gate-variants of Gated Recurrent Unit(GRU)Neural Networks[C]. IEEE 60th International Midwest Symposium on Circuits and Systems(MWSCAS), IEEE, 2017: 1597-1600.

[46] Karabag M O, Neary C, Topcu U. Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss[J]. arXiv preprint arXiv:2201.06619, 2022.

[47] Bansal M, Goyal A, Choudhary A. A Comparative Analysis of K-Nearest Neighbour, Genetic, Support Vector Machine, Decision Tree, and Long Short Term Memory Algorithms in Machine Learning[J]. Decision Analytics Journal, 2022: 100071.

[48] Su J, Adams S, Beling P. Value-decomposition Multi-agent Actor-Critics[C]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(13): 11352-11360.

[49] Liang Y, Wu H, Wang H. ASM-PPO: Asynchronous and Scalable Multi-Agent PPO for Cooperative Charging[C]. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022: 798-806.

[50] Chen S, Yang Y, Su R. Deep Reinforcement Learning with Emergent Communication for Coalitional Negotiation Games[J]. Mathematical Biosciences and Engineering, 2022, 19: 4592-4609.

[51] Shannon C E. A Mathematical Theory of Communication[J]. The Bell System Technical Journal, 1948, 27(3): 379-423.

[52] Wang K, Wang J, Zeng B, et al. An Integrated Power Load Point-interval Forecasting System based on Information Entropy and Multi-objective Optimization[J]. Applied Energy, 2022, 314: 118938.

[53] Zhai Y, Yang B, Xi Z. Belavkin–Staszewski Relative Entropy, Conditional Entropy, and Mutual Information[J]. Entropy, 2022, 24(6): 837.

[54] Li Z, Zhao Y, Hu X, et al. Ecod: Unsupervised Outlier Detection using Empirical Cumulative Distribution Functions[J]. IEEE Transactions on Knowledge and Data Engineering, 2022.

[55] Knoblauch J, Jewson J, Damoulas T. An Optimization-centric View on Bayes’ Rule: Reviewing and Generalizing Variational Inference[J]. Journal of Machine Learning Research, 2022, 23(132): 1-109.

[56] Chobtham K, Constantinou A C. Discovery and Density Estimation of Latent Confounders in Bayesian Networks with Evidence Lower Bound[C]. International Conference on Probabilistic Graphical Models, PMLR, 2022: 121-132.

[57] Zhang J, Li C, Yin Y, et al. Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to Popular Convolutional Neural Network and Potential Visual Transformer[J]. Artificial Intelligence Review, 2022: 1-58.

[58] Hu J, Zhong Y, Shang X. A Versatile and Scalable Single-cell Data Integration Algorithm based on Domain-adversarial and Variational Approximation[J]. Briefings in Bioinformatics, 2022, 23(1): bbab400.

[59] Jones G L, Qin Q. Markov Chain Monte Carlo in Practice[J]. Annual Review of Statistics and Its Application, 2022, 9: 557-578.

[60] Yu Y, Guo J, Ahn C K, et al. Neural Adaptive Distributed Formation Control of Nonlinear Multi-UAVs with Unmodeled Dynamics[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022.

[61] Zhi Y, Liu L, Guan B, et al. Distributed Robust Adaptive Formation Control of Fixed-wing UAVs with Unknown Uncertainties and Disturbances[J]. Aerospace Science and Technology, 2022, 126: 107600.

[62] Dou L, Cai S, Zhang X, et al. Event-triggered-based Adaptive Dynamic Programming for Distributed Formation Control of Multi-UAV[J]. Journal of the Franklin Institute, 2022, 359(8): 3671-3691.

[63] Yan J, Yu Y, Wang X. Distance-based Formation Control for Fixed-wing UAVs with Input Constraints: A Low Gain Method[J]. Drones, 2022, 6(7): 159.

[64] Suo W, Wang M, Zhang D, et al. Formation Control Technology of Fixed-wing UAV Swarm based on Distributed Ad Hoc Network[J]. Applied Sciences, 2022, 12(2): 535.

[65] Liu B, Li A, Guo Y, et al. Adaptive Distributed Finite-time Formation Control for Multi-UAVs under Input Saturation without Collisions[J]. Aerospace Science and Technology, 2022, 120: 107252.

[66] Souza F C, Dos S R B, Oliveira A M, et al. Influence of Network Topology on UAVs Formation Control based on Distributed Consensus[C]. 2022 IEEE International Systems Conference, IEEE, 2022: 1-8.

[67] Uysal E, Kaya O, Ephremides A, et al. Semantic Communications in Networked Systems: A Data Significance Perspective[J]. IEEE Network, 2022, 36(4): 233-240.

[68] Li Q, Xie F, Zhao J, et al. FPS: Fast Path Planner Algorithm Based on Sparse Visibility Graph and Bidirectional Breadth-First Search[J]. Remote Sensing, 2022, 14(15): 3720.

[69] Gu Y, Cheng Y, Yu K, et al. Anti-martingale Proximal Policy Optimization[J]. IEEE Transactions on Cybernetics, 2022.

[70] Zhang J, Zhang Z, Han S, et al. Proximal Policy Optimization via Enhanced Exploration Efficiency[J]. Information Sciences, 2022, 609: 750-765.

[71] Sharma K, Singh B, Herman E, et al. Maximum In-formation Measure Policies in Reinforcement Learning with Deep Energy-Based Model[C]. 2021 International Conference on Computational Intelligence and Knowledge Economy, IEEE, 2021: 19-24.

[72] Ye D, Chen G, Zhang W, et al. Towards Playing Full MOBA Games with Deep Reinforcement Learning[J]. Advances in Neural Information Processing Systems, 2020, 33: 621-632.

[73] Fu Q, Qiu T, Pu Z, et al. A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning[C]. 2022 International Joint Conference on Neural Networks, IEEE, 2022: 1-8.

CLC Number:

V279

Barcode Number:

002000073591

Holding Number:

YD10001713

Holding Location:

203

Remarks:

403 - West Campus Branch, Doctoral & Master's Thesis Collection; 203 - Yujiatou Branch, Doctoral & Master's Thesis Collection
