Autonomous decision-making for spacecraft close approaches in the Earth-Moon environment


CLC number: V448.234  Document code: A  doi: 10.37188/OPE.20253306.0979

Autonomous decision-making for spacecraft close approaches in the Earth-Moon environment

HUANG Cheng*, QIU Zhicong, XU Jiazhong

(College 150080, China) * Corresponding author, E-mail: huangchengsunxi@163.com

Abstract: To address the autonomous decision-making problem of spacecraft close approach in the Earth-Moon environment, a decision-making method based on an improved Proximal Policy Optimization (PPO) algorithm was proposed to enable the tracking spacecraft to reach the state required for docking with the target spacecraft within a specified time. First, an LSTM network was introduced into the policy network structure of the PPO algorithm to handle state inputs and increase the robustness of the algorithm in learning tasks with random parameters. Second, a state-based intrinsic reward exploration mechanism was proposed to improve the algorithm's exploration ability by linearly superimposing it on the algorithm's basic reward. In addition, an importance sampling ratio constraint was designed and introduced into the policy loss function to prevent high-variance objective estimates from endangering the optimization of the objective function. Finally, the effectiveness of the proposed method was verified by comparing its learning reward and task execution results with those of other learning algorithms. The simulation results show that the learning reward value of the improved PPO algorithm is increased by 15%, the fuel consumption of performing close-approach tasks is reduced by 57%, and the mission success rate is increased by 1% when there is unmodelled interference. This method can significantly improve the spacecraft's autonomous decision-making capabilities when performing close-approach missions.

Key words: spacecraft proximity approach; autonomous decision making; deep reinforcement learning; proximal policy optimization
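
The abstract names two modifications to PPO without giving their formulas here: an intrinsic reward added linearly to the basic reward, and a constraint on the importance sampling ratio in the policy loss. The following is a minimal sketch of what such terms could look like, not the authors' formulation. The count-based bonus, the weight `beta`, the hard cap `ratio_max`, and all function names are assumptions introduced only for illustration.

```python
import math


def count_based_bonus(visit_count):
    """Hypothetical state-based intrinsic reward: shrinks as a state is revisited."""
    return 1.0 / math.sqrt(visit_count)


def combined_reward(extrinsic, intrinsic, beta=0.1):
    """Linearly superimpose the intrinsic reward on the basic (task) reward.

    beta is an assumed weighting coefficient, not a value from the paper.
    """
    return extrinsic + beta * intrinsic


def constrained_ppo_loss(logp_new, logp_old, advantages,
                         clip_eps=0.2, ratio_max=3.0):
    """Clipped PPO surrogate loss with an extra cap on the importance ratio.

    Standard PPO clipping leaves the objective unbounded when the advantage
    is negative and the ratio is large. Capping the ratio at ratio_max (an
    assumed form of the constraint) bounds that high-variance case.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)      # importance sampling ratio
        ratio = min(ratio, ratio_max)          # assumed ratio constraint
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)            # negate: optimizers minimize


if __name__ == "__main__":
    r = combined_reward(extrinsic=1.0, intrinsic=count_based_bonus(4), beta=0.5)
    print("combined reward:", r)
    loss = constrained_ppo_loss([math.log(10.0)], [0.0], [-1.0])
    print("loss with capped ratio:", loss)
```

In this sketch the cap only changes the result when the advantage is negative, since the usual clipping already bounds the positive-advantage case at `1 + clip_eps`.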

1 Introduction

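With advances in space technology, the major spacefaring nations are actively developing next-generation intelligent spacecraft and testing new technologies such as rendezvous and proximity operations, docking, and pursuit-evasion.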
