Finite-time tracking control for serial manipulators using reinforcement learning-based active disturbance rejection

Haiyan Wang, Sotirios Spanogianopoulos, Baojiang Li, Xichao Wang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Serial manipulators play a critical role in various applications, where accurate trajectory tracking is essential. This work presents a finite-time tracking control approach based on reinforcement learning (RL) and active disturbance rejection control (ADRC) for serial manipulators with unknown, bounded uncertainties. The control structure is built upon the ADRC framework, in which system uncertainties and external disturbances are collectively treated as a total disturbance and estimated by an extended state observer (ESO). To enhance disturbance estimation, an actor-critic RL agent is incorporated into the ESO, where the actor neural network models the total disturbance, and the critic neural network evaluates the trajectory tracking cost. Through the interaction between the actor and critic networks, a precise disturbance estimate is achieved. Additionally, to ensure fast and accurate trajectory tracking, a finite-time controller based on a non-singular fast terminal sliding mode is introduced, replacing the state error feedback controller in ADRC. The control stability is analyzed using Lyapunov theory. Simulation results show that the proposed control method offers superior tracking performance, along with enhanced uncertainty suppression and robustness.
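The ADRC idea summarized above, lumping model uncertainty and external disturbance into one "total disturbance" that an extended state observer estimates and the controller cancels, can be illustrated with a minimal single-joint sketch. This is not the authors' method: it assumes a plain linear ESO and a PD-type feedback law in place of the RL actor-critic ESO and the non-singular fast terminal sliding mode controller, and a double-integrator joint model. The function name, gains, reference trajectory, and disturbance signal are all hypothetical choices made for illustration.

```python
import math


def simulate_ladrc(t_end=5.0, dt=1e-3, omega_o=100.0, omega_c=10.0, b0=1.0):
    """Track q_ref(t) = sin(t) on the assumed plant q'' = b0*u + f(t, q').

    Illustrative only: linear ESO + PD feedback, not the RL-based ESO or
    the terminal sliding mode controller described in the abstract.
    """
    # Bandwidth-parameterized observer gains (all observer poles at -omega_o).
    beta1, beta2, beta3 = 3.0 * omega_o, 3.0 * omega_o**2, omega_o**3
    # PD gains placing both closed-loop poles at -omega_c.
    kp, kd = omega_c**2, 2.0 * omega_c

    q = qd = 0.0          # true joint position and velocity
    z1 = z2 = z3 = 0.0    # ESO estimates: position, velocity, total disturbance
    log = []

    for k in range(int(round(t_end / dt))):
        t = k * dt
        q_ref, qd_ref, qdd_ref = math.sin(t), math.cos(t), -math.sin(t)

        # Control law: feedforward + PD on estimated states, minus the
        # estimated total disturbance z3.
        u = (qdd_ref + kp * (q_ref - z1) + kd * (qd_ref - z2) - z3) / b0

        # Hypothetical "unknown" total disturbance (never shown to the controller).
        f = 2.0 * math.sin(2.0 * t) + 1.0 - 0.5 * qd

        log.append((t, q_ref - q, f - z3))

        # Forward-Euler update of the ESO, driven by the measured position q.
        e = q - z1
        z1, z2, z3 = (
            z1 + dt * (z2 + beta1 * e),
            z2 + dt * (z3 + b0 * u + beta2 * e),
            z3 + dt * (beta3 * e),
        )

        # Forward-Euler update of the plant.
        q, qd = q + dt * qd, qd + dt * (b0 * u + f)

    return log


if __name__ == "__main__":
    tail = [row for row in simulate_ladrc() if row[0] >= 4.0]
    print("max |tracking error| in last second:", max(abs(r[1]) for r in tail))
    print("max |disturbance estimation error| in last second:", max(abs(r[2]) for r in tail))
```

In this sketch the residual disturbance estimation error comes from the fixed observer bandwidth lagging a time-varying disturbance. That residual is the kind of error the abstract's actor network is meant to reduce, while the finite-time sliding mode law would replace the PD term used here.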

Original language: English
Pages (from-to): 2759-2768
Number of pages: 10
Journal: International Journal of Control, Automation and Systems
Volume: 23
Issue number: 9
Publication status: Published - 9 Sept 2025

Keywords

  • Active disturbance rejection control
  • finite-time control
  • reinforcement learning
  • uncertain serial manipulators
