
Formation Control Scheme with Reinforcement Learning for Multiple Surface Vehicles (SVs)

A combination of formation control and optimal control based on reinforcement learning for multiple SVs
Full report: link

1. Introduction

This article presents a comprehensive approach to integrating formation tracking control and optimal control for a fleet of multiple surface vehicles (SVs), accounting for both kinematic and dynamic models of each SV agent. The proposed control framework comprises two core components: a high-level displacement-based formation controller and a low-level reinforcement learning (RL)-based optimal control strategy for individual SV agents. The high-level formation control law, employing a modified gradient method, is introduced to guide the SVs in achieving desired formations. Meanwhile, the low-level control structure, featuring time-varying references, incorporates the RL algorithm by transforming the time-varying closed agent system into an equivalent autonomous system. The application of Lyapunov’s direct approach, along with the existence of the Bellman function, guarantees the stability and optimality of the proposed design. Through extensive numerical simulations, encompassing various comparisons and scenarios, this study demonstrates the efficacy of the novel formation control strategy for multiple SV agent systems, showcasing its potential for real-world applications.

2. The proposed control scheme

*Figure: the proposed control scheme.*

2.1. The high-level displacement-based controller

The high-level formation control law, employing a modified gradient method, translates the desired formation and trajectory into feasible individual reference trajectories:

$$\dot{\bar{p}}_j = h_j h_j^T f_j,$$

$$\dot{h}_j = (I - h_j h_j^T) f_j, \quad j \in S$$

The high-level displacement-based formation control protocol can be implemented for each SV:

$$\dot{\bar{x}}_j = \bar{v}_j \cos \bar{\psi}_j,$$

$$\dot{\bar{y}}_j = \bar{v}_j \sin \bar{\psi}_j,$$

$$\bar{\omega}_j = [-\sin \bar{\psi}_j, \cos \bar{\psi}_j] \left( -(\mathcal{L} \otimes I)(\bar{p} - \bar{p}^*) \right)_j,$$

$$\bar{v}_j = [\cos \bar{\psi}_j, \sin \bar{\psi}_j] \left( -(\mathcal{L} \otimes I)(\bar{p} - \bar{p}^*) \right)_j$$

where $(\cdot)_j$ denotes the block of the stacked vector associated with agent $j$.

After that, we can obtain the desired trajectory for the low-level controller by integrating these derivatives.
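For illustration, below is a minimal Python sketch of this displacement-based protocol for planar agents, assuming a known graph Laplacian `L`, desired formation displacements `p_star`, and simple forward-Euler integration. All names are illustrative, not taken from the report or the repository code:

```python
import numpy as np

def high_level_step(p, psi, p_star, L, dt):
    """One Euler step of the displacement-based formation protocol.

    p      : (n, 2) current reference positions of the n agents
    psi    : (n,)   current reference headings
    p_star : (n, 2) desired formation displacements
    L      : (n, n) graph Laplacian of the communication topology
    dt     : integration step
    """
    # Gradient-like formation term -(L ⊗ I)(p - p*), computed row-wise
    f = -L @ (p - p_star)                                  # (n, 2)

    n = p.shape[0]
    v = np.empty(n)
    w = np.empty(n)
    for j in range(n):
        heading = np.array([np.cos(psi[j]), np.sin(psi[j])])
        normal = np.array([-np.sin(psi[j]), np.cos(psi[j])])
        v[j] = heading @ f[j]   # forward-speed reference  v_bar_j
        w[j] = normal @ f[j]    # turn-rate reference      omega_bar_j

    # Integrate the unicycle kinematics to obtain the reference trajectory
    p = p + dt * v[:, None] * np.stack([np.cos(psi), np.sin(psi)], axis=1)
    psi = psi + dt * w
    return p, psi, v, w
```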

2.2. Low-level RL-based control design for each SV

We approximate the Bellman function and the optimal controller with a critic NN and an actor NN, respectively:

$$\widehat{V}_i(X_i) = \widehat{W}_{ci}^T \Psi_i(X_i)$$

$$\widehat{u}_i(X_i) = -\frac{1}{2} R^{-1} G_i^T(X_i) \left( \frac{\partial \Psi_i}{\partial X_i} \right)^T \widehat{W}_{ci}$$
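As a sketch of how these approximators might be evaluated, the snippet below assumes a quadratic polynomial basis for $\Psi_i$ and given critic weights `W_c`; the basis choice, the numerical gradient, and all names are illustrative assumptions rather than the repository's implementation:

```python
import numpy as np

def psi(X):
    """Quadratic polynomial basis Psi(X): all pairwise products X_a * X_b."""
    n = len(X)
    return np.array([X[a] * X[b] for a in range(n) for b in range(a, n)])

def grad_psi(X, eps=1e-6):
    """Numerical Jacobian dPsi/dX (an analytic gradient would be used in practice)."""
    n, m = len(X), len(psi(X))
    J = np.zeros((m, n))
    for k in range(n):
        dX = np.zeros(n)
        dX[k] = eps
        J[:, k] = (psi(X + dX) - psi(X - dX)) / (2 * eps)
    return J

def critic_value(X, W_c):
    """Approximate Bellman function: V_hat(X) = W_c^T Psi(X)."""
    return W_c @ psi(X)

def actor_control(X, W_c, G, R_inv):
    """Approximate optimal control: u_hat = -1/2 R^{-1} G^T(X) (dPsi/dX)^T W_c."""
    return -0.5 * R_inv @ G(X).T @ (grad_psi(X).T @ W_c)
```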

3. Multi-agent formation controller verification

3.1. Flower-shaped formation

The communication graph:

*Figure: communication graph.*

Tracking trajectories of four agents following a straight line:

*Figures: tracking trajectories and flower-shaped formation illustration.*

3.2. Square formation

The communication graph:

*Figure: communication graph.*

Tracking trajectories of four agents following a circular path:

*Figures: tracking trajectories and square formation illustration.*

3.3. Diamond formation with more agents

The communication graph:

*Figure: communication graph.*

Tracking trajectories of eight agents following a circular path:

*Figure: diamond formation illustration.*

3.4. Advantages of the RL-based method compared to a non-RL policy

The metric is formulated as follows:

$$J_\Sigma = \int_0^T \left( \eta_i^T Q \eta_i + \tau_i^T R \tau_i \right) dt$$
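For reference, this cumulative cost can be evaluated numerically from logged trajectories, e.g. with the trapezoidal rule. This is a sketch with illustrative array names, not code from the repository:

```python
import numpy as np

def cumulative_cost(t, eta, tau, Q, R):
    """J = integral over [0, T] of (eta^T Q eta + tau^T R tau) dt.

    t   : (N,)    time stamps
    eta : (N, n)  tracking-error trajectory
    tau : (N, p)  control-input trajectory
    """
    integrand = np.einsum('ti,ij,tj->t', eta, Q, eta) \
              + np.einsum('ti,ij,tj->t', tau, R, tau)
    return np.trapz(integrand, t)
```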

The cumulative cost with RL is consistently smaller than that without RL:

*Figure: cumulative-cost comparison.*

4. Conclusion

Directions for future development:

  • The authors plan to conduct experimental validation and extend the low-level tracking controller with model-free RL algorithms that do not necessarily require complete system dynamics.
  • Direct implementation of RL algorithms to solve multi-agent control problems in nonlinear systems with uncertainty and disturbance is considered a feasible approach for further research.
