Research on Mathematical Modeling and Quantitative Analysis of Robots Collective Behaviors
|Course||Pattern Recognition and Intelligent Systems|
|Keywords||mobile robots; collective behavior; mathematical modeling; quantitative analysis; reinforcement learning|
The collective behaviors of mobile robots emerge from the interaction between the robots and their environment. The evolution of these behaviors is a highly complex dynamic process, and the resulting motion is often chaotic. Most existing methods for modeling and designing robot behaviors are therefore insufficient to describe the mechanisms behind the complexity of collective behaviors. A scientific treatment of mobile robot collective behaviors requires mathematical modeling and quantitative analysis of those behaviors, which remains an unresolved theoretical and technological problem in practical applications of robot behavior learning.
This dissertation gives a mathematical description of the relevant parameters through task modeling and robot-environment interaction modeling, and establishes a chaotic dynamics model of robot collective behavior. The mathematical model is then analyzed to better understand the laws governing the system. Robot collective behavior learning is largely concerned with the social interaction mechanism between robots and environment, and complex collective behavior emerges from this social interaction. The dissertation presents a method for the quantitative analysis and mathematical modeling of robot collective behavior, and lays out a largely complete theoretical framework for social interaction among robots, tasks and environment. The main work and contributions are summarized as follows:
(1) An initialization method based on a neural network is proposed for reinforcement learning in mobile robot path planning, in order to speed up learning and mitigate the combinatorial explosion of the standard Q-learning algorithm. The neural network has the same topology as the robot workspace, and each neuron corresponds to a discrete state. Starting from the initial environment information, the network evolves until it reaches an equilibrium state.
The activity of a given neuron then denotes the maximum cumulative reward obtained by following the optimal policy from the corresponding state. The initial Q values are defined as the immediate reward plus the maximum cumulative reward obtained by following the optimal policy from the successor state. Prior knowledge can thus be incorporated into the learning system through the initialization of the Q values, which improves learning in the initial stage and gives the robots a better foundation for learning.
(2) Reinforcement learning for multi-robot systems becomes very slow as the number of robots increases, because the state space grows exponentially. A sequential Q-learning algorithm based on knowledge sharing is presented. In sequential Q-learning, the pursuers first form teams using a clustering method, and the members of each team then evolve in sequence. The rule repository of robot behaviors is initialized during reinforcement learning. Each mobile robot obtains the current environmental state from its sensors, and the state is matched to determine whether a relevant behavior rule has already been stored in the database. If the rule is present, an action is chosen according to the stored knowledge and rules, and the corresponding weight is refined; otherwise, the new rule is added to the database. In the reinforcement learning of behavior weights, the learner assigns a weight to every robot based on weighted strategy sharing, and each behavior weight is refined using the weighted sum of all robots' experience values.
(3) The robot behaviors from the previous two parts are modeled. A complete mathematical model of multi-robot cooperative capture behavior is established based on fractal modeling ideas. In the modeling of robot behavior, a certain degree of self-similarity exists between the whole and the parts of the multi-robot cooperative capture system.
The robot behavior model is established explicitly at different levels, from macro to micro. The overall system goal is first determined from the specific task. A mathematical model of multi-robot cooperative capture behavior is then established at the state level using macroscopic modeling, and the individual parameters that influence the performance of collective behaviors are analyzed theoretically. Finally, a mathematical model of the interaction between robots and environment is established at the behavior level using polynomial modeling. The critical parameters that influence the performance of robot collective behaviors are analyzed through the mathematical model, and the optimal system parameters are chosen by mathematical analysis. This provides an essential theoretical basis for the design and analysis of robot collective behaviors.
(4) The interaction between robots and environment is analyzed using dynamical systems theory. The behavior of the system in a multidimensional phase space is investigated from the trajectory of a particular robot. Data points at different times are first obtained for that robot. An appropriate embedding dimension and delay time are then chosen, and a phase space equivalent to the original system is reconstructed. The information in the phase space adequately describes the multi-robot system and allows the states of the dynamical system to be forecast. The properties of the attractor in the phase space are then analyzed, and its characteristic quantities, including the Lyapunov exponent, correlation dimension and Kolmogorov entropy, are calculated. The quantitative description and analysis of robot collective behavior is carried out on the basis of these quantities. Finally, the key factor that influences the interaction of robots is investigated using the quantified parameters. This analysis gives a better understanding of the robot interaction mechanism.
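The phase space reconstruction step in (4) can be illustrated with a minimal sketch. The abstract does not state which reconstruction method is used, so this assumes the common time-delay embedding, in which each phase-space point is built from samples of one measured coordinate spaced by the delay time. The function name `delay_embed`, the synthetic series `track_x`, and the values `dim=3` and `tau=5` are hypothetical choices made only for illustration; they are not taken from the dissertation.

```python
import math


def delay_embed(series, dim, tau):
    """Return delay-coordinate vectors built from a 1-D series.

    Vector i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    """
    n_vectors = len(series) - (dim - 1) * tau
    if dim < 1 or tau < 1 or n_vectors < 1:
        raise ValueError("series too short for this dim and tau")
    # Each vector collects dim samples spaced tau steps apart.
    return [
        tuple(series[i + k * tau] for k in range(dim))
        for i in range(n_vectors)
    ]


# Hypothetical stand-in for one coordinate of a robot's recorded track.
track_x = [math.sin(0.1 * t) for t in range(200)]

# Assumed values: embedding dimension 3, delay of 5 samples.
points = delay_embed(track_x, dim=3, tau=5)
print(len(points), len(points[0]))
```

In practice the embedding dimension and delay would be estimated from the data rather than fixed by hand, and quantities such as the Lyapunov exponent or correlation dimension would then be computed on the reconstructed points.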