Multiagent Q-learning by context-specific coordination graphs
Jelle R. Kok and Nikos Vlassis. Multiagent Q-learning by context-specific coordination graphs. In Proceedings of the International Conference on Intelligent Autonomous Systems, pp. 317–324, IOS Press, Amsterdam, The Netherlands, March 2004.
Abstract
One of the main problems in cooperative multiagent learning is that the joint action space is exponential in the number of agents. In this paper, we investigate a sparse representation of the joint action space in which value rules specify the coordination dependencies between the different agents for a particular state. Each value rule has an associated payoff which is part of the global Q-function. We will discuss a Q-learning method that updates these context-specific rules based on the optimal joint action found with the coordination graph algorithm. We apply our method to the pursuit domain and compare it with other multiagent reinforcement learning methods.
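As a rough illustration of the representation described in the abstract, the following minimal Python sketch treats the global Q-value as the sum of the payoffs of all value rules whose context matches the current state and joint action. The names `ValueRule`, `applies`, `global_q`, and `best_joint_action`, the `prey_adjacent` state variable, and the payoff numbers are assumptions made for this example and are not taken from the paper. The sketch also finds the best joint action by brute-force enumeration, which is only a stand-in for the coordination graph algorithm the paper uses and is exponential in the number of agents.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ValueRule:
    """Hypothetical value rule: a payoff that counts only in a given context."""
    state_context: dict   # state variable -> required value
    action_context: dict  # agent index -> required action
    payoff: float


def applies(rule, state, joint_action):
    """True if the rule's state and action contexts both match."""
    state_ok = all(state.get(k) == v for k, v in rule.state_context.items())
    action_ok = all(joint_action[i] == a for i, a in rule.action_context.items())
    return state_ok and action_ok


def global_q(rules, state, joint_action):
    """Global Q-value: sum of the payoffs of all applicable rules."""
    return sum(r.payoff for r in rules if applies(r, state, joint_action))


def best_joint_action(rules, state, actions, n_agents):
    """Brute-force maximization over joint actions (stand-in, not the paper's method)."""
    return max(product(actions, repeat=n_agents),
               key=lambda ja: global_q(rules, state, ja))


# Illustrative rules for two agents; all values are made up.
rules = [
    ValueRule({"prey_adjacent": True}, {0: "capture", 1: "capture"}, 10.0),
    ValueRule({}, {0: "move"}, 1.0),
    ValueRule({}, {1: "move"}, 1.0),
]
actions = ("move", "capture")

print(best_joint_action(rules, {"prey_adjacent": True}, actions, 2))
print(best_joint_action(rules, {"prey_adjacent": False}, actions, 2))
```

Only the first rule involves both agents, and only when the prey is adjacent, so coordination is needed in that context alone. In every other state the agents' contributions are independent, which is the sparsity the abstract refers to.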
BibTeX Entry
@InProceedings{Kok04ias,
  author     = {Jelle R. Kok and Nikos Vlassis},
  title      = {Multiagent {Q}-learning by context-specific coordination graphs},
  booktitle  = {Proceedings of the International Conference on Intelligent Autonomous Systems},
  editor     = {Frans Groen and Nancy Amato and Andrea Bonarini and Eiichi Yoshida and Ben Kr{\"o}se},
  pages      = {317--324},
  publisher  = {IOS Press},
  address    = {Amsterdam, The Netherlands},
  month      = mar,
  year       = {2004},
  postscript = {2004/Kok04ias.ps.gz},
  pdf        = {2004/Kok04ias.pdf},
  abstract   = {One of the main problems in cooperative multiagent learning is that the joint action space is exponential in the number of agents. In this paper, we investigate a sparse representation of the joint action space in which value rules specify the coordination dependencies between the different agents for a particular state. Each value rule has an associated payoff which is part of the global Q-function. We will discuss a Q-learning method that updates these context-specific rules based on the optimal joint action found with the coordination graph algorithm. We apply our method to the pursuit domain and compare it with other multiagent reinforcement learning methods.}
}