COSYNE


Cosyne 2008 Workshops


March 3-4, 2008

Snowbird, Utah


Speaker Name

Aristodemos Cleanthous, Department of Computer Science, University of Cyprus, Nicosia, Cyprus

Talk Title

Can Networks of Leaky Integrate-and-Fire Neurons with Spike-based Reinforcement Learning Play Games?

Talk Abstract

Seung [1] has suggested that the brain may exploit microscopic randomness at the synaptic level for optimization, and in particular that synaptic stochasticity can be used for learning. In Seung’s spiking neural models [1], learning is achieved by reward maximization, driven by a single global reinforcement signal communicated to the network. Reaching optimal performance depends crucially on this global reinforcement [2]. Using the same learning algorithm, we developed a computational model in which two spiking neural networks of leaky integrate-and-fire neurons act as two players that learn simultaneously but independently while competing in the Iterated Prisoner’s Dilemma (IPD) game. In each round of the IPD [3], each player can either cooperate or defect; defection yields the higher payoff for the individual player, but if both players defect, the resulting payoff for both is worse than if both had cooperated. The goal is to maximize the total payoff. To the best of our knowledge, this is the first time a spiking neural model has been used to simulate a game-theoretic situation.
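The payoff structure described above can be sketched as a small lookup table. This is only an illustration: the abstract gives no numeric payoffs, so the values T=5, R=3, P=1, S=0 (the conventional textbook choice) and the names `PAYOFF`, `play_round` and `play_iterated` are assumptions, not part of the authors' model.

```python
# Hypothetical payoff values; the abstract does not state any numbers.
# They satisfy the usual Prisoner's Dilemma ordering T > R > P > S and 2R > T + S.
T, R, P, S = 5, 3, 1, 0

COOPERATE, DEFECT = "C", "D"

# PAYOFF[(action_a, action_b)] -> (payoff_a, payoff_b)
PAYOFF = {
    (COOPERATE, COOPERATE): (R, R),  # mutual cooperation
    (COOPERATE, DEFECT): (S, T),     # A is exploited by B
    (DEFECT, COOPERATE): (T, S),     # A exploits B
    (DEFECT, DEFECT): (P, P),        # mutual defection: worse than (R, R) for both
}


def play_round(action_a, action_b):
    """Return the payoffs of both players for a single round."""
    return PAYOFF[(action_a, action_b)]


def play_iterated(policy_a, policy_b, rounds):
    """Play several rounds and return the total payoff of each player.

    Each policy is a function mapping the opponent's previous action
    (None in the first round) to the player's next action.
    """
    total_a = total_b = 0
    last_a = last_b = None
    for _ in range(rounds):
        action_a = policy_a(last_b)
        action_b = policy_b(last_a)
        payoff_a, payoff_b = play_round(action_a, action_b)
        total_a += payoff_a
        total_b += payoff_b
        last_a, last_b = action_a, action_b
    return total_a, total_b
```

With these assumed values, a lone defector earns 5 in a round against a cooperator, but two players who always defect accumulate less than two players who always cooperate, which is the dilemma the two networks face.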

Each of the two decision networks “chooses” its action at the end of each round according to the firing rates of its two output neurons. Preliminary results show that a single global reinforcement signal per network cannot drive the firing rates of the two output neurons so that the agent maximizes its reward, possibly because of saturation. This problem is overcome by concurrently applying a positive global reinforcement to one output and a negative global reinforcement to the other. We therefore propose that, when more than one neuron competes for reinforcement in a network, the global evaluation signal in Seung’s reinforcement of stochastic transmission [1] should consist of both a global reward and a global penalty, so as to avoid possible saturation. In addition, we investigate the effect of combining global and local reinforcement on the players’ performance, as well as training with other types of spike-based reinforcement learning (for example, those proposed by Pfister et al. [4] and Florian [5]) in this game-theoretic scenario.
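A minimal sketch of the two ideas in this paragraph follows: picking an action from the firing rates of two output neurons, and giving the two outputs reinforcement of opposite sign. The abstract does not specify the decision rule, how ties are handled, which output receives the positive signal, or how the signal is scaled. The function names, the "higher rate wins" rule, the tie-break toward cooperation and the `payoff - baseline` signal are therefore all assumptions made for illustration.

```python
def firing_rate(spike_count, window_s):
    """Mean firing rate in Hz over a decision window of window_s seconds."""
    return spike_count / window_s


def choose_action(spikes_cooperate, spikes_defect, window_s=1.0):
    """Pick the action whose output neuron fired at the higher rate.

    Assumption: ties go to cooperation; the abstract does not say.
    """
    rate_c = firing_rate(spikes_cooperate, window_s)
    rate_d = firing_rate(spikes_defect, window_s)
    return "C" if rate_c >= rate_d else "D"


def output_reinforcements(chosen, payoff, baseline):
    """Return one reinforcement value per output neuron, with opposite signs.

    Assumption: the output coding the chosen action receives
    (payoff - baseline) and the other output receives the negated value,
    so a reward for one output is always paired with a penalty for the other.
    """
    signal = payoff - baseline
    other = "D" if chosen == "C" else "C"
    return {chosen: signal, other: -signal}
```

For example, under these assumptions `choose_action(12, 7)` selects cooperation, and `output_reinforcements("D", 5, 2)` rewards the defect output by 3 while penalizing the cooperate output by 3, rather than sending both outputs the same global value.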


References

[1] Learning in Spiking Neural Networks by Reinforcement of Synaptic Transmission. H. S. Seung, Neuron, 40: 1063-1073, 2003.

[2] Neural Networks and Perceptual Learning. M. Tsodyks and C. Gilbert, Nature, 431: 775-781, 2004.

[3] Prisoner’s Dilemma. A. Rapoport and A. M. Chammah, Ann Arbor: Univ. of Michigan Press, 1965.

[4] Optimal Spike-Timing-Dependent Plasticity for Precise Action Potential Firing in Supervised Learning. J-P. Pfister, T. Toyoizumi, D. Barber and W. Gerstner, Neural Computation, 18: 1318-1348, 2006.

[5] Reinforcement Learning Through Modulation of Spike-Timing Dependent Synaptic Plasticity. R. V. Florian, Neural Computation, 19: 1468-1502, 2007.

This page was last modified 19:34, 5 February 2008.

