SARSA (State-Action-Reward-State-Action) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning. It was introduced in the technical note 'On-Line Q-Learning Using Connectionist Systems' by Rummery & Niranjan (1994), where the alternative name SARSA was mentioned only in a footnote.

The name reflects the fact that the main function for updating the Q-value depends on five quantities: the current state of the agent "S1", the action the agent chooses "A1", the reward "R" the agent gets for choosing this action, the state "S2" that the agent is in after taking that action, and finally the next action "A2" the agent will choose in its new state. Taking the first letter of each element of the quintuple (s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1}) yields the word 'SARSA'.


$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \phi \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]$$
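As a minimal sketch of the update rule above, the following Python function applies one SARSA step to a tabular Q function. The function name `sarsa_update`, the dictionary-based table keyed by (state, action) pairs, and the numeric values are assumptions chosen for illustration; they do not come from the original note. The parameter `phi` plays the role of the factor that multiplies Q(s_{t+1}, a_{t+1}) in the formula.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, phi):
    """Apply one SARSA update to Q in place and return the new Q(s, a).

    Q is a mapping from (state, action) pairs to estimated values.
    alpha is the learning rate; phi multiplies the next state-action value.
    """
    # Error term: observed reward plus discounted next estimate, minus the old estimate.
    td_error = r + phi * Q[(s_next, a_next)] - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return Q[(s, a)]


# Hypothetical values: every entry starts at 0 except Q(1, "right") = 2.
Q = defaultdict(float)
Q[(1, "right")] = 2.0

new_value = sarsa_update(
    Q, s=0, a="right", r=1.0, s_next=1, a_next="right", alpha=0.5, phi=0.5
)
print(new_value)  # 1.0, since 0 + 0.5 * (1.0 + 0.5 * 2.0 - 0)
```

Note that the function needs all five elements of the quintuple, including the next action, before it can update the estimate.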

A SARSA agent interacts with the environment and updates its policy based on the actions it actually takes, which makes SARSA an on-policy learning algorithm. As expressed above, the Q value for a state-action pair is updated by an error term, scaled by the learning rate alpha. Q values represent the possible reward received in the next time step for taking action "a" in state "s", plus the discounted future reward received from the next state-action observation. SARSA was created as an alternative to the existing temporal difference technique, Watkins's Q-learning, which updates its estimates using the maximum value over the actions available in the next state. The difference may be explained as follows: SARSA learns the Q values associated with the policy it follows itself, while Watkins's Q-learning learns the Q values associated with the exploitation policy while following an exploration/exploitation policy. For further information on the exploration/exploitation trade-off, see Reinforcement learning.
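The on-policy versus off-policy distinction can be illustrated with a short Python sketch that computes the two update targets side by side. The helper names (`epsilon_greedy`, `sarsa_target`, `q_learning_target`), the epsilon-greedy exploration scheme, and the example Q values are all assumptions made for illustration, not part of the original papers.

```python
import random


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))


def sarsa_target(Q, r, s_next, a_next, phi):
    """On-policy target: uses the action the agent will actually take next."""
    return r + phi * Q.get((s_next, a_next), 0.0)


def q_learning_target(Q, r, s_next, actions, phi):
    """Off-policy target: uses the best available next action, taken or not."""
    return r + phi * max(Q.get((s_next, a), 0.0) for a in actions)


actions = ["left", "right"]
# Hypothetical estimates for the next state (state 1).
Q = {(1, "left"): 1.0, (1, "right"): 3.0}

# With epsilon = 0 the behaviour policy is purely greedy.
greedy_action = epsilon_greedy(Q, 1, actions, 0.0, random.Random(0))
print(greedy_action)  # right

# Suppose exploration made the agent choose the non-greedy action instead.
a_next = "left"
print(sarsa_target(Q, 0.0, 1, a_next, 0.5))        # 0.5
print(q_learning_target(Q, 0.0, 1, actions, 0.5))  # 1.5
```

When the next action is exploratory, the two targets differ: the SARSA target reflects the value of the action actually chosen, while the Q-learning target ignores that choice and uses the greedy value. When the next action happens to be greedy, the two targets coincide.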

Some optimizations of Watkins's Q-learning may also be applied to SARSA. For example, the paper 'Fast Online Q(λ)' (Wiering and Schmidhuber, 1998) describes the small differences needed for SARSA(λ) implementations as they arise.

See also

* Reinforcement learning
* Temporal difference learning
* Q-learning

Sarsa is also a village in Anand District of Gujarat in India.

Wikimedia Foundation. 2010.
