# Bayesian game

In game theory, a Bayesian game is a game in which the players have incomplete information on the other players (e.g. on their available strategies or payoffs), but, they have beliefs with known probability distribution.

A Bayesian game can be converted into a game of complete but imperfect information under the "common prior assumption". John C. Harsanyi describes a Bayesian game in the following way.[1] In addition to the actual players in the game, there is a special player called Nature. Nature assigns a random variable to each player which could take values of types for each player and associating probabilities or a probability mass function with those types (in the course of the game, Nature randomly chooses a type for each player according to the probability distribution across each player's type space). Harsanyi's approach to modeling a Bayesian game in such a way allows games of incomplete information to become games of imperfect information (in which the history of the game is not available to all players). The type of a player determines that player's payoff function. The probability associated with a type is the probability that the player, for whom the type is specified, is that type. In a Bayesian game, the incompleteness of information means that at least one player is unsure of the type (and so the payoff function) of another player.

Such games are called Bayesian because of the probabilistic analysis inherent in the game. Players have initial beliefs about the type of each player (where a belief is a probability distribution over the possible types for a player) and can update their beliefs according to Bayes' rule as play takes place in the game, i.e. the belief a player holds about another player's type might change on the basis of the actions they have played. The lack of information held by players and modeling of beliefs mean that such games are also used to analyse imperfect information scenarios.

## Specification of games

The normal form representation of a non-Bayesian game with perfect information is a specification of the strategy spaces and payoff functions of players. A strategy for a player is a complete plan of action that covers every contingency of the game, even if that contingency can never arise. The strategy space of a player is thus the set of all strategies available to a player. A payoff function is a function from the set of strategy profiles to the set of payoffs (normally the set of real numbers), where a strategy profile is a vector specifying a strategy for every player.

In a Bayesian game, one has to specify strategy spaces, type spaces, payoff functions and beliefs for every player. A strategy for a player is a complete plan of action that covers every contingency that might arise for every type that player might be. A strategy must not only specify the actions of the player given the type that he is but must specify the actions that he would take if he were of another type. Strategy spaces are defined as above. A type space for a player is just the set of all possible types of that player. The beliefs of a player describe the uncertainty of that player about the types of the other players. Each belief is the probability of the other players having particular types, given the type of the player with that belief (i.e. the belief is ${\displaystyle p({\mbox{types of other players}}|{\mbox{type of this player}})}$). A payoff function is a 2-place function of strategy profiles and types. If a player has payoff function ${\displaystyle U(x,y)}$ and he has type t, the payoff he receives is ${\displaystyle U(x^{*},t)}$, where ${\displaystyle x^{*}}$ is the strategy profile played in the game (i.e. the vector of strategies played).

One of the formal definitions of such a game looks like the following:

The game is defined as: ${\displaystyle G=\langle N,\Omega ,\langle A_{i},u_{i},T_{i},\tau _{i},p_{i},C_{i}\rangle _{i\in N}\rangle }$, where

1. ${\displaystyle N}$ is the set of players.
2. ${\displaystyle \Omega }$ is the set of states of nature. For instance, in a card game, it can be any order of the cards.
3. ${\displaystyle A_{i}}$ is the set of actions for player ${\displaystyle i}$. Let ${\displaystyle A=A_{1}\times A_{2}\times \dotsb \times A_{N}}$.
4. ${\displaystyle T_{i}}$ is the type of player ${\displaystyle i}$, decided by the function ${\displaystyle \tau _{i}\colon \Omega \rightarrow T_{i}}$. So for each state of the nature, the game will have different types of players. The outcome of the players is what determines its type. Players with the same outcome belong to the same type.
5. ${\displaystyle C_{i}\subseteq A_{i}\times T_{i}}$ defines the available actions for player ${\displaystyle i}$ of some type in ${\displaystyle T_{i}}$.
6. ${\displaystyle u_{i}\colon \Omega \times A\rightarrow R}$ is the payoff function for player ${\displaystyle i}$. More formally, let ${\displaystyle L=\{(\omega ,a_{1},\dotsc ,a_{N})\mid \omega \in \Omega ,\forall i,(a_{i},\tau _{i}(\omega ))\in C_{i}\}}$, and ${\displaystyle u_{i}\colon L\rightarrow R}$.
7. ${\displaystyle p_{i}}$ is the probability distribution over ${\displaystyle \Omega }$ for each player ${\displaystyle i}$, that is to say, each player has different views of the probability distribution over the states of the nature. In the game, they never know the exact state of the nature.

The pure strategy ${\displaystyle s_{i}\colon T_{i}\rightarrow A_{i}}$ should satisfy ${\displaystyle (s_{i}(t_{i}),t_{i})\in C_{i}}$ for all ${\displaystyle t_{i}}$. So the strategy for each player only depends on his type, since he may not have any knowledge about other players' types. And the expected payoff to player ${\displaystyle i}$ for such a strategy profile is ${\displaystyle u_{i}(S)=E_{\omega \sim p_{i}}[u_{i}(\omega ,s_{1}(\tau _{1}(\omega )),\dotsc ,s_{N}(\tau _{N}(\omega )))]}$.

Let ${\displaystyle S_{i}}$ be the set of pure strategies, ${\displaystyle S_{i}=\{s_{i}\colon T_{i}\rightarrow A_{i}\mid (s_{i}(t_{i}),t_{i})\in C_{i},\forall t_{i}\}.}$

A Bayesian Equilibrium of the game ${\displaystyle G}$ is defined to be a (possibly mixed strategy) Nash equilibrium of the game ${\displaystyle {\hat {G}}=\langle N,{\hat {A}}=S_{1}\times S_{2}\times \dotsb \times S_{N},{\hat {u}}=u\rangle }$. So for any finite game ${\displaystyle G}$, Bayesian Equilibria always exist.

The definition of Bayesian games has been combined with stochastic games to allow for environment states (e.g. physical world states) and stochastic transitions between states.[2] The resulting "stochastic Bayesian game" model is solved via a recursive combination of the Bayesian Nash equilibrium (see below) and the Bellman optimality equation.

## Signaling

A signaling game is a Bayesian game in which the informed party (the “agent”) knows their type, whereas the uninformed party (the “principal”) does not know the agent’s type. In some such games, it is possible for the principal to deduce the agent's type based on the actions the agent takes (in the form of a signal sent to the principal) in what is known as a “separating equilibrium”.

A specific example of a signaling game is a model of the job market. The players are the applicant (agent) and the employer (principal). There are two types of applicant, skilled and unskilled. The employer does not know which the applicant is, but he does know that 90% of applicants are unskilled and 10% are skilled (type 'skilled' has a probability of 0.1 and type 'unskilled' has a 0.9 probability).

The employer's action space is the set of natural numbers, representing wages—these are used to form a contract based on how productive the applicant is expected to be. Paying larger wages to skilled workers will generate larger payoffs for employers, while wages given to unskilled workers will have a less pronounced effect. The payoff of the employer is determined thus by the skill of the applicant (if the applicant accepts a contract) and the wage paid. Crucially, the employer chooses his or her action (the wage offered) according to his or her belief as to how skilled the applicant is and this belief is largely determined through signals sent by the applicant.

The applicant's action space consists of two actions: either obtain a university education or abstain from the university. Obtaining an education is less costly for a skilled worker than for an unskilled worker, as a skilled worker may receive scholarships, find classes less taxing, and so on. University education, therefore, serves as a signal, a means by which the applicant may communicate to the employer that he or she is, in fact, skilled. Thus, it may be rational for an employer to prefer to employ university graduates, even if their studies are not related at all to the work they will do at the company.

One strategy the employer may use is to give all applicants a wage such that skilled applicants may attend university (due to its lower cost) but which is insufficient to provide university education for unskilled applicants. This creates a separating equilibrium: skilled applicants can now signify their skill by going to university, and unskilled applicants cannot. The employer can observe which workers can go to university, and can then maximize his or her payoff by providing high wages to skilled workers and low wages to unskilled.

## Bayesian Nash equilibrium

In a non-Bayesian game, a strategy profile is a Nash equilibrium if every strategy in that profile is a best response to every other strategy in the profile; i.e., there is no strategy that a player could play that would yield a higher payoff, given all the strategies played by the other players. In a Bayesian game (where players are modeled as risk-neutral), rational players are seeking to maximize their expected payoff, given their beliefs about the other players (in the general case, where players may be risk-averse or risk-loving, the assumption is that players are expected utility-maximizing).

A Bayesian Nash equilibrium is defined as a strategy profile and beliefs specified for each player about the types of the other players that maximizes the expected payoff for each player given their beliefs about the other players' types and given the strategies played by the other players.

This solution concept yields an abundance of equilibria in dynamic games when no further restrictions are placed on players' beliefs. This makes Bayesian Nash equilibrium an incomplete tool with which to analyze dynamic games of incomplete information.

## Perfect Bayesian equilibrium

Bayesian Nash equilibrium results in some implausible equilibria in dynamic games, where players take turns sequentially rather than simultaneously. Similarly, implausible equilibria might arise in the same way that implausible Nash equilibria arise in games of perfect and complete information, such as incredible threats and promises. Such equilibria might be eliminated in perfect and complete information games by applying subgame perfect Nash equilibrium. However, it is not always possible to avail oneself of this solution concept in incomplete information games because such games contain non-singleton information sets and since subgames must contain complete information sets, sometimes there is only one subgame—the entire game—and so every Nash equilibrium is trivially subgame perfect. Even if a game does have more than one subgame, the inability of subgame perfection to cut through information sets can result in implausible equilibria not being eliminated.

To refine the equilibria generated by the Bayesian Nash solution concept or subgame perfection, one can apply the Perfect Bayesian equilibrium solution concept. PBE is in the spirit of subgame perfection in that it demands that subsequent play be optimal. However, it places player beliefs on decision nodes that enables moves in non-singleton information sets to be dealt more satisfactorily.

So far in discussing Bayesian games, it has been assumed that information is perfect (or if imperfect, play is simultaneous). In examining dynamic games, however, it might be necessary to have the means to model imperfect information. PBE affords this means: players place beliefs on nodes occurring in their information sets, which means that the information set can be generated by nature (in the case of incomplete information) or by other players (in the case of imperfect information).

### Belief systems

The beliefs held by players in Bayesian games can be approached more rigorously in PBE. A belief system is an assignment of probabilities to every node in the game such that the sum of probabilities in any information set is 1. The beliefs of a player are exactly those probabilities of the nodes in all the information sets at which that player has the move (a player belief might be specified as a function from the union of his information sets to [0,1]). A belief system is consistent for a given strategy profile if and only if the probability assigned by the system to every node is computed as the probability of that node being reached given the strategy profile, i.e. by Bayes' rule.

### Sequential rationality

The notion of sequential rationality is what determines the optimality of subsequent play in PBE. A strategy profile is sequentially rational at particular information set for a particular belief system if and only if the expected payoff of the player whose information set it is (i.e. who has the move at that information set) is maximal given the strategies played by all the other players. A strategy profile is sequentially rational for a particular belief system if it satisfies the above for every information set.

### Definition

A perfect Bayesian equilibrium is a strategy profile, and a belief system such that the strategies are sequentially rational given the belief system and the belief system is consistent, wherever possible, given the strategy profile.

It is necessary to stipulate the 'wherever possible' clause because some information sets might not be reached with the given strategy profile and hence Bayes' rule cannot be employed to calculate the probability at the nodes in those sets. Such information sets are said to be off the equilibrium path, and any beliefs can be assigned to them. Stronger notions of consistency further restrict the beliefs that can be assigned to off-equilibrium information sets to "reasonable" ones.

## Example

### Sheriff's Dilemma

A sheriff faces an armed suspect. Both must simultaneously decide whether to shoot the other or not.

The suspect can either be of type "criminal" or type "civilian". The sheriff has only one type. The suspect knows its type and the Sheriff's type, but the Sheriff does not know the suspect's type. Thus, there is incomplete information (because the suspect has private information), making it a Bayesian game. There is a probability p that the suspect is a criminal, and a probability 1-p that the suspect is a civilian; both players are aware of this probability (common prior assumption, which allows us to convert this into a complete-information game with imperfect information).

The sheriff would rather defend himself and shoot if the suspect shoots, or not shoot if the suspect does not (even if the suspect is a criminal). The suspect would rather shoot if he is a criminal, even if the sheriff does not shoot, but would rather not shoot if he is a civilian, even if the sheriff shoots. Thus, the payoff matrix of this Normal-form game for both players depends on the type of the suspect. We assume that payoffs are given as follows:

Type = "Civilian" Sheriff's action
Shoot Not
Suspect's action Shoot -3, -1 -1, -2
Not -2, -1 0, 0

Type = "Criminal" Sheriff's action
Shoot Not
Suspect's action Shoot 0, 0 2, -2
Not -2, -1 -1,1

If both players are rational and both know that both players are rational and everything that is known by any player is known to be known by every player (i.e. player 1 knows player 2 knows that player 1 is rational and player 2 knows this, etc. ad infinitumcommon knowledge), play in the game will be as follows according to perfect Bayesian equilibrium:[3][4]

When the type is "civilian", the dominant strategy for the suspect is not to shoot, and when the type is "criminal", the dominant strategy for the suspect is to shoot; we can thus remove the alternative strictly dominated strategy. Given this, if the sheriff shoots, he will have a payoff of 0 with probability p and a payoff of -1 with probability 1-p, i.e. an expected payoff of p-1; if the sheriff does not shoot, he will have a payoff of -2 with probability p and a payoff of 0 with probability 1-p, i.e. an expected payoff of -2p. Thus, the Sheriff will always shoot if p-1 > -2p, i.e. when p > 1/3.

## References

1. ^ Harsanyi, John C., 1967/1968. "Games with Incomplete Information Played by Bayesian Players, I-III." Management Science 14 (3): 159-183 (Part I), 14 (5): 320-334 (Part II), 14 (7): 486-502 (Part III).
2. ^ Albrecht, Stefano; Crandall, Jacob; Ramamoorthy, Subramanian (2016). "Belief and Truth in Hypothesised Behaviours". Artificial Intelligence. 235: 63–94. arXiv:. doi:10.1016/j.artint.2016.02.004.
3. ^ "Coursera". Coursera. Retrieved 2016-06-16.
4. ^ Hu, Yuhuang; Loo, Chu Kiong (2014-03-17). "A Generalized Quantum-Inspired Decision Making Model for Intelligent Agent". The Scientific World Journal. 2014. doi:10.1155/2014/240983. ISSN 1537-744X. PMC . PMID 24778580.