Hi there! Welcome to the Game Theory Notes.
Before you continue, be sure to check out the Appendix of this notes document.
It contains examples, proofs, non-examinable content, and other optional stuff that would otherwise bloat these notes.
If I ever refer to a proof or example from the Appendix, I’ll link it, so don’t worry.

Normal / Extensive Form Games

Normal / Extensive Form Games were already covered in my Intelligent Agents Notes.
If you haven’t done Intelligent Agents, then they’re in sections Pure Strategy Games and Mixed Strategy Games. You can go to those notes here.
Here, I’ll just include a brief summary of the Game Theory slides.

Normal Form Games

Games are:

A set of n players (typically 2)
A set of (pure) strategies for player i

Countable:
Uncountable:

A set of outcomes:
Game rule:
Payoff:

Two ways to represent games:

Normal form (strategic-form): players move simultaneously
Extensive form: players move sequentially

Bimatrix game representation shows utility of both players:

Better response strategy: strategy which produces a more favourable outcome for a player than another strategy (given opponent strategy is already decided)
Best response strategy: strategy which produces the most favourable outcome for a player (given opponent strategy is already decided)

Strategy s1 strictly dominates s2 if it gives the player strictly better payoff
Strategy s1 is a dominant strategy if it strictly dominates all other strategies
They don’t always exist.

Pure Nash Equilibrium: Where each player has chosen a strategy in which no player can benefit by changing strategies (while other players keep theirs).
Mixed strategy: A probability distribution over pure strategies.
Mixed Nash Equilibrium: A Nash Equilibrium where players pick mixed strategies.

Expected utility (EU): the average utility yielded by a mixed strategy

	0	1
0	a0,0, b0,0	a0,1, b0,1
1	a1,0, b1,0	a1,1, b1,1

Payoff matrix for row player

Payoff matrix for column player

: row player strategy, probability of playing pure strategy

: column player strategy, probability of playing pure strategy

is Nash equilibrium if:

For any , then
For any , then

Support of a strategy: The set of pure strategies that the player can play with positive probability

⇒ pure strategy
⇒ mixed strategy

Support Lemma (Indifference Principle): is a mixed Nash equilibrium if every pure strategy in is a best response to and vice versa.

Minimax Theorem:

For any where ,
There exists (possibly mixed) such that
such that is a Nash equilibrium.

Saddle point: A deterministic Nash equilibrium in the context of a payoff matrix (A or B).

Linear programming duality:

Primal problem

Dual problem

max

s.t.

min

s.t

Duality theorem:

Weak duality:

For any feasible ,
Duality gap:

Strong duality:

If is the opt solution of Primal, and is the opt solution of Dual, then

Every finite game has a mixed Nash equilibrium.

Proved using Brouwer fixed-point theorem and Sperner’s Lemma.

is a Nash equilibrium of the game if:

For any , then
For any , then

is an ε-Nash equilibrium of the game if:

For any , then
For any , then

Computing Nash equilibrium:

2-player zero-sum games (polynomial time):

Eliminate dominated strategies
Look for dominant strategy
Look for saddle point (pure Nash)
Always solvable with linear programming (mixed Nash)

2-player non-zero-sum games:

Lemke-Howson algorithm

Linear complementarity problem
Exponential time

General games:

Porter-Nudelman-Shoham
Sandholm-Gilpin-Conitzer
Simplicial subdivision method (path-following method derived from Scarf’s algorithm)

von Neumann - Morgenstern Utility Theorem

This is covered under the Utility Theory section of the Intelligent Agents notes, but this is how the slides put it:
Four axioms:

Completeness
Transitivity
Continuity
Independence

If these four hold, there exists a unique function assigning each outcome a real number, so for any two outcomes,

From ordinal preference to cardinality preference

Extensive Form Games

Extensive Form Games are sequential.
Two variants:

Perfect information extensive-form games

All players know game structure

Imperfect-information extensive-form games

There is uncertainty about moves

“Turns” represented as a tree:

Pure strategy is a complete specification of which deterministic action to take at each node.

Can convert to normal form:

Nash Equilibrium of normal form is not very good.
Subgame Perfect Equilibrium: game where every subgame is a Nash equilibrium.
Find these using Backward Induction.

Finding NE of a normal-form game is exponential time.
Backward induction is linear time in the size of the game.

Intro to Algorithmic Mechanism Design

Every system we use was designed at one point.
Someone had you in mind when they designed that system.

They designed with goals in mind, but your (the user’s) goals might not align with theirs...
In other words, systems need to be designed so that users can’t exploit rules and go against their own interests and the designer’s wishes.

Example

1994 Carribean cup qualification

A good example is a football cup:

Goal of participating teams is to win the cup
Goal of designer is to make it fun

The designer’s goal is not to have participants scoring their own goals, because that’s no fun...
... but will that stop them?

Have you heard of the 1994 Carribean cup qualification?

Here’s how it went down:

To go through, Barbados needed to win with a goal difference of ≥ 2.
Barbados leads 2-0. So far, they go through, and Grenada is out!
Grenada scores, making it 2-1. So, Barbados is out!
Barbados deliberately scores into their own net, making it 2-2!
Grenada figures out what Barbados is planning to do, and they come up with a plan of their own: if they score a goal in either net before 90 minutes is up, they go through!
Barbados has to defend both nets from the goal! No goal is scored!
Barbados scores a goal in overtime, which counts twice, making it 4-2 and qualifies to the next round. Grenada is out!

If none of that meant anything to you, then just think of it like this:
Barbados exploited the rules of the game and scored in their own goal so they could increase their chances of winning.
Isn’t the whole point of football to score in your opponent’s goal? That’s how it was designed.
You shouldn’t have to score own-goals to try and win...

If you don’t like football, then perhaps watch some speedruns at GDQ. Speedrunners break design rules all the time!

Bahar likes football, so football examples it is.

Goal: design rules so that strategic behaviour by participants leads to a desirable outcome.
Roughly speaking, assuming unknown individual utilities, we ask whether we can design a game such that, no matter what the secret utilities of the agents actually are, the equilibrium of the game is guaranteed to have a (set of) certain desired properties.
Basically, players do what we want them to, to win.

Here’s some well-known systems where this applies:

Elections
Internet search auctions (ad auctions)
Wireless spectrum auctions
Matching medical residents or interns to hospitals
Matching children to schools
Kidney exchange markets

Mechanism: an algorithm, the input of which is withheld by strategic selfish agents.

Back to our football example, the football cup could be a mechanism. The agents are the football teams, and the input is their strategies.

Algorithmic Mechanism Design (AMD) lies at the intersection of economic game theory and computer science.

We create a mechanism because:

We want to solve an optimisation problem, but
the inputs to this problem are the private information of self-interested agents

We must design it such that:

the mechanism solves the optimisation problem while
inducing the agents to act as the mechanism designer wishes (ideally revealing their information truthfully)

AMD considers computational constraints:

Mechanisms that can’t be implemented in polynomial time are not considered as viable solutions.
Analytical tools, such as worst case analysis and approximation ratios, are employed.

Auctions as Mechanisms

An auction is a great example of a mechanism!
We have our optimisation problem: how to allocate resources, and our input comes from self-interested agents.

Again, I covered auctions in greater detail in the Intelligent Agents notes in the Auctions section.
But, as I’m so nice (and have nothing better to do) I’ll write a quick summary below.

summary pls

Auctions

Four types of auctions:

English (people shout ascending prices from a reservation price)
Dutch (price goes down until someone bids)
First-price sealed-bid (highest bidder pays their bid)
Vickrey (highest bidder pays second highest bid)

strategic bidders
Each bidder has private valuation
Bidder utility model is quasilinear utility model

If loses and pays , utility is
If wins and pays , utility is

Independent private value model: bidder valuation does not depend on other bidders’ valuations
Bidders cannot collude.

is the bid placed by bidder .
is the bid profile of all bidders.
The set of actions for each player is the set of possible bids they can place.
is the utility of player given bid profile .
The payoff (utility) of bidder depends allocation + payment, which depends on everyone’s bids.

Intro to Bayesian Games

When you have an auction (English, Vickrey, etc.) and bidders, you now have a Bayesian game.

So, we have your typical normal-form game:

A tuple
where:

is a finite set of agents
is the set of all action profiles, where is the set of actions for agent .
, where (reals) is the utility for agent .

However, Bayesian games take things a step further:

A tuple
where:

is a finite set of agents
, is the set of all action profiles, where is the set of actions for agent .
is the set of all type profiles, where is the type space of player .
is a common-prior probability distribution on .
, where (reals) is the utility for agent .

Whoa whoa... slow down there.

What are those new things in the Bayesian games? Type spaces?
Players can be an int or a char now?
Not quite. A player’s type represents their private information that they don’t share with the other players.
Sometimes, it’s the same as their valuation, but sometimes it’s not (it is when we’re talking about direct quasilinear mechanisms, but more on that later).

is the set of all possible types that player could have, so .
Player ’s type is chosen according to the probability distribution over .

💡 NOTE

There are three stages of a Bayesian game:

Ex-ante: the player knows nothing about anyone’s types, even their own

Interim: the player knows their own type, but nobody else’s.

Ex-post: the player knows all players’ types.

Typically, when we reason with Bayesian games, we’re referring to the interim stage.

For more information, I suggest checking out this cheat sheet.

Usually, agents don’t know the other agents’ types.
What they do know is the probability distribution, . So they could make educated guesses.
Sometimes is not known; that is, we have no probabilistic information in the model.
That would give us a Bayesian Game with Strict Incomplete Information.

Example

Sheriff’s Dilemma Part 1: Phantom Bayesian Game

Here is a simple example of a Bayesian game I found on Wikipedia.

A sheriff faces off against an armed suspect.
Both decide whether to shoot the other or not (we’ll call not shooting ‘idling’).

The suspect has two types: ‘civilian’ or ‘criminal’.
The sheriff only has one type: sheriff.

The suspect knows that the sheriff is of type ‘sheriff’, but the sheriff doesn’t know the suspect’s type.
They could either be civilian or criminal, for all they know.
The chance of the suspect being a criminal is .

(so the chance of being a civilian is )

The sheriff would rather shoot to defend themself if the suspect shoots, but will idle if the suspect idles (even if they’re a criminal).
The suspect would like to shoot if they’re a criminal, but would like to idle if they’re a civilian (even if the sheriff shoots).

← sheriff and suspect both shoot

← sheriff shoots, suspect does not

← sheriff does not shoot, suspect shoots

← sheriff and suspect don’t shoot

← sheriff vs civilian

← sheriff vs criminal

sheriff utility if suspect is civilian and both shoot
suspect utility if criminal and only suspect shoots
...
Here are the utility grids, given the two types for the suspect:

Type = “Civilian”		Sheriff action
Type = “Civilian”		Shoot	Idle
Suspect action	Shoot	-3,-1	-1,-2
Suspect action	Idle	-2,-1	0,0

Type = “Criminal”		Sheriff action
Type = “Criminal”		Shoot	Idle
Suspect action	Shoot	0,0	2,-2
Suspect action	Idle	-2,-1	-1,1

pew pew 🔫

Basics of Mechanisms

Bayesian Game Settings

A Bayesian Game Setting is a tuple where:

is a finite set of agents
is a set of outcomes
where i is the type space of player i.
is a common-prior probability distribution on
, where (reals) is the utility function for player .

What’s the difference between a Bayesian Game and a Bayesian Game Setting?
Bayesian Games have a set of actions, but Bayesian Game Settings have a set of outcomes.
In addition, the utility function is defined over the set of outcomes.
Settings define the players, and how they feel towards certain outcomes, but it doesn’t specify what they can do.

A (deterministic) mechanism (for a Bayesian game setting ) is a pair where:

where is the set of actions available to agent
maps each action profile to an outcome

More abstractly, a mechanism defines what each player can do (their actions), and what will happen when each player picks an action (resulting outcome from an action profile).
They don’t always have to be deterministic; you could have , which returns a probability distribution over the outcomes. However, we’ll focus on just deterministic mechanisms for now.

The only action available to these gears

in this mechanism is to keep on turning.

A direct mechanism is a mechanism in which the only action available to each agent is to announce their private information.
The types are the actions.
Whether they want to play truthfully, and pick the action that actually is their respective type, is up to them.
Direct mechanisms are just special kinds of mechanisms, where for each .
More formally, it’s a pair where:

where is the set of actions available to agent
maps each action profile to an outcome

In other words, your actions are like declarations of your opinion. You are declaring your opinion to the mechanism and other agents, and the mechanism makes a decision based on everyone’s declared opinions.
Examples of direct mechanisms are:

An auction (you are declaring how much you think something is worth)
A vote (you are declaring your preference over a set of choices)

In a quasilinear utility setting:

The set of outcomes is (realsn) for a finite set of choices
When outcome is chosen, the utility of an agent given type is

So, an outcome has two components: choice and payment.
The choice, , is the result of the optimisation problem of our mechanism.
The payment, , is the cost of this outcome for each player .

When we assume independent private values (IPVs), we can write agent ’s utility function as .
We can then write the valuation of agent for choice as .
Following on from this, the utility of agent of an outcome is their valuation minus their payment:

💡 NOTE

Remember that the type isn’t always just the valuation!

The valuation is the maximum price that player is willing to pay for the mechanism to choose choice .

The type represents the private information of the player and determines what utilities / valuations that player has.

The valuation isn’t the type, but the valuation does depend on the type.

A quasilinear mechanism is a triple is where

is the set of action profiles where is the set of actions available to player .
is the choice function that maps action profiles to choices.
(realsn) is the payment function that maps action profiles to payments for each agent.

So, in a normal mechanism with a quasilinear utility setting, we map action profiles to outcomes, and outcomes are made up of a choice component and a payment component.
In a quasilinear mechanism, we split the outcomes up: we have two functions that map action profiles to choices, and action profiles to payments.

Example

Sheriff’s Dilemma Part 2: Bayesian Game Setting and Mechanism Tendency

To better explain these concepts, I’ll “instantiate” them with the Sheriff’s Dilemma example we explained above.
Above, we modelled this game as a Bayesian Game.
Here, we’ll model it as a Bayesian Game Setting and mechanisms, with a quasilinear utility setting.
We’ll say someone is dead if they are shot, and alive if someone is not shot.

First, let’s model the Bayesian Game Setting:

← sheriff and suspect are dead

← sheriff is dead, suspect is alive

← sheriff is alive, suspect is dead

← sheriff and suspect are alive

← sheriff vs civilian

← sheriff vs criminal

Now, we have a setting.
The choice here is who lives and who dies.
The payment here is a guilt value that weighs down on a murderer. The sheriff, being a person of justice, feels more guilt than the suspect.

The sheriff and the suspect have utilities about being dead or alive, but there’s no way to actually kill anyone yet.
That’s where the mechanisms come in!

There’s many different mechanisms we can make to model this game.
We’ll start with a normal, deterministic mechanism:

← sheriff and suspect both shoot

← sheriff shoots, suspect does not

← sheriff does not shoot, suspect shoots

← sheriff and suspect don’t shoot

← sheriff and suspect kill each other
← sheriff kills the suspect
← suspect kills the sheriff
← nobody shoots, and nobody dies

We have now defined a mechanism in which players can kill each other: if you shoot, the other player dies. If the other player shoots, you die.

A direct mechanism doesn’t work in a game like this.
In a direct mechanism, actions directly correlate to types.
This could work with the suspect, by shooting the “criminal” action and idling the “civilian” action.
However, the sheriff must also choose between shooting or idling. The sheriff only has one type, and since the action set and the type set must be the same, there are too many actions to the sheriff’s one type.
Direct mechanisms only apply to certain games, for example: auctions, where your type is your valuation.
This doesn’t apply here, so we’ll move on.

We could change our mechanism to a quasilinear mechanism:

← sheriff and suspect both shoot

← sheriff shoots, suspect does not

← sheriff does not shoot, suspect shoots

← sheriff and suspect don’t shoot

Phew... that’s a lot to take in.
So, you can think of a Bayesian Game Setting as a situation where everyone wants to win, but they don’t know how to play (they don’t have any actions).
Also, you can think of a mechanism as a game that nobody’s playing at the moment.

Players with no game, and a game with no players, huh...
What if we put them together? Bayesian Game Setting + Mechanism?

Bayesian Game Setting: (where )
Mechanism:
Combination: where:

is a utility function defined as

Why, that combination tuple looks so familiar...
... it’s a Bayesian Game!

Bayesian Game Setting + Mechanism = Bayesian Game

Bayes-Nash Equilibrium

In normal-form games, we have Nash equilibrium.
In Bayesian games, we have Bayes-Nash equilibrium.

What’s the difference?
Not much, to be honest!
A Bayes-Nash equilibrium is a mixed-strategy profile such that each is a best response to .
It’s a little more complicated to calculate, because you need to take into account the probability distributions over the types .

Let’s have a think about what we’re actually doing.
When we make a mechanism, we are:

Performing an optimisation problem, given that the values of inputs are unknown (held by agents).

This is equivalent to:

Designing a game that implements a particular social choice function in equilibrium, given that the designer does not know agents’ preferences and the agents might lie.

I covered social choice functions in the Voting section of the Intelligent Agents notes.
However, here’s a quick run-down:

summary pls

Social Choice setting

A social choice setting involves:

- a set of possible outcomes
- a set of agents who have preferences over outcomes
- a function that picks one outcome based on the agents’ preferences

The goal is to design a function that maps agents’ preferences to an outcome fairly.

Bayesian Game settings and social choice settings are very similar.
The difference is, in a social choice setting, agents can only provide ordinal preferences (e.g. strawberry vanilla chocolate) and the result is based purely off of those preferences.
Bayesian Game settings extend social choice settings; agents have actions and types, can be strategic, and hence, can be dishonest (agents can be dishonest in social choice settings too, but only if the social choice function picked is manipulable).

There are also dominant strategies in Bayesian games. They work much in the same way as normal-form games:

strictly dominates if for all it is the case that
weakly dominates if for all it is the case that , and for at least one it is the case that
very weakly dominates if for all it is the case that

💡 NOTE

For all the above cases, we say that is a

strictly dominated strategy if another strategy strictly dominates it.

A strategy is strictly dominant for an agent if it strictly dominates all other strategies for that agent.

This is the same for weakly and very weakly dominance.

Example

Sheriff’s Dilemma Part 3: Bayes-Nash Equilibrium Crusaders

Let’s take a look at our utilities for our Sheriff’s Dilemma again:

Type = “Civilian”		Sheriff action
Type = “Civilian”		Shoot	Idle
Suspect action	Shoot	-3,-1	-1,-2
Suspect action	Idle	-2,-1	0,0

Type = “Criminal”		Sheriff action
Type = “Criminal”		Shoot	Idle
Suspect action	Shoot	0,0	2,-2
Suspect action	Idle	-2,-1	-1,1

Here, we can identify some strictly dominated strategies.
For example, when the suspect is a civilian, idling strictly dominates shooting.
When the suspect is a criminal, shooting strictly dominates idling.
So, we can eliminate those:

Type = “Civilian”		Sheriff action
		Shoot	Idle
Suspect action	Idle	-2,-1	0,0

Type = “Criminal”		Sheriff action
		Shoot	Idle
Suspect action	Shoot	0,0	2,-2

Now that we’ve eliminated those, we can decide what’s the sheriff’s best strategy.
When the suspect is a civilian, they won’t shoot, so it’s best to idle.
When the suspect is a criminal, they’re going to shoot, so the sheriff should defend themselves and shoot too.
The sheriff’s best choice depends on the civilian’s type.
Let’s calculate the average payoff of each action for the sheriff.
If the sheriff shoots:

(utility of sheriff shooting civilian * probability of civilian)
(utility of sheriff shooting criminal * probability of criminal)

If the sheriff idles:

(utility of sheriff idling to civilian * probability of civilian)
(utility of sheriff idling to criminal * probability of criminal)

If the average payoff of shooting is greater than the average payoff of idling, the sheriff should shoot.
Therefore, if , then the sheriff should shoot.
Rearranging this, the sheriff should only shoot if .
In layman’s terms, only shoot if there is greater than a third chance that the suspect could be a criminal.
This makes up our Bayes-Nash equilibrium.

A mechanism implements a social choice function in dominant strategies if for some dominant strategy equilibrium of the induced game, we have that the mechanism chooses the same outcome (given action profile of the agents) as does (given true utilities of the agents).
What does that even mean?
Let’s say we have a Bayesian game. We also have a social choice setting that is equivalent to our Bayesian game setting, with a social choice function.
If we take the mechanism from our Bayesian game, and strip all the strictly dominated strategies, then if the mechanism picks all the same outcomes as our social choice function through equilibria, then we say that the mechanism implements the social choice function in dominant strategies.

Dominant-Strategy Truthfulness

A strategy profile in which every is a dominant strategy for agent is called an equilibrium in dominant strategies.

Dominant-strategy (DS) truthfulness: A direct mechanism is dominant-strategy truthful if, for every agent , telling the truth (i.e. revealing the true valuation) maximises ’s utility, no matter what strategy the other players pick.

This is also known as a:

strategy-proof mechanism
dominant-strategy incentive-compatible mechanism
truthful mechanism

If mechanism is dominant-strategy truthful, then truth-telling (the action that correlates to your actual type) is always at least a very weakly dominant strategy.

Let’s have a look at a single-item auction .
For each bidder , bidding maximises ’s utility, no matter what bids the other bidders place:

Uh... English please?
This means our utility by picking will always be greater than the utility of any bid other than , no matter what bids our opponents make.
We can rewrite this as:

Ooh boy. What does this one mean?
It’s a slightly more restrictive version of the equation before it.
This means our utility by picking will always be greater than the utility of any bid other than , even if all the opponents play truthfully.

If you’re playing a Bayesian game with a dominant-strategy

truthful mechanism, you should tell the truth.

Alright, we’ve just seen thetas and bi’s and argh, where’s the examples?
How about I suggest a mechanism that is dominant-strategy truthful?
If you’ve done Intelligent Agents, you know this already.
It’s the Vickrey auction!
I’ve already proved that the Vickrey auction is truthful here.
If you’re lazy, then just remember that in a Vickrey auction, you will maximise your utility by playing truthfully (bidding your valuation), making it dominant-strategy truthful.

Vickrey-Clarke Groves (VCG) Mechanism

We know what a direct mechanism is.
We know what a quasilinear mechanism is.
What if we put them together?

Direct Quasilinear Mechanisms

A direct quasilinear mechanism is a tuple where:

is the choice function that maps type spaces to choices.
(realsn) is the payment function that maps type spaces to payments for each agent.

Because direct mechanisms define , we can substitute the two in a quasilinear mechanism to get a direct quasilinear mechanism.
So, more succinctly, a direct quasilinear mechanism is just a quasilinear mechanism defined as .

With direct quasilinear mechanisms, our types are a bit more specific.
Our types are our valuations .
In other words, our type / valuation tells us how much the agent is willing to pay for a given outcome.
Because it’s a direct mechanism, agents are asked to declare their type. In other words, declare their valuation for every choice .
These declared valuations are labelled , whereas our actual, true valuations are labelled .
If we’re lying about our valuation to the mechanism, then our declared valuation will be different to our true valuation .
denotes the declared valuations of everyone except agent .

Because valuations are types, and we’re mapping types to choices and payments, we can map valuations to choices and payments.

There’s different properties that a direct quasilinear mechanism can have:

Property	Definition	English pls
(Strict Pareto) Efficiency Social-welfare maximising	A quasilinear mechanism is efficient (or social-welfare maximising) if, in equilibrium, it selects a choice that maximises .	It picks the choice that maximises the sum of everyone’s valuations for that choice.
Dominant-strategy truthful Tell the truth	A direct quasilinear mechanism is dominant-strategy truthful (or just truthful) if for each agent , declaring maximises ’s utility, no matter what other agents declare.	Agents should declare their true valuation to maximise their utility.
Ex interim individual rationality I won’t lose	A mechanism is ex interim individually rational if, in equilibrium, each agent ’s utility is at least 0, given an estimation of using a known distribution. Formally, it’s when where is the equilibrium strategy profile.	No agent will receive negative utility, calculated using an average estimate of everyone else’s valuations. This is weaker than ex post individual rationality, so: Ex-post IR ⇒ Ex-interim IR. Why do we estimate? The “interim” stage means the player knows their own type (valuation), but not anybody else’s. Therefore, to calculate their utility in this stage, they must estimate everyone else’s valuations using a known distribution.
Ex post individual rationality No-one will lose	A mechanism is ex post individually rational if, in equilibrium, the utility of each agent is at least 0. Formally, it’s when where is the equilibrium strategy profile.	No agent will receive negative utility and lose by participating in the mechanism. In the “ex-post” stage, everyone knows everybody’s valuations. Therefore, true utilities can be calculated without estimations.
Budget balance Mechanism breaks even	A quasilinear mechanism is budget balanced when , where is the equilibrium strategy profile.	The mechanism won’t receive any profits or losses; it’ll break even.
Weakly budget balance Mechanism breaks even or profits	A quasilinear mechanism is weakly budget balanced when , where is the equilibrium strategy profile.	The mechanism won’t lose money; it’ll only profit or break even.
Tractability Feasible to run	A quasilinear mechanism is tractable if for all , and can be computed in polynomial time.	It’s computationally feasible; the algorithm complexities are fast enough to run and be used in-practice.

Groves Mechanisms

There is a class of mechanisms called Grove mechanisms that are both:

efficient and
dominant-strategy truthful.

Grove mechanisms are direct quasilinear mechanisms defined as follows:

So, what does this mean?
These two lines define the choice function and payment function for Groves mechanisms.
The choice function uses to select the choice that maximises the sum of all declared valuations for that choice (a.k.a, the social-welfare), making Groves mechanisms efficient.
According to the payment function agent :

Pays Clarke tax (that’s the term)
Is paid the social welfare of other agents for the chosen choice

The Clarke tax is some amount that agent must pay that doesn’t depend on their own valuation.
Because of this, the agents’ costs depend on other agents’ valuations, making Grove mechanisms truthful (a proper proof of this later).

Theorem

Green-Laffont

In settings where agents may have unrestricted quasilinear utilities, Groves mechanisms are the only mechanisms that are both:

efficient and
dominant-strategy truthful.

Here’s the formal statement if you want:

Suppose that for all agents any (reals) is a feasible preference. Then an efficient quasilinear mechanism has truthful reporting as a dominant strategy for all agents and preferences only if it is a Groves mechanism.

Vickrey-Clarke-Groves (VCG) Mechanisms

The Vickrey-Clarke-Groves (VCG) mechanism is a direct quasilinear mechanism where:

In other words, VCG is a Groves mechanism where .

💡 NOTE

VCG is a specific mechanism within the class of Groves mechanisms (the most famous mechanism within this class).

Some people refer to Groves mechanisms as VCG mechanisms (which is confusing IMO).

Vickrey auction is a special case of VCG, and hence VCG is sometimes known as the “generalised Vickrey auction”.

In fact, single-item VCGs are literally just Vickrey auctions!

The choice function hasn’t changed, but the payment has.
In simpler terms, the payment function is basically:

Optimal social welfare

of other agents

if agent didn’t exist

Optimal social welfare

of other agents

if agent exists

Every agent pays their social cost (their cost for existing).
VCG is also called a pivotal mechanism.

Agents whose payment is greater than 0:

They make things worse for others by existing (relatable), and have to pay .

Agents whose payment is 0:

They don’t affect the choice of the mechanism.

Agents whose payment is less than 0:

They make things better for others by existing, and get paid .

in terms of social welfare

Remember from quasilinear utility settings?

Utility is:

If we substitute everything with VCG stuff, we can represent utility in terms of social welfare.

Outcome

Choice

Type (we assume agent is being truthful, because it’s a dominant strategy)

The utility is the social welfare of the current choice minus the social welfare if agent didn’t exist.

VCG Examples

These should probably go in the Appendix 🤔
Eh, whatever.

Example

VCG In Action: Auction

Let’s say we have an auction, and we want to run it using the VCG mechanism.
There’s two items:

A cheese grilled sandwich		A Bahar wig

We have two bidders:

Jie Zhang
Bahar Rastegari (not sure why she’d want a wig of her own hair but whatever)

Let’s assume that bidders only care about goods they receive, so their valuation is 0 when they get nothing.
Therefore, we can specify agent ’s valuation using three values:

(S for sandwich and W for wig)

What choice will VCG pick, and what payments?

First of all, let’s define the valuations for Jie and Bahar :

To get the choice, we need to find the choice that maximises social welfare.
That choice would be Jie getting the sandwich and Bahar getting the wig:

So now we know who gets what, but what’s the payments?
Let’s do Jie first.

(others’ social welfare if Jie didn’t exist) - (others’ social welfare if Jie exists)
What would be the social welfare if Jie didn’t exist?
It’d be 5, because the choice that maximises social welfare would be Bahar getting both the sandwich and the wig.
What would be the social welfare for others if Jie did exist?
It’d be 4, because Bahar is getting the wig.
Now we can calculate the payment:

We can do the same with Bahar.

Others’ social welfare if Bahar didn’t exist is 6, with Jie getting both items.
Others’ social welfare if Bahar exists is 3, with Jie getting the sandwich.
Therefore: .

It seems Jie has to pay £1 for the sandwich, and Bahar has to pay £3 for the wig.

Example

VCG In Action: Vickrey Auction

Let’s look at an example of a single-item VCG being Vickrey.
We’re selling a single item, uhh...

A Greggs veggie bake

We’ll have three agents this time:

Jie Zhang
Bahar Rastegari
Markus Brede

Let’s declare their valuations for our veggie bake:

Jie had a big breakfast earlier, and Markus is a little short on cash at the moment.
But Bahar is pretty hungry, and just received a pay bonus the week before!

Let’s calculate the choice.
The choice that maximises social welfare is giving the veggie bake to Bahar, making the welfare 6.
So, Bahar wins.

Let’s calculate the payments:
(Bahar)

The others’ welfare without Bahar is 4, because Markus would get the bake.
The others’ welfare with Bahar is 0, because nobody else is getting anything.
Therefore .

(Markus)

The others’ welfare without Markus is 6, because Bahar would get the bake.
The others’ welfare with Markus is 6, because Bahar.
Therefore .

(Jie)

For the same reason as Markus, .

In conclusion, Bahar wins the veggie bake, and has to pay Markus’ valuation.
Seems awfully familiar to Vickrey, doesn’t it?

Example

VCG In Action: Shortest Path

You know, VCG can do more than just auctions.
If you have an optimisation problem that can be solved with money, VCG can be applied.
This includes the shortest path problem.

Our optimization problem is finding the shortest path from A to F.
The set of outcomes is all possible paths from A to F.
Each arc is an agent, and the arc weights are negative valuations for that arc agent if they exist in the shortest path.
For example, AB has a negative valuation of -3 for the choice ABDF, but CE would have a valuation of 0 for it (because it’s not in the path).

What path does VCG pick, and what are the payments?

It’s VCG, so the choice will maximise social welfare.
Because all the valuations are negative, the choice will minimise how much the agents have to collectively pay.
The choice should be ABEF, with a social welfare of -5.

We won’t go over everyone’s payments; only a few.
How much does AC have to pay?
Well, what’s the others’ social welfare when AC doesn’t exist?

The choice when AC doesn’t exist is still ABEF, so it’s -5.
What’s the others’ social welfare when AC exists?
Again, it’s -5.
Would you look at that? Their payment is zero!
That means AB has no effect on the mechanism. Whether they exist or not would not change the fact that ABEF is the chosen choice.
Every arc that’s not in the shortest path has a payment of 0 (that includes CE, CF, and DF).

How much would AB have to pay?
First, what’s the others’ social welfare when AB doesn’t exist?

The new shortest path is ACEF, which imposes a cost of -6 on everyone else.
What’s the others’ social welfare when AB exists?
That’d be the valuation of BE + EF, which is -2.
So, AB is actually being paid 4 to exist.

Spoilers: and .
BE and EF have the same costs; why does EF earn more than BE?
That’s because EF has more market power than BE.
The other agents would be far worse off without EF than without BE.

That’s a bit awkward for BE and EF. I wonder if they can still be friends.

Properties of VCG

Truthfulness

VCG is a Groves mechanism.
Since Grove mechanisms are truthful, VCG is also truthful.
But why are they truthful?
Let’s prove that now!

Every other agent picks some arbitrary strategy (they don’t have to be truthful).
Agent is still deciding in their strategy .
The best strategy for is one that solves:

In other words, a strategy that maximises agent ’s utility.
Substituting in the payment function from the Groves mechanism, we have:

That has nothing to do with , so it’s virtually a constant in this setting. It’s sufficient just to solve:

The only way influences this maximisation is through the choice the mechanism picks. So would like to pick a strategy that will lead the mechanism to pick a choice that solves:

But look! If we turn our attention to the choice function in Groves mechanisms, and rearrange it a little:

Hey, look! The Groves mechanism already picks a choice that maximises our function, so long as we declare our true valuation.
Instead of finding the choice and maximising that function ourselves, we let the mechanism itself do that by declaring .
This proof does not depend on declarations of other agents, so truth-telling is a dominant strategy for agent .

Because agents are paid the valuation of other agents under an alternative choice, they have an incentive to maximise everyone’s utility rather than just their own.
In DS truthful mechanisms, agents’ declarations are used to choose the choice, and set other agents’ payments.

💡 NOTE

VCGs may seem like the best thing ever, right?

Well, it’s rarely used in practice.

This is because it has a long list of weaknesses.

One of its biggest weaknesses is that finding the social-welfare maximising choice might be NP-hard, so it doesn’t have tractability (not feasible). This is especially true in combinatorial auctions.

Individual Rationality

VCG is not ex post individual rational.
This means, generally speaking, agents in VCG might not always have utilities greater than 0 in equilibrium.
The utilities could be less than 0, in which case, it’s better to just not take part.

However, VCG can be individually rational if the setting satisfies two constraints:

Property

Definition

English pls

Choice-set monotonicity

Less people, same/less choices

A setting exhibits choice-set monotonicity if .

Removing an agent will never grant more choices.

No negative externalities

No participation? No worries!

A setting exhibits no negative externalities if .

Every agent has zero or positive utility for choices made without their participation.

Theorem

The VCG mechanism is ex-post individual rational when the choice-set monotonicity and no negative externalities properties hold.

Let’s prove this theorem!
If you remember from the “ in terms of social welfare” note above, our utility can be expressed as:

If we can prove that this is always positive or zero, then we can prove that VCG is individually rational.
maximises social welfare, meaning its social welfare is at least as big as ’s social welfare. Because of choice-set monotonicity, no better choice would appear without agent . We can represent that as:

Because of no negative externalities, agent has zero / positive utility on a choice made without their participation. We can represent that as:

Therefore, if we split up the equation before:

Because we know is positive, we can subtract it from the right side and the equality will still hold:

Boom! We’ve proved the above, showing that VCG is individually rational when choice-set monotonicity and no negative externalities is satisfied by the setting.

Budget Balance

VCG is not budget balanced.
This means VCG may have profits or losses.

VCG is not weakly budget balanced.
This means VCG may incur losses.

Here’s another setting property:

Property

Definition

English pls

No single-agent effect

Please leave.

A setting exhibits no single-agent effect if there exists a choice that is feasible without and that has .

Welfare of agents other than is increased by dropping .

Example:

In an auction setting, agents are better off when one other agent drops out, because there’s less competition.

Therefore, auction settings exhibit no single-agent effect.

Theorem

The VCG mechanism is weakly budget-balanced when the no single-agent effect property holds.

Let’s prove this!

To prove that VCG is weakly budget-balanced, we need to show that the sum of all payments (the mechanism’s income) is greater than or equal to zero:

From the no single-agent effect, the social welfare of the choice without should be greater than the social welfare of the choice with , because everyone is better off removing other agents:

If the left is greater than the right, then that means left - right must always be non-negative.
left - right appears in the sum of all payments above. That means, to get the mechanism income, we’re summing all zero/positive numbers. Therefore, the mechanism income will also be zero / positive.
This proves VCG is weakly budget-balanced when no single-agent effect holds!

Wanna hear some good news?
What? No! It’s not the end of the module yet!
I meant this:

Theorem

Krishna & Perry, 1998

In any Bayesian game setting in which VCG is ex post individually rational...

... VCG collects at least as much revenue as any other efficient and ex interim individually-rational mechanism.

This means if we have a VCG that is ex-post individually rational, then it’ll collect at least as much revenue as any mechanism that is efficient and ex-interim individually rational.
ANY mechanism.
That includes Bayes-Nash mechanisms.

Here’s another: VCG is as budget balanced as any efficient mechanism can be.

It satisfies weak budget balance in every case where any dominant strategy, efficient and ex-interim individual rational mechanism would be able to do so.

Wanna hear some bad news?
Huh? Well, yes, there is still some content left to learn, but I have other bad news.
I meant this:

Theorem

Green-Laffont; Hurwics

No dominant-strategy truthful mechanism is always both efficient and weakly budget balanced, even if agents are restricted to the simple exchange setting.

Simple exchange setting: a setting of buyers and sellers with quasilinear utility functions, all interested in trading a single identical unit of some good.
Brace yourself; there’s more!

Theorem

Myerson-Satterthwaite

No Bayes-Nash incentive-compatible mechanism is always simultaneously efficient, weakly budget balanced and ex interim individually rational, even if agents are restricted to quasilinear utility functions.

The class of Bayes-Nash incentive-compatible mechanisms includes the class of dominant-strategy truthful mechanisms.

Revelation Principle

What is a revelation?
When “Something is revealed, especially a dramatic disclosure of something not previously known or realised”?
How about when you’re caving someone’s face in and you spontaneously come up with killer moves?

Not quite... I’ll tell you.

Revelation Principle

So, we have dominant-strategy (DS) truthful mechanisms.
They’re cool because:

as a participant, it’s easy to figure out how to play (just be truthful, no need to calculate any ultimate plan)
as a designer, it’s easy to predict outcomes (assuming everyone is rational)

That’s the aim of the game here: make DS truthful mechanisms.

When we have a DS truthful mechanism, we make two assumptions:

Every participant has a dominant strategy, no matter the private valuation
This dominant strategy is direct revelation, where the participant truthfully reports all of its private information to the mechanism.

In English, this means:

Everyone has dominant strategies
Those dominant strategies are truthful

Wow. Dominant-strategy truthful means we have dominant strategies that are truthful.

“People die if they are killed”

We have mechanisms that satisfy the first point: everyone has dominant strategies.
However, it’s harder to prove the second one: those dominant strategies are truthful.
What if I told you that you don’t need to prove it? We get it “for free”?
The Revelation Principle has your back!
So long as you have dominant strategies, you can make an identical, direct, truthful mechanism!

Theorem

Revelation Principle

For every mechanism in which every participant has a dominant strategy (no matter what their private information), there is an equivalent direct dominant-strategy truthful mechanism.

How can that be the case?

Well, do you remember reductions from ToC?
That thing where you convert one problem to another using a Turing machine to show decidability?
Well anyway, this is similar to that.

Let’s say we have a mechanism already, Liar’s Mechanism, that has dominant-strategies, but isn’t truthful.
Each agent has a strategy function that, given a type, plays a dominant strategy (not a truthful one).
Everyone has types (private information) .
It doesn’t even have to be a direct mechanism.

So, right now, all the agents are lying to get what they want in this Liar’s Mechanism.
What if we wrapped this mechanism in another mechanism that plays dominant strategies for us, with our best interests at heart?
These new mechanisms are revelation mechanisms.
What should we call this mechanism... well, it’s like an ‘agent’ for us, so let’s call it Enrico.

We’ve now created a new direct DS truthful mechanism out of our old Liar’s Mechanism!

How is this truthful?

We’re still playing the Liar’s Mechanism, but we’re doing so indirectly through this “Enrico” mechanism.
It’s a mechanism where you tell it your type, and it’ll play the best dominant strategy for you.
So it’s like having a perfect robot that plays for you, so long as the robot knows what you truthfully like.

How is this direct?

The mechanism before wasn’t direct; you just played strategies.
However, now the mechanism is interfaced with “Enrico”.
All “Enrico” asks is your private information, and if you give it to “Enrico”, it’ll play the best dominant strategy for you!
In that case, maybe I should’ve called it “Google” instead...

In the words of the slides, “The agents don’t have to lie, because the mechanism already lies for them”.

What’s so good about the revelation principle?
It means that truthfulness is not a restrictive assumption.
We can consider only direct truthful mechanisms, because all mechanisms can be made directly truthful.
In addition, indirect mechanisms can’t do (inherently) better than direct mechanisms.

There is one small problem though:

The set of equilibria is not always the same in the original mechanism and the revelation mechanism.
The revelation mechanism does have the original equilibrium of interest, but there may be new, bad ones, especially with indirect mechanisms.

So... that’s it? We’ll never see an indirect mechanism again?
Not so fast!
A direct truthful mechanism ‘forces’ agents to reveal their private information.
In real life, some people might not be comfortable with that:

Government people
Paranoid people
People who berate you for not using Arch Linux

So, this places an unreasonable burden on the communication channel.

In addition, to compute a revelation mechanism, it needs to know the agents’ equilibrium strategies, or else it wouldn’t know which strategies to pick.
So there’s also a burden absorbed by the mechanism, which could be enough to make this concept not viable in real-life situations.

Characterisation of DS Truthful Mechanisms

So far we’ve been focusing on implementing an efficient (i.e. social-welfare maximising) outcome.
What other social choice functions can we implement in dominant strategies?

Maximising social welfare is pretty hard (it’s NP-complete).
But there are social choice functions that approximate the maximum social welfare.
In some settings, the mechanism designer might optimise other objectives:

Revenue
Some kind of fairness metric
Makespan (in scheduling applications)

Now, let’s move on to something completely different and talk about this theorem.

Let denote the set of choices that can be selected by the choice rule given the declaration by the agents other than (i.e. the range of ).

Theorem

A mechanism is dominant-strategy truthful if and only if it satisfies the following conditions for every agent and every :

The payment function

can be written as

Alright, I just hit you with a great wall of maths jargon. What is this?
Let’s say we have a mechanism, and every agent that isn’t has already decided and locked in their strategy.
is the set of choices that the mechanism may pick, depending on agent ’s final decision.
The first condition states that payments can only depend on other agents’ declarations and the selected choice.

The declaration of agent still indirectly affects the payment, but only through the choice of the mechanism.

The second condition states that, from agent ’s point of view, the mechanism always selects the choice that maximises their utility (in other words, the mechanism optimises for each agent).

What’s the proof of this?
The proof is split into two parts: the ‘if’ part, and the ‘only if’ part.

If part

Only if part

Let’s say that agent can either be truthful and declare , or lie and declare .

	The choice the mechanism picks if agent declares .
	The choice the mechanism picks if agent declares .
	The payment of agent for choice .
	The payment of agent for choice .

Now, let’s walk through what’s going to happen to agent :

If picks , their utility will be .

If yielded a greater utility for , then the mechanism would’ve picked instead of when declared truthfully.

Therefore, it’s a dominant strategy for to declare .

First condition:

Let’s assume, for the sake of contradiction, the payment function did depend on ’s declared valuation, .

That means there could be a truthful strategy and a untruthful strategy where they both yield the same choice , but the truthful strategy has a higher payment than the untruthful strategy .

Then could increase their utility by being untruthful, as they could get the same result and pay less.

So, as per this proof by contradiction, the payment cannot depend on ’s declaration if this mechanism is DS truthful.

Second condition:

Again, for the sake of contradiction, let’s say player declares truthfully, but the choice does not maximise their utility.

That means there could be some other untruthful valuation that picks the choice that does maximise their utility, .

Then could increase their utility by declaring the untruthful valuation and getting the choice that maximises their utility.

As per this proof by contradiction, the mechanism must pick the choice that maximises utility if this mechanism is DS truthful.

In summary, if you want to make a DS truthful mechanism, all you need are those two conditions in the theorem, and boom!

Dominant Strategy Implementation

Implementable Choice Rules

Remember when I talked about implementing other social choice functions?
Let’s talk about those.

Property

Definition

English pls

Weak Monotonicity

WMON

A social choice function satisfies weak monotonicity (WMON) if for all agents and all possible valuation profiles of the other agents we have that implies that .

Let’s say the mechanism picks if agent plays , and if they play .

When an agent changes their valuation from to , then the increase in value for their new choice is at least as large as the increase in value for their old choice .

There’s two theorems that go with WMON:

Theorem

WMON is necessary for DS truthfulness

If a mechanism is DS truthful, then satisfies WMON.

DS truthful → WMON

Theorem

When WMON is sufficient for DS truthfulness

If all domains of preferences are convex sets (as subsets of Euclidean space) then for every social choice function that satisfies WMON there exists payment functions such that is DS truthful.

WMON → DS truthful (if preferences are convex sets)

What do these two theorems mean?
They show that:

WMON is necessary for DS truthfulness
WMON is sufficient for DS truthfulness (when preferences are from convex sets)

Oof, there’s a lot to pick apart here.
Let’s start by explaining what I mean by ‘necessary’ and ‘sufficient’.

“p is necessary for q” means “if we don’t have p, we don’t have q”. It’s equivalent to “q implies p”.
“p is sufficient for q” means “if we have p, we have q”. It’s equivalent to “p implies q”.

Simple enough; it’s just a different syntax for “implies”.

The first theorem is easy enough to understand: if a mechanism is DS truthful, then the social choice function of that mechanism satisfies WMON as well.

You don’t have to fully understand the second theorem, because it’s not examinable to know, but it says:

If domains of preferences are convex sets, then every WMON social choice has payment functions such that the mechanism is DS truthful.

To simplify even further, you could say “if preferences are from convex sets, then WMON → DS truthful”.
But what are convex sets?
They’re simply regions where every point on a line segment drawn between two points in the region also lies completely in the region.

Wahoo yes convex set 🎉🎉🎉🎉	Boo no convex set 😡💢💢

If the agents’ preference vectors also work like that, then WMON → DS truthful.

Great, so WMON is good.
If we have a social choice function that satisfies WMON, we’re closer to a DS truthful mechanism (we’ll even have a DS truthful mechanism if the preferences are convex sets).
Which social choice functions satisfy WMON?
We don’t know generally, but we do know it for two extreme cases, under:

Unrestricted quasilinear settings where (set of reals)
Single dimensional settings

The Case for Unrestricted Quasilinear Settings

Affine maximiser: A social choice function is an affine maximiser if for some subrange , for some agent weights (set of positive reals), and for some outcome weights ,, has the form:

English please?
An affine maximiser is simply a choice function from Groves mechanism, but with weights. It’s also called a weighted VCG.
A subset of choices have weights, .
Each agent has weights too, .
Let me remind you of the choice function in Groves mechanisms:

Check out the middle: .
In Groves' mechanisms, we maximise everyone’s valuations.
In affine maximisers, we maximise this:
In affine maximisers, we also maximise everyone’s valuations, but with respect to weights on outcomes and agents.

Affine maximisers = DS truthful?

“Any affine maximiser social choice function can be implemented in dominant strategies; i.e. there exists payment functions where is DS truthful.”

Can we prove it? You bet we can!

VCG is truthful. What if we just stick the affine maximiser choice function into VCG and see if that works?

On the right track, but... hmm, no. That doesn’t feel right.

It’s not rigorous enough.

Let’s have a closer look at what makes VCG and affine maximisers different.

Let’s take the choice function of VCG and abstract it a little:

Here, I’ve defined a social welfare function for VCG, and substituted it in the choice and payment functions.

Now, how do we “transform” our VCG to an affine maximiser?

We should create a new, refined function, of course!

Can you see how substituting our new function results in the affine maximiser choice function?

Well, we’ve substituted it in the choice function. The payment function used too, you know. Let’s sub it in there too:

There we have it! A payment function that, when paired with its respective affine maximiser , creates a DS truthful mechanism (because we effectively have VCG, which is truthful).

We could factor out out of the payment function, but since it’s just a scalar, it shouldn’t affect the truthfulness of the mechanism.

Source; slides 15 to 19

(what, you thought I came up with this on my own?)

Alright, VCG with weights. What’s so special about it?
I’m glad you asked:

Theorem

Roberts

If there are at least three choices that a social choice function will choose given some input, and if agents have general quasilinear preferences, then the set of (deterministic) social choice functions implementable in dominant strategies is precisely the set of affine maximisers.

In other words, if a social choice function has to pick from at least three choices, and all the agents have general quasilinear preferences (each agent can have any valuation for each choice ), the only DS-implementable social choice functions are affine maximisers.

Under these conditions, all DS-truthful mechanisms have affine maximiser choice functions.

Efficiency is an affine-maximising social choice function for which and .

In other words, the social choice function that maximises social welfare is the same as an affine maximiser with all agent weights equal to 1 and choice weights equal to 0. So basically, the choice function from Groves mechanisms.

Since affine maximising mechanisms are really just weighted Groves mechanisms, we can’t stray very far from Groves if we focus on this setting.
So then, let’s try a simpler, more restrictive setting.

The Case for Single Dimensional Settings

With single dimensional settings, there is no valuation ‘vector’.
Only a single number determines the valuation vector .
We can generalise this in many different ways, but let’s just say:

You can “win”...
... or you can “lose”.
Agents have different scalar values for “winning”, and this value is 0 for “losing”.
Each agent has a subset of winning alternatives, which are choices (where ) that are all equivalent to each other in terms of value to agent . Choices yield a value of 0.
In other words, winning alternatives are choices where agent wins.

There are many examples of single dimension settings:

Single-item auctions where agents only care if they receive the item or not. is a singleton, with the choice where wins the item.
A combinatorial auction where agents are only interested in a specific bundle of items. contains choices where the agent gets their bundle and/or more.
Selfish routing setting where the mechanism wants to buy a path.

For now, we’ll be looking at single-parameter domains:

Each agent has winning alternatives and a range of values , which are both publicly known.
Values and are used as lower and upper bounds for the valuation , but don’t worry; they’re not important here.
Agent values all at (which is private knowledge), and 0 for everything else.

We will now define monotonicity in single-parameter domains:

Property

Definition

English pls

Monotonicity

Monotone in

A social choice function on a single parameter domain is called monotone in if for every and every we have that implies that . That is, if valuation makes win, then so will every higher valuation .

If agent plays and gets a choice from their winning alternatives , then if player plays a higher valuation , then player will still get a choice from their winning alternative .

I’m going to define something new: critical values.
What? No, it’s not when you deal a critical hit on a Groves mechanism!
It’s this:

For a monotone function , for every for which agent can both win and lose, there is always a critical value below which loses and above which they win.

If wins for every , then the critical value at is undefined.

In simple terms, critical values are points where the result of agents’ valuations turn from either losing or winning.

Mechanisms are normalised if the payment for losing is always 0.
That is, if .
Every DS truthful mechanism can easily be turned into a normalised one.

Let’s characterise normalised mechanisms:

Theorem

A normalised mechanism on a single parameter domain is DS truthful if and only if the following conditions hold:

is monotone in every
Every winning bid pays the critical value

If then
If is undefined (implying that given , wins for all ), then there exists some value , , such that .

Remember that we’re not restricted to affine-maximisers anymore.
We can do stuff like:

This gives us flexibility to design tractable (computationally efficient) approximation mechanisms whose exact optimisation is computationally intractable.
So instead of being forced to be completely optimal, like affine-maximisers and Groves would make us be, we can be way more computationally efficient and simply approximate.

Something like this might be useful for:

Task scheduling: allocating tasks among agents to minimise makespan
Bandwidth allocation in computer networks: allocate real-valued capacity of a single network link among users with different demand curves
Multicast cost sharing: share the cost of a multicast transmission among the users who receive it
Two-sided matching: pair up members of two groups according to their preferences, without imposing any payments, e.g. students and supervisors, hospitals and residents, kidney donors and recipients

Optimal Auctions

So far, we’ve ignored the revenue of the mechanism.
By this, I mean how much the seller earns from the auctions.
However, in some auctions, it’s desirable to maximise the seller’s revenue.

Mechanisms that are designed to maximise seller’s expected revenue are called optimal auctions.

Economists designing optimal auctions

Before, we assumed:

Independent private valuations (IPVs) (valuations do not depend on others’)
Risk-neutral bidders (bidders have no preference to playing risky or safe)

We also assume:

Each bidder ’s valuation is drawn from some strictly increasing cumulative density function with a PDF (probability distribution function) that is continuous and bounded below.
We allow (asymmetric auctions)
The seller knows each .

Optimal auctions maximise seller’s expected revenue subject to some form of individual rationality.

Example of an Optimal Auction

We’re going to give an example of an optimal auction, but first, let’s prepare the setting:

We shall run a single-item second price auction (Vickrey)!
We have 2 bidders, is uniformly distributed on (each possible valuation has an equal chance of being picked).
In this auction, we’re using fractions of money. Let’s just say we’re using some kind of bitcoin variant or something.
There is a reserve price between 0 and 1, where:

There’s no sale if both bids are below
Sale is at price if there’s one bid below and one above
Sale is at second highest bid price if both are above

The aim of the game is this: what value of will maximise expected revenue?
In other words, which value of makes this an optimal auction? 🤔

Buy BaharCoin while it’s still low

Remember, we’re still truthful. Let’s go over each case:

1. Both bids are below

The chance of one valuation being drawn below is itself. Since there’s two players, and valuations are drawn independently, the probability of two is , so the probability that two valuations are drawn below is .

Probability:

Revenue:

2. One bid is below and one is above

The chance that one valuation is drawn above is .

The first case is player 1’s valuation being drawn below and player 2’s valuation being above ; the probability of that is .

The second case is the opposite: player 2’s valuation being below and player 1’s being above. The probability of that is also .

Either the first case could happen OR the second case, so we add them up (both cases are mutually exclusive) to get the total probability that one bid is below and one is above: .

Probability:

Revenue:

3. Both bids are above

Both players draw valuations above , so the probability is , which is .

The expected revenue of this case is derived from the expected minimum valuation drawn, given that both bids are above . In maths language, that’s .

If you want to know why it’s , then read the appendix section for this. If you really don’t care and are willing to accept it for what it is (how to derive it is not examinable), then you’re ironically wiser than I am and should simply read ahead.

Probability:

Revenue:

To get the expected revenue of this auction, we sum the products of probability and revenue for each case.

Expected revenue =
Expected revenue =

Now that we have the expected revenue, we want to maximise it.
Do you remember from GCSE / A level maths on how to find the maximum of a function?
Find the turning points!

(Careful! There’s two turning points; this is the maximum, but the other () is a minimum.)

There you have it: this is an optimal auction when .

When the reserve price is , the seller will get on average BaharCoin.
When the reserve price is , the seller will get on average BaharCoin.
This is neat, but you lose sales if bids are below , but it would’ve been low revenue anyway and that’ll only happen of the time.
On the flip side, we could increase our reserve price when one bidder has low value and the other high, which happens of the time.

If a bidder bids 0.8 and the reserve price is 0.5, wouldn’t it be smarter to bump up our reserve price a bit?

When you think about it, a reserve price is like adding an extra bidder. Except that bidder is us.

Myerson’s Theorem

That was cool and all, but it seems a little clumsy just to make an optimal auction.
Isn’t there a way of characterising optimal auctions, so it’s easier to make them?
Why yes, I’m glad you asked!
What? You didn’t ask? Well, uh, just read on, I guess.

Bidder ’s virtual valuation is .

“Virtual”? We’re doing auctions in VR Chat now?
Not quite; this metric is used for determining the reserve price for each bidder.
We will assume that this is increasing in (every time goes up, the virtual valuation will go up too)
In our BaharCoin example, where valuations are uniformly distributed between 0 and 1, the virtual valuation is always .

What’s that bit we’re minus-ing from the valuation?
It’s ‘shaving off’ a small bit from our valuation to make our virtual valuation.

The top is the probability of drawing a valuation above .
The bottom is the relative frequency that we draw this given value of .

If you don’t fully get it, don’t worry; its main purpose is this:

Virtual valuations increase weak bidders’ bids (bids from bidders who don’t bid much; with lower expected valuations), making them more competitive.
In fact, it’s possible for a weak bidder to win the auction, even if they’re technically bidding less than other bidders.
This means strong bidders (bidders who bid high; with higher expected valuations) will have to bid more aggressively.
This is good for the seller because this maximises the revenue of the auction.

Bidder ’s bidder-specific reserve price is the value for which .

It’s the minimum a bidder can bid while still being able to win.
If they bid any lower than this, they can’t win. But more on that in a sec.

Now, on to the theorem:

Theorem

Myerson (1981)

The optimal (single-item) auction is a sealed-bid auction in which every agent is asked to declare their valuation.

The good is sold to the agent , as long as .

If the good is sold, the winning agent is charged the smallest valuation that they could have declared while still remaining the winner, i.e.:

Let’s go over these two conditions for an optimal auction:

The winner is the agent with the highest virtual valuation, so long as it’s non-negative (the valuation is above the reserve price).

If everyone’s virtual valuation is negative, the seller keeps the item and the seller’s revenue is 0.

The winner has to pay the smallest valuation they could have declared while still remaining the winner.

This is very similar to the second-price auction, but not quite the same, especially if the bids are discrete.

Very cool! So those are the conditions for an optimal auction.
Time for an optimal auction FAQ:

Is this VCG?

No, it’s not efficient.
In this auction, weak bidders’ valuations are inflated by the virtual valuations to make strong bidders bid more aggressively.
This means if a weak bidder wins (and they can, even if they bid less than others), the outcome won’t maximise social welfare.

How should bidders bid?

Truthfully!
This is effectively a Vickrey auction, but with a reserve price and virtual valuation space. Neither of these depend on the agent’s declaration, so just like how Vickrey is truthful, optimal auctions are truthful also.

What happens in the special case where all agents’ valuations are drawn from the same distribution?

We’ll just have a normal Vickrey auction, with a reserve prices that satisfies:

Since everyone will have the same reserve price, it’s effectively like in our BaharCoin example.

What happens in the general case (valuations are drawn from different distributions; asymmetric auctions)?

Like I said above, stronger bidders will have to pay more aggressively to keep up with the inflated virtual valuations of the weak bidders.
In a way, it exploits how much people are willing to bid on something they really want. A bit mean, when you think about it.

So, optimal auctions are the best thing ever, right?
We actually use these, right?
We’re not just learning about these and are never going to see these again after this module, right?!
Alas, no... they’re not used in real life.

Why?
They are not detail free.
That means the seller needs to know bidders’ distributions.

When you go onto an eBay bid, do you think the seller is going to know your exact bid distribution when you bid on a pair of socks?
Heck, do you even know your own bid distribution on pairs of socks auctions?
No. The website would have to build a profile on you to figure that out.
Saying that, if Google or Microsoft ever made an auction site, they’d probably use optimal auctions.

You heard it here first, folks: Google’s optimal sock auctions.

What does an optimal auction actually do?
It puts pressure on strong bidders to bid more aggressively to maximise revenue, right? It’s extra competition.
Can’t we achieve the same thing by just having more bidders?
In symmetric IPV settings, sometimes it’s better to just attract more bidders than calculate everyone’s valuation distributions and run an optimal auction.

Stable Matching (SM)

In this section, we’re going to cover an algorithm for the stable matching problem.
It goes like this:

There’s a set of leaders
There’s a set of followers
Each leader has strict preferences over followers
Each follower has strict preferences over leaders
All preferences together make a preference profile.
The objective is to find a one-to-one stable matching.

The objective is what again?

One-to-one: each leader is paired with at most one follower, and vice versa (everyone is paired with someone)
Stable matching: no pair wants to deviate

Let’s go into more detail on this ‘stable matching’.
What is a stable matching exactly?

“There is no leader-follower pair, each of whom would prefer to match with each other rather than their assigned partner”

In other words:

“A matching with no blocking pairs”

It’d be easier with an example:

Let’s say there’s two leaders, red and orange, and two followers, blue and purple.

They have the following preferences:

Leaders:

red : blue > purple

orange : blue > purple

Followers:

blue : red > orange

purple: red > orange

In summary, red is the most popular leader, and blue is the most popular follower.

It makes sense that the most popular leader and follower would want each other.

However, check out the pairing on the right: blue is paired with orange, and red is paired with purple.

Blue and red would rather be paired with each other. This is a blocking pair.

Now, look at the pairing on the left. Blue and red are now paired, so it’s not a blocking pair anymore.

Sure, purple would rather be with red, and orange would rather be with blue, but blue and red are happy where they are, so those aren’t blocking pairs.

You can think of a blocking pair like a couple that want to elope together. Like Romeo and Juliet.

Cool, so stable matchings are the best thing since the cavemen discovered fire.
Given a leader-follower problem, how do we get these stable matchings?
How do we know matchings like these even exist?
Can we get one efficiently?
Questions, questions, so many questions...
... allow me to answer them all with the Gale-Shapley (GS) algorithm!

Gale-Shapley Algorithm

Theorem

Myerson (1981)

A stable matching always exists, and can be found in polynomial time.

How do we find stable matchings in polynomial time?
We use this algorithm:

Gale-Shapley algorithm pseudocode (leader-oriented)

Assign all leaders and followers to be free // Initial state

while (some leader l is free and hasn't proposed to every follower)
f = first follower on l's list to whom l hasn't yet proposed

// next: l proposes to f
if (f is free)
assign l and f to be engaged // tentatively matched
else if (f prefers l to their fiancé l')
set l and f to be engaged
set l' to be free
else
f rejects l // and l remains free

output the n engaged pairs, who form a stable matching

There’s an example showing how this algorithm works in the Appendix.
You don’t wanna miss it!

Before we go over why GS returns stable matchings, let’s go over:

How to find blocking pairs given a non-stable matching.

Here’s an example:

The first thing to do is to gather all the possible matchings:

Then, we filter out the matchings that were already matched in our non-stable matching (I’ve already highlighted them in green above):

Now, we sequentially go through each matching and check if it’s a blocking pair.
Yeah, this method is a bit ‘brute force’, but it works.
Let’s go over two examples instead of all six:

Is a blocking pair?
To find out, we ask ourselves two questions.
This matching is a blocking pair if the answers to both questions are ‘yes’:

Does prefer to its current matching, ?

Looking at ’s preferences, is before .
So, yes, prefers .

Does prefer to its current matching, ?

Looking at ’s preferences, is before .
So, yes, prefers .

Since they both prefer each other, is a blocking pair.

Is a blocking pair?

Does prefer to its current matching, ?

Does prefer to its current matching, ?

None of them prefer the other, so this is not a blocking pair.

We do this to all matchings, and we eventually get the result:

Where greens are blocking pairs.

GS Returns Stable Matching Proof

I’ve been telling you that GS returns stable matchings, but how do we know?
Can we prove it?
Of course we can; why else would this section have “Proof” in the name?

To prove this, we split the proof into three sections:

1. The algorithm always terminates

2. The algorithm returns a matching

3. The matching it returns is stable

We notice two things about GS:

Leaders propose to followers in decreasing order of preference
Once a follower is matched up, they never become unmatched; only “trades up”

Because of this, the algorithm terminates after at most iterations of the while loop, where is the number of leaders (and thus, the number of followers too).

Why?

Because for each leader, they propose to followers, worst-case. If there’s also leaders, then there are , or , proposals.

This is also a proof for GS being in polynomial time.

In this proof, we’ll show that GS returns a one-to-one matching, meaning that each leader has at least one follower and vice versa.

This will be split into two parts:

1. In GS, all leaders get matched.

Let’s do a proof by contradiction.

Let’s say there’s some leader in the output matching that is unmatched.

That means there must be some follower that’s also unmatched (pigeonhole principle; a leader can’t be matched with more than one follower).

If is unmatched, that means they were never proposed to.

But, is unmatched, so they should have proposed to every follower.

A contradiction!

2. In GS, all followers get matched.

As proved before, all leaders get matched.

If all leaders are matched, then they all must be matched to unique followers.

But followers are all we have!

Therefore all followers must be matched.

We can prove that the returned matching is stable by showing that there’s no blocking pairs.

Consider any pair that is not in .

There are two possibilities for the pair :

1. never proposed to

proposes in decreasing order of preferences, so must have proposed to first before .

is not blocking.

2. proposed to

must have rejected , either right away or after they matched up once, because prefers to .

is not blocking.

In either case, is not a blocking pair.

Extensions of Stable Matching

Did you know that all executions of leader-oriented GS lead to the same stable matching, even though an instance could have several stable matchings and GS is non-deterministic?

Wait, why is GS non-deterministic?
Because when you have multiple free leaders, the algorithm can pick any leader arbitrarily.
This doesn’t affect the outcome, though.
In fact, you could create a parallelised variant of GS where all free leaders propose at the same time. Efficient!

But we’re here to talk about Extensions of Stable Matching.
One of which is Stable Matching with Incomplete Lists (SMI), which is where preferences might not include all members of the opposite group (some leaders might find some followers unpreferable, and vice versa).
An agent would rather be unmatched than matched with an unpreferable (or unacceptable) partner.
Such possible preferences might look like this:

What algorithm can be used for SMIs?
We can use a variant of the GS algorithm (changes are highlighted):

Gale-Shapley algorithm pseudocode (leader-oriented, SMI)

Assign all leaders and followers to be free // Initial state

while (some leader l is free and hasn't proposed to every follower they find acceptable)
f = first follower on l's list to whom l hasn't yet proposed

// next: l proposes to f
if (f is free and finds l acceptable)
assign l and f to be engaged // tentatively matched
else if (f prefers l to their fiancé l')
set l and f to be engaged
set l' to be free
else
f rejects l // and l remains free

output the engaged pairs, who form a stable matching

There’s also Stable Matching with Ties (SMT), which is where agents may be indifferent on which partners they get.

No! 🤦‍♂️ It’s not where we match office ties...

An example of such preferences are:

(The partners in brackets are the ones whom the agent is indifferent to).
How many kinds of blocking pairs can we define with SMT?
There’s three, actually.
They are, given blocking pair :

strictly prefers to their current follower, and strictly prefers to their current leader
strictly prefers to their current follower, and is indifferent between and their current leader
is indifferent between and their current follower, and strictly prefers to their current leader

Which definition of blocking pair we pick depends on our problem domain, I suppose.
We usually pick the first one though, where both preferences must be strict.

What if we have both?
A Stable Matching problem with Ties and Incomplete Lists (SMTI)?

Almost the same algorithm for SMI works for SMTI (and thus, SMT).
The only difference is that we need to break ties arbitrarily.

Optimal Stable Matchings

For a given instance, there may be many stable matchings:

Here, there are two stable matchings: green and purple.
Both and are achievable to .
Only is achievable to .
Wait... what’s ‘achievable’?
Achievable: A leader and a follower are achievable if there exists some stable matching where and are matched.

But back to our green and purple matchings.
Don’t you think the purple one favours the leaders way more than the green does?
On the other hand, the green matching looks pretty good for the followers.
What gives?

The purple matching is a leader-optimal matching: a stable matching in which each leader receives their best achievable partner.
The green matching is a follower-optimal matching: a stable matching in which each follower receives their best achievable partner.

When preferences are strict, all executions of leader-oriented GS return the leader-optimal matching.
Can we prove this? You bet...

Proof

So our claim was “leader-oriented GS’ return leader-optimal matchings”.

To do a proof by contradiction, we need to consider “a leader-oriented GS does not return a leader-optimal matching”.

Leaders propose in decreasing order of preferences, so therefore some leader is going to be rejected by an achievable partner during GS (this doesn’t always happen, but it happens most of the time. We’ll consider it in this example).

Let’s say we have some leaders and some followers, and we’re running a GS.

All of the sudden, some leader is rejected by their first achievable partner (this is the first ever rejection by an achievable follower in this run. Remember that)!

Since and are achievable, there must be some stable matching that contains the matching .

Now, when rejected , why do you think did that? It’s because was proposed by some leader they prefer! We’ll call this leader .

We now know prefers to

In our ‘other’ stable matching, , let’s say is matched to some other follower, .

When was rejected by , hadn’t been rejected by any achievable partner yet. That’s because we’ve already defined the rejection of by to be the first in this run (remember? ;D).

So, had never been rejected yet, before they proposed to and made reject .

If had never been rejected yet, then they must prefer over .

Why? Because if they preferred , would’ve proposed to first, and would have had to be rejected by in order to propose to in the first place!

We now know prefers to

Let’s summarise what we’ve found out so far:

is supposed to be a stable matching. Can you spot the contradiction?

There is a blocking pair: , but is meant to be a stable matching. A clear contradiction!

Therefore, leader-oriented GS’ must return leader-optimal matchings.

Because of this, we now know that:

When preferences are strict,

all executions of leader-oriented GS

return the same stable matching.

When preferences are strict, there exists:

a leader-optimal stable matching
a follower-optimal stable matching

We have leader-optimal and follower-optimal, but we also have:

Leader-pessimal matching: a stable matching in which each leader receives their worst achievable partner
Follower-pessimal matching: a stable matching in which each follower receives their worst achievable partner

Theorem

Myerson (1981)

When preferences are strict, the leader-optimal matching is the follower-pessimal matching, and vice versa.

We’ve already said that when preferences are strict, there exists:

a leader-optimal stable matching and
a follower-optimal stable matching.

But what about when there’s ties, in SMTs?

In SMTs, there might not be any leader-optimal or follower-optimal stable matchings.
Here’s an example:

Are there any leader-optimal matchings?

No: prefers purple, but prefers green

Are there any follower-optimal stable matchings?

No: prefers green, but prefers purple

So, there aren’t any leader or follower optimal matchings.

DS Truthfulness

In GS, is it a dominant strategy to report truthful preferences?
It depends which version of GS you’re using.

Theorem

Dubins and Freedleader; Roth

The mechanism that yields the leader-optimal stable matching (in terms of stated preferences) makes it a dominant strategy for each leader to state their true preferences.

Similarly, if we’re using a mechanism that yields the follower-optimal stable matching, then followers should state their true preferences.
If we have ties in the preferences, but a leader-optimal stable matching exists, then leader-oriented GS is DS truthful for leaders.

So, leaders should tell the truth for leader-oriented, and followers should tell the truth for follower-oriented. Seems simple enough.
But that means followers can misreport with leader-oriented, and leaders can misreport with follower-oriented!

Theorem

Roth, 1982

No stable matching mechanism exists for which truth-telling is a dominant strategy for every agent.

So no matter what mechanism you use here, someone can benefit from lying:

Theorem

Roth and Sotomayor, 1990

When any stable mechanism is applied to a marriage market in which preferences are strict and there is more than one stable matching, then at least one agent can profitably misrepresent their preferences, assuming others tell the truth.

Can we prove the first theorem?
Why would I say that if we couldn’t? Of course!

Proof

Here, we have two leaders and two followers:

This instance has two stable matching:

Green, leader-oriented
Purple, follower-oriented

Every stable matching mechanism must choose either green or purple.

Suppose that our mechanism picks green.

Now, if changes their preferences so they only like , the mechanism will pick purple: the follower-oriented matching that likes.

Similarly, if declares that they only like , the mechanism can only pick green: the leader-oriented matching that likes.

In this scenario, at least one agent benefits by lying.

That proof is cool and all, but it assumes we can have incomplete lists.
It’s a proof for SMI, not SM!
Is there a proof, like this, but for general SM?
Yeah; it’s on the Coursework 1 answers PDF.
What? You’re too lazy to open it up and you want to read it here?
Nngh... it’ll take a while to write.
Fine. But only because you asked so nicely!

Here’s yet another way to word the same theorem:

“No stable matching mechanism exists for which it is always an equilibrium for every agent to state their true preferences”.

So, that sucks. No matter what, people can lie.
Who do we trust?
Does that mean we can never apply to universities again?
Residents can never apply to hospitals?
Men and women can never be truthfully matched up?
Not so fast!

Roth and Peranson found that, in practice, very few students and hospitals can actually benefit by submitting false preferences.

For example, in 1996, out of 24,749 applicants, only 21 of them could affect their match by lying.
That’s only around 0.08%!

So, it’s not the end of the world:

Theorem

Immorlica and Mahdian, SODA 2005

The number of students that have more than one achievable hospital vanishes. Therefore, with high probability, the truthful strategy is a dominant strategy.

Here’s a better explanation of that theorem:

Let’s say there’s students applying to universities.
Each student has an arbitrary complete preference list.
Each university has a random list of students.
The expected number of students who have more than one achievable university is bounded by a constant that only depends on , and not .
Therefore, as gets larger and larger, the proportion of such students gets smaller and smaller, until it’s practically zero.

Here’s another brain teaser:

If a follower misreports their preferences to benefit themselves, would their misreport harm any of the other followers?

The answer lies here:

Theorem

Gonczarowski and Friedgut, 2013

In an instance of SM (no ties or indifference), and assuming that lying is only possible by permuting one’s preferences: if no lying follower is worse off then

No follower is worse off
No leader is better off

In other words, if there are followers who lie (and lie well), none of the other followers suffer from it, but none of the leaders will benefit from it either.
So, the only people who will suffer detriment from lying followers will be the leaders.

Hospital Residents Problem (HR)

There’s different variants, or extensions, of the stable matching problem:

Problem	Description
SMI Stable Matching problem with Incomplete lists	Agents may declare some candidates unacceptable, represented by preference lists of only some choices
SMT Stable Matching problem with Ties Shin Megami Tensei	Agents may be indifferent among several candidates
SMTI Stable Matching problem with Ties and Incomplete lists	Both incomplete lists and indifferences are allowed
HR Hospitals / Residents problem	Many-to-one stable matching problem with incomplete lists
HRT Hospitals / Residents problem with Ties	Many-to-one stable matching problem with ties and incomplete lists

In this section, we’ll focus on HR and HRT!

The namesake comes from junior doctors applying

and doing training in hospitals..

First off, what actually is the Hospital Residents problem (HR)?

We have a set of residents:
We have a set of hospitals:
We want to match up residents to hospitals by their preferences.
and find each other acceptable if they are in each others’ preference lists.
Each hospital has a capacity.
Residents rank hospitals with strict preferences, and hospitals do the same with residents.
We assume unacceptability is mutual: if finds unacceptable, then finds unacceptable.

The difference between HR and SM is that hospitals have a capacity.
This makes SM a subset of HR (SM is just HR with capacity 1).

A matching is a set of resident-hospital pairs such that:

find each other acceptable
No resident appears in more than one pair
No hospital appears in more pairs than its capacity

Here’s an example setting:

An example matching of this setting could be:

Like before, matchings can be stable if they have no blocking pairs.
is a blocking pair if:

and find each other acceptable
Either:

is unmatched in or
prefers to their assigned hospital in

Either:

is undersubscribed in or
prefers to its worst resident assigned in

Here’s three examples of blocking pairs:

is a blocking pair because prefers to , and prefers to .
is a blocking pair because is unmatched, and prefers to .
is a blocking pair because is unmatched, and is unmatched.

An example of a stable matching would be:

Resident-oriented GS (RGS)

So how do we find stable matchings?
There’s two: one of which is resident-oriented GS (RGS).

Resident-oriented Gale-Shapley algorithm pseudocode

M = ∅ // assign all residents and hospitals to be free

while (some resident ri is unmatched and has a non-empty list)

ri applies to the first hospital hj on their list

M = M ∪ {(ri,hj)}

if (hj is over-subscribed)

rk = worst resident assigned to hj

M = M \ {(rk,hj)} // rk is set free

if (hj is full)

rk = worst resident assigned to hj

for (each successor rl of rk on hj’s list)

delete rl from hj’s list

delete hj from rl’s list

output the engaged pairs, who form a stable matching

Again, there’s an example in the Appendix section.

When preferences are strict:

All executions of resident-oriented GS (RGS) return the resident-optimal (stable) matching (residents get their best possible choices).
All executions of hospital-oriented GS (HGS) return the hospital-optimal (stable) matching (hospitals get their best possible choices).
The hospital-optimal matching is the resident-pessimal matching, and vice versa (residents get their worst possible choices).

In other words, whether the residents or the hospitals get the better choices depend on which algorithm you’re using: RGS or HGS.
However, in both cases, you still get stable matchings.

Yes, as you’ve probably guessed, since there’s a resident-oriented GS, there exists a hospital-oriented GS too.

Hospital-oriented GS (HGS)

Here is the pseudocode for hospital-oriented GS (HGS):

Hospital-oriented Gale-Shapley algorithm pseudocode

M = ∅ // assign all residents and hospitals to be free

while (some hospital hi is under-subscribed and has non-empty list)

si = qi - |M(hi)| // the number of available seats in hi

hi applies to the first si residents on its list

foreach (rj that hi applies to)

if (rj is free)

M = M ∪ {(rj,hi)}

else if (rj prefers hi to her current hospital h’)

M = M ∪ {(rj,hi)} \ {(rj,h’)} // h’ is set free

if (rj is not free)

for (each successor hl of M(rj) on rj’s list)

delete hl from rj’s list

delete rj from hl’s list

output the engaged pairs, who form a stable matching

You guessed it; there’s an example in the Appendix section.

Here’s a comparison of the resident-optimal and hospital-optimal matchings:

Resident-optimal matchings and hospital-optimal matchings

Can you see how the resident-optimal matchings give crap pairings for the hospitals, and the hospital-optimal matchings give crap pairings for the residents?

Theorems

Theorem

Rural Hospitals Theorem

(Roth, 1984; Gale and Sotomayor, 1985; Roth, 1986)

In a given instance of HR:

The same residents are assigned in all stable matchings.
Each hospital is assigned the same number of residents in all stable matchings.
Any hospital that is under-subscribed in one stable matching is assigned exactly the same set of residents in all stable matchings.

Here’s some more ‘layman’ explanations:

If you get a HR stable matching where a subset of the residents get assigned hospitals and the rest do not get assigned, then for all stable matchings in that HR problem, all residents in will be assigned and all other residents will not be assigned.
For all stable matchings in our HR problem, the number of residents assigned to hospital is the same.
Given any stable matching in our HR problem, if we have a hospital who has matchings with and is undersubscribed, then all stable matchings have the exact same matchings with for .

Remember DS truthfulness?
Could residents and/or hospitals be better off if they lie?
These theorems have the answer:

Theorem

(Roth; 5.16 in TSM)

A stable matching mechanism that yields the resident-optimal matching makes it a dominant strategy for all residents to state their true preferences.

In other words, under strict preferences, RGS is DS truthful for residents.

Theorem

(Roth; 5.14 in TSM)

No stable matching mechanism exists that makes it a dominant strategy for all hospitals to state their true preferences.

Even when HGS is executed and preferences are strict, some hospitals may benefit from misreporting their preferences.

What about if we have ties (HRT)? Would that make a difference?
Then, hospitals could be indifferent to certain residents:

The thing is, HRT may not have hospital-optimal OR resident-optimal matchings.

A matching is stable in an HRT instance if and only if is stable in some instance of HR obtained from by breaking the ties [Manlove et al, 1999].

Hospital Residents Problem with Couples (HRC)

Pairs of residents who wish to be matched to geographically close hospitals form couples.
Couples are pairs of residents, , that rank pairs of hospitals representing the assignment of to and to .
So, couples are like ‘doubles’ for residents and hospitals. Rhyming not intended.
HRs with couples are called HRCs.

We can introduce a notion of stability when we have couples, but we won’t go into that here.

Unlike HRs, HRCs might not have stable matchings. Here’s an example:

(r1,r2): (h1,h2)

r3 : h1 h2

h1: r1 r3 r2

h2: r1 r3 r2

Blocking pair:

(r1,h1)

(r1,r2): (h1,h2)

r3 : h1 h2

h1: r1 r3 r2

h2: r1 r3 r2

Blocking pair:

(r3,h1)

(r1,r2): (h1,h2)

r3 : h1 h2

h1: r1 r3 r2

h2: r1 r3 r2

Blocking pair:

(r3,h2)

Those are all the combinations!

Solving HRCs is hard. It’s NP-complete!
Even when hospitals have capacity 1, it’s NP-complete.
Even when:

there are no single residents

[Ng and Hirschberg, 1988;

Ronn, 1990]

there are no single residents, and
each couple has a preference list of length <= 2, and
each hospital has a preference list of length <= 3

[Manlove and McBride, 2013]

the preference list of each single resident, couple and hospital is derived from a strictly ordered master list of hospitals, pairs of hospitals and residents respectively, and
each preference list is of length ≤3, and
the instance forms a “dual market”

“What does that mean”, you may ask. I don’t know. I just write the notes.

[Biró et al, 2011]

[Manlove and McBride, 2013]

There is an algorithm for HRCs, but it’s not examinable.
It’s called Algorithm C, and uses a Gale-Shapley-like heuristic.
What? No! It wasn’t made by Dennis Ritchie. Come on now.

When a member of an assigned couple is rejected, their partner must withdraw from their hospital as well.
This means residents previously rejected by that hospital must be reconsidered again.
You can imagine how that’d get messy quick...
In fact, Algorithm C might not even terminate. It doesn’t terminate if there isn’t a stable matching.
Heck, it might not terminate even when there is a stable matching.
However, if it does terminate, that means it’s guaranteed to have found a stable matching.

Stable Roommates (SR)

In this section, we’ll be going over the Stable Roommates (SR) problem:

We have agents (or students)
Each agent has strict preferences over all other agents
One agent can pair up with any one other agent (and be roommates).
A matching is stable if there is no such that prefers to their current partner and vice versa.

So, this is basically SM, but all agents are equal; there’s no disparity between ‘leaders’ or ‘followers’.

Here’s an example of an instance of SR:

We have 4 students, and 2 matchings: and .
However, there is a blocking pair: !
You can see it’s a blocking pair because:

In A’s preferences, C is preferred to the chosen partner, B
In C’s preferences, A is preferred to the chosen partner, D

So, what would be a stable matching?
This:

Unlike SM, there might not be a stable matching in SR.
An example is here:


Let’s get all the matchings and show that none of them are stable.

Blocking pair:	Blocking pair:	Blocking pair:

Irving’s Algorithm

Irving’s Algorithm can, for a given instance of SR, decide whether a stable matching exists, and if it does, it’ll find one.
It’s polynomial time, , and works in two phases:

Reduces the preference lists to a ‘phase-1 table’
Gets the stable matching, if it exists, by eliminating ‘rotations’

The first concept we should cover before going into this algorithm is semi-engagement.

In SM, if is engaged to , then that should mean is engaged to .

However, in SR, engagement is not symmetric!
If is engaged to , is not necessarily engaged to .

If you are my friend, that doesn’t necessarily imply that I am your friend too.
So, we say that is semi-engaged to .
could either be:

free

semi-engaged to some other agent

also semi-engaged to (making them fully engaged)

A stock image of semi-engaged roommates.

Phase 1

Phase 1 reduces the preference table into a phase-1 table.
A phase-1 table is simply the result of phase 1 of Irving’s Algorithm: a subset of the preference table with agents removed.

Here’s the pseudocode for phase 1:

Irving’s Algorithm Phase 1 pseudocode

Assign all agents to be free // initial state

while (some free agent x has a non-empty list)

y = first agent on x’s list

// next: x proposes y

if (some person z is semi-engaged to y)

assign z to be free // y rejects z

assign x to be semi-engaged to y

for (each successor w of x on y’s list)

delete w from y’s list

delete y from w’s list

After phase 1, we can reduce a list of preference lists:

... into a phase-1 table by eliminating agents:

What? You wanted to know what those red and dark red rectangles are for?

means X is semi-engaged to Y.
means Y is being ‘semi-engaged’ by X.

To see the process of this, check out the example in the Appendix.

If there exists an agent who got rejected by everyone else (an agent with no preferences in our phase-1 table), then there is no stable matching.
If every agent holds a proposal, then we can move on to phase 2.

Phase 2

In phase 2, we reduce the phase-1 table by eliminating ‘exposed rotations’.
We keep reducing the table until:

All lists contain just one entry, which constitutes the stable matching, or
At least one list becomes empty, in which case there is no stable matching.

The intuition is that the existence of a stable matching is preserved by removing exposed rotations.
So, no matter how many rotations we remove, the stable matchings will remain intact in our table.
Like how iterated elimination of strictly dominated strategies is for Nash equilibria!

What actually is an exposed rotation?
It’s a pair of sequences where:

is the second entry on ’s list, and
is the last entry on ’s list

To eliminate a rotation:

delete from ’s list
delete from ’s list

Intuitively, how do you find a rotation in a phase-1 table?

Alright, here’s a phase-1 table, and I’ll walk through how to find a rotation in it.

Let’s start at the top, with A, because A has a second preference:

p: A

A’s second preference is B, so we add B to the ‘q’ list:

p: A

q: B

Now, we look at B. B’s last preference is C, so we add C to the ‘p’ list:

p: A C

q: B

We move on to C. C’s second preference is D, so we add D to the ‘q’ list:

p: A C

q: B D

Onto D. D’s last preference is A, so we add A to the ‘p’ list:

p: A C A

q: B D

We’ve already got an A in the ‘p’ list! So, we stop there, or else we’d be repeating ourselves.

So, you see? We just alternate between putting the second preference in ‘q’ and the last preference in ‘p’ until we hit a cycle.

If we wanted to eliminate this rotation, we’d eliminate the bottom-left top-right pairs in these two sequences:

p: A C A

q: B D

So, we’d remove and from our preference table and either look for more rotations to eliminate, or, if a list is empty or all lists have one entry, we stop.

Here’s the pseudocode for phase 2:

Irving’s Algorithm Phase 2 pseudocode

T = phase-1 table // initial state

while ((some list in T has more than one entry) and

(no list in T is empty))

find a rotation exposed in T

T = T / // eliminate

if (some list in T is empty)

return instance unsolvable

else

return T, which is a stable matching

To see phase 2 in action with the phase-1 table above, check out the Appendix.

Phase 1 always returns the same phase-1 table...
... but phase 2 doesn’t always return the same stable matchings.
The stable matching phase 2 returns depends on the order of rotation eliminations.

Is Irving’s Algorithm DS truthful?

No.
Do you remember what we said about SMs?

“No stable matching mechanism exists for which truth-telling is a dominant strategy for every agent.” - Roth, 1982

SMs are effectively specialised versions of SR.
The only difference is that SMs have the extra restriction that leaders (one partition of the agents) can only have preferences for followers (the other partition of the agents) and vice versa.
You could say the set of SM problems is a subset of the set of SRI problems.
If there’s no DS truthful mechanisms for SM, then by extension, there can’t be any DS truthful mechanisms for SR, either.
If there are no DS truthful SR mechanisms, then Irving’s Algorithm, one such possible SR mechanism, also can’t be DS truthful.

What happens when we input an SM problem into this SRI mechanism?

Our resulting phase-1 table would be:

Red: Man-optimal matching
Blue: Woman-optimal matching
Well, why don’t we do phase 2 and see what happens?

Starting our rotation with m1

Starting our rotation with w1

p: m1 m2 m1

q: w1 w2

Eliminated:

p: w1 w2 w1

q: m2 m1

Eliminated:

As you can see, depending on the order of rotation eliminations, we can either get the man-optimal matching or the woman-optimal matching.

To see extensions of the Stable Roommate problem, check out the Appendix section on that!
To see content on the size vs stability content, that’s also in the Appendix.

There’s so many matching algorithms... how do we remember them all?
I created a subset tree that shows the relationship between all the matching algorithms we’ve seen so far!
Do you know where to find it? That’s right; in the Appendix!

One-Sided Matching

In this section, we’re going over One-Sided Matching (OSM)!
It’s like the other matching problems, but...
Individuals in one group have preferences over the other, but not the other way around.
For simplicity’s sake, we’ll assume all OSM settings have incomplete lists.
OSMs are equivalent to SMTI problems where all items have indifferent preferences to all agents (they tie with everyone).

Möbius strip matching: an alternative title.

Some examples we’ll cover are:

One-Sided Matching with Initial Ownership:

House Allocation (HA)
Student Rooms Allocation (SRA)

One-Sided Matching without Initial Ownership:

Teaching Allocation (TA)
(There’s this ‘eating’ one, but there’s no name for it on the slides, so I’ll call it Food Allocation (FA)).

With Initial Ownership

These two algorithms are for one-sided matching problems where the agents already own items, but want to swap with other agents to increase their utility.

We could do Random Serial Dictatorship?

Each tenant decides if they want to take part
A random ordering of the agents is decided
Serially give each tenant the house they like best, but is also still available.

Hmm... good, but there’s a problem.
Tenants are not guaranteed to get at least as good a house as their current one.
Therefore, they might not want to enter the lottery as they might end up being worse for participating.
It’s not individually rational!

So, we want a good algorithm for this. An algorithm that exhibits:

Individual rationality
Pareto efficiency (no changes can be made without at least one agent being worse off)
Strategy-proofness

Top Trading Cycle (TTC)

Let’s introduce the House Allocation Problem (HA).

We have agents, and houses.
Each agent owns a unique, indivisible house.
The objective is to reallocate the houses among the agents to improve the agents’ utility.

Would you take part in a mechanism to

swap your house with someone else?

The algorithm to do this is called Top Trading Cycle (TTC).

The algorithm goes like this:

Each agent points to their most preferred house (possibly their own), and each house points back to their owner.
This creates a directed graph; in this graph, identify cycles

Finite: cycle must exist
Strict preferences: each agent is in at most one cycle

Assign each agent in a cycle to the house they are pointing at and remove them from the mechanism with their assigned house.
If:

all houses are assigned, or
all agents are assigned, or
all preference lists are empty
... then stop.

Otherwise, repeat from step 1.

Here’s a quick example:

	Here’s the preference lists: a1 : h3 > h2 > h1 a2 : h1 > h4 > h2 a3 : h1 > h4 > h3 a4 : h3 > h4 The blue represents ownership, for example a1 owns h1. On the diagram, the green arrows are the houses pointing to their owners, and the blue arrows are the agents pointing to their desired houses.
	Now, we search for cycles. We start from one agent, and we follow through the arrows until we get back to the start. Starting at a1, the following cycle can be found: a1 → h3 → a3 → h1 → a1 So, we make the following assignment: a1 → h3 a3 → h1 We remove these agents and houses from the algorithm, and we continue.
	The houses a2 and a4 were pointing to are now allocated! So now, they point to the next available houses on their list. Now, let’s look for another cycle: a2 → h4 → a4 → h4 The cycle only loops around a4 and h4, so we’ll only assign a4 the house: a4 → h4 Now, we remove a4 and h4.
	It’s trivial that the only assignment left is to assign a2 to their own house! So our final assignment is: a1 → h3 a3 → h1 a4 → h4 a2 → h2

TTC is DS truthful.
This is a proof sketch of that:

Let’s say there’s some agent who got assigned a house in round .
No change in their report can yield them a house that was assigned in the earlier rounds (this actually requires more proof, but we’ll hold off on that).
No house assigned in a later round will make them better off

If our agent changes their preferences so they get a house in round , will they get a better house? No! They either get the same one they would’ve got in round , or a worse one because their old house would have already been taken.

So there’s no benefit in reporting any other preferences but the truth.

TTC allocations are always Pareto optimal.
The slides prove this by induction:

The agents with priority one (the agents in the first cycle) get their first choice, so they can’t improve their assignments.
The agents with priority two (the agents in the second cycle) can’t improve their assignments, because they’re already getting their best available houses.
And so on...

Intuitively, you can think of TTC as a type of Serial Dictatorship.
Instead of agents being picked randomly, like in Random Serial Dictatorship, agents are picked through these cycles.
Serial Dictatorship is Pareto efficient, so TTC must be too.

If a subset of the agents can make all of its members better off by exchanging houses, then the TTC allocation would be unstable.
However, allocations returned by TTC are stable; there never exists any such subset. This is called a core allocation, and they’re always returned by TTC.
In addition, for any HA problem, there is only one unique core allocation, and TTC returns it.

You Request My House - I Get Your Turn (YRMH-IGYT)

Let’s introduce the Student Rooms Allocation (SRA) problem.

We have a set of students that have rooms, .
We have a set of newcomers .
We have a set of vacant rooms .
Our problem is the same as HA: can we reallocate rooms, including the newcomers, that increases everyone’s utility?

We use the algorithm with the dumbest algorithm name in history: You request my house, I get your turn (YRMH-IGYT).

Do you want another weird algorithm? Try Stalin Sort.

It goes like this:

Fix a priority order of the agents.
Let the agent with the top priority receive their first-choice room, the second agent receive their top choice among the remaining rooms, and so on, until someone requests the room of an existing tenant.
If the existing tenant whose room is requested has already received a room, then proceed to the next agent.

Otherwise, insert the existing tenant at the top of the priority order and proceed with the procedure.

If at any step, a cycle forms, the cycle is formed by existing tenants where points to the house of agent , who points to the house of , and so on. In this case, assign these houses by letting them exchange, and then proceed with the algorithm.

This is a bit of a hefty algorithm, so there’s an example of it in the Appendix.

YRMH-IGYT is a variant of TTC.

It’s just that all vacant houses (and houses whose initial owners are already assigned other houses) point to the highest priority agents rather than the owners of the houses.
So, we sometimes call this algorithm TTC as well.
Huh? Well, I personally don’t, but I guess some people call it TTC.

Does it have good properties?
You bet! It’s Pareto efficient, strategy-proof, and makes no existing tenant worse off.

Without Initial Ownership

These two algorithms are for when agents don’t own any items, but want to be allocated to items that maximise everyone’s utilities.

Random Serial Dictatorship

Let’s introduce the Teaching Allocation (TA) problem:

We have agents (teachers), and items (subjects).
Each agent has a preference over the items.
We want to assign the agents to items.
In other words, split up the workload of teaching, and assign teachers to subjects.

Those that can’t do, teach.

Those that can’t teach, teach gym.

Remember that Random Serial Dictatorship algorithm I mentioned above?
Since teachers aren’t already assigned to subjects, we don’t have to worry about individual rationality (teachers getting a new assignment that’s worse than their old one).

Random Serial Dictatorship (RSD), or Random Priority, is just a small extension of Serial Dictatorship (SD):

Fix an ordering of the teachers a1, a2, ..., an (with Random Serial Dictatorship, this ordering is randomised).
a1 picks their most preferred subject.
a2 picks their most preferred subject that is still available.
a3 picks their most preferred subject that is also still available.
This continues until all teachers are assigned subjects.

Simple, right?
There’s a bit more to this:

There are different orderings that RSD randomly picks out of.
Each ordering leads to a different run of SD, which leads to a different outcome (some orderings may have the same outcome, but that’s fine).
So, we could say that there are different outcomes to pick from.
That’s a lot! How are we going to know, roughly, what everyone’s going to get?
That question would be a lot easier to answer if we had the probability distribution of teachers to subjects using RSD:

We go through every order of the agents, run SD, and denote the allocation matrix of each order as , for example (1 means that subject is allocated to that teacher):

Ai =		s1	s2	s3
	t1	0	1	0
	t2	1	0	0
	t3	0	0	1

Once we have all the allocation matrices of all orders, we can aggregate them into an average, with this equation:

The resulting matrix is the probability distribution of teachers to subjects.

You want an example? Fine... but only because I like you so much!

Serial Dictatorship is Pareto efficient.

Proof by induction:

The first teacher receives their best subject, so they cannot improve.
The second teacher receives their best possible subject, so they cannot improve, and so on.

Serial Dictatorship is DS truthful.

Proof by induction:

The first teacher gets their best subject, so they have no reason to lie.
The second teacher gets their best possible subject, so if they lied, they’d only get worse, and so on.

Serial Dictatorship is not envy-free.

Let’s say you have two teachers, t1 and t2, and they both want subject s1 over subject s2.
SD gives s1 to t1, and s2 to t2.
t2 is now envious of t1!
This applies to RSD as well, because either t1 or t2 is gonna get s1, so one will always be jealous of the other.

You can think of RSD as like a ‘probability distribution’ over all mechanisms with different permutations of teachers.
The probability distribution of this randomised algorithm is called its lottery, and would draw any mechanism with equal chance.
What does this have to do with anything?
I’ll tell you...

Ex-ante Pareto efficiency

Ex-post Pareto efficiency

The lottery is Pareto-efficient with respect to the expected utilities.

By ‘expected utilities’, we mean the average utility that the agents yield with our random lottery.

If there is no other lottery we could use that yields higher average utilities and no lower average utilities, then our algorithm is ex-ante Pareto efficient.

‘ex-ante’ here means ‘before the lottery’.

Allocations of all mechanism ‘outputs’ of the lottery are Pareto efficient.

If the allocations of every mechanism (that our lottery could possibly draw) are Pareto efficient, then our algorithm is ex-post Pareto efficient.

‘ex-post’ here means ‘after the lottery’.

Ex-ante Pareto efficiency implies ex-post Pareto efficiency.
An interesting thing to note is that:

You can only reason about ex-ante Pareto efficiency with cardinal preferences.
You can reason about ex-post Pareto efficiency with just ordinal preferences.

RSD is ex-post Pareto efficient, but not ex-ante Pareto efficient.
Why?

Why RSD is not ex-ante

Why RSD is ex-post

To be ex-ante Pareto efficient, there must not be another lottery that yields at least higher expected utilities to all teachers.

So, let’s define a lottery that fits that criteria!

Let’s define some setting for TA, and say there’s two possible mechanisms RSD’s lottery could pick: M1, and M2. There’s three teachers, and the utilities of each allocation yielded by each mechanism are (a,b,c) and (x,y,z), respectively.

We’ll assume that a, b, and c are bigger than x, y, and z.

So, the expected utility of each teacher using RSD’s lottery is (a+x)/2, (b+y)/2, and (c+z)/2, because there’s an even chance of picking M1 and M2.

Can we find a lottery that yields at least higher expected utilities?

How about a lottery that has a 100% chance of picking M1, and a 0% chance of picking M2?

The expected utilities there are a, b, and c.

Since a > x, b > y, and c > z, then a > (a+x)/2, b > (b+y)/2, and c > (c+z)/2. This lottery yields a higher expected utility for all teachers!

Therefore, RSD is not ex-ante Pareto efficient.

Remember that normal SD is Pareto efficient.

Once a random permutation of teachers is picked, the algorithm is now just regular SD.

So, no matter what the lottery picks, every mechanism outcome will be Pareto efficient.

Therefore, RSD is ex-post Pareto efficient.

So there’s two versions of Pareto efficiency when randomness is involved.
Did you know there’s also two versions of truthfulness?

Truthfulness-in-Expectation

Universal Truthfulness

A teacher always maximises their expected utility by acting truthfully.

If a teacher will always yield the same or less expected utility by giving false preferences, then the algorithm is truthful-in-expectation.

All mechanism ‘outputs’ of the lottery are DS truthful.

If every mechanism (that our lottery could possibly draw) is DS truthful, our algorithm has universal truthfulness.

RSD is both universally truthful and truthful-in-expectation.
Why?

Why RSD is truthful-in-expectation

Why RSD is universally truthful

SD is DS truthful, meaning you already maximise utility when you are truthful in SD.

Because any draw of the lottery turns RSD into an SD, being truthful will maximise your utility with any possible draw of the lottery, and therefore will maximise expected utility.

Therefore, RSD is truthful-in-expectation.

Remember, SD is DS truthful.

Any draw of the lottery turns RSD into a normal SD.

So no matter what the lottery does, RSD will always turn into a regular SD.

Therefore, RSD is DS truthful.

Now, let’s talk about fairness.

RSD, and the algorithm that follows, are anonymous.
Anonymity is a fairness property where two agents with the same preferences get the same probability distribution of item assignments.

The above definition is Gale’s, as Zhou mentioned in his paper.

Symmetry, or equal treatment of equals, is a fairness property where two agents with the same preferences get the same expected utility.

Again, this definition comes from Zhou's paper.
This is a weaker fairness property than anonymity, because:

Same probability distribution of items implies same expected utility
Same expected utility does not imply same probability distribution

Because RSD is anonymous, it’s also symmetric.

To conclude this section about RSD, here’s a theorem by Zhou:

Theorem

(Zhou, 1990)

There does not exist an ex-ante Pareto efficient, truthful, and symmetric mechanism.

💡 NOTE

Jie gave some confusing definitions of anonymity and symmetry in the lectures that don’t quite add up to the Zhou paper.

He says that anonymity is where the names of the teachers have no effect on the results, and only the preference order should affect the outcome of an agent.

He says that symmetry is where agents with the same preference profile get the same allocations.

I’m not saying Jie is wrong 😅 I’m saying I don’t understand this discrepancy between these two pairs of definitions. If any of you know, could you please comment here and explain? 🥺🙏

Probabilistic Serial Mechanism

Let’s introduce the Food Allocation (FA) problem:

We have hungry people, and dishes of divisible food.
Everyone has preferences over the food available.
How can we divide the food between everyone to maximise utility?

So, instead of finding probability distributions of teachers to subjects, we’re now dividing food between hungry people. When thinking about it that way, it’s effectively the same problem.
I just wanted a food-related problem to talk about.

We could use the Probabilistic Serial (PS) Mechanism to solve FA.

It’s defined by the Simultaneous Eating Algorithm, which involves all the agents simultaneously “eating” their favourite available food at a uniform speed, and moving on to their next most preferred food whenever their current food is fully eaten.
Each food takes the same amount of time to eat, and the proportion of food each agent eats is their allocation for that respective food item.

This is a bit hard to visualise with text, so I’ve put an example in the Appendix.

PS is not DS truthful.
Why?

Let’s provide a counter-example.
We have three agents: Alice, Bob, and Carol, and three foods: Paopu Fruit, Sea-salt ice cream, and Brock’s 'jelly-filled doughnuts'.
Alice’s cardinal preferences are (0.9, 1, 0) for the fruit, ice-cream, and doughnuts respectively.
Ordinally, here’s everyone’s preferences:

Alice: Ice-cream > Fruit > Doughnuts
Bob: Fruit > Ice-cream > Doughnuts
Carol: Fruit > Doughnuts > Ice-cream

The result of PS with these preferences is:

	fF	fI	fD
aA	0	3/4	1/4
aB	1/2	1/4	1/4
aC	1/2	0	1/2

Alice’s utility here would be (¾)(1) + (¼)(0) = ¾.
Now, let’s say Alice lies and changes their preferences to:

Fruit > Ice-cream > Doughnuts

The new result of the PS would be:

	fF	fI	fD
aA	1/3	1/2	1/6
aB	1/3	1/2	1/6
aC	1/3	0	2/3

Now, Alice’s utility is (⅓)(0.9) + (½)(1) + (⅙)(0) = 0.8
0.8 is bigger than ¾, so Alice benefitted from lying!
Therefore, PS is not DS truthful.

Nothing beats a jelly-filled doughnut.

However, despite not being DS truthful, PS is still useful for something:

Theorem

Bounded Incentives in Manipulating the Probabilistic Serial Rule

No agent can misreport its preferences such that its utility becomes more than 1.5 times of what it is when reported truthfully. This ratio is a worst-case guarantee by allowing an agent to have complete information about other agents’ reports.

So, in the worst case, an agent can only increase their utility by a factor of 1.5.
Even then, the agent would have to know the other agents’ preferences.
So PS may still be useful, in practice.

To conclude this section, here’s the algorithms we’ve looked at and their properties:

Name	Pareto efficiency	Incentives	Fairness
Random Serial Dictatorship (Random Priority)	Ex-post Pareto efficient	Truthful	Anonymous
Probabilistic Serial	Ordinal efficient	Not truthful	Anonymous Envy-free
Pseudo-market (not examinable)	Ex-ante Pareto efficient	Not truthful	Anonymous Envy-free

What’s ordinal efficiency?

Not on the slides, so not examinable, I think

A mechanism is ordinally efficient if the outputted random assignment (probability distribution from agents to items) is ordinally efficient.

A definition by Bogomolnaia and Moulin (that was made readable by Abdulkadiroglu and Sonmez) is: a random assignment is ordinally efficient if they are not stochastically dominated by any other possible random assignment.

What does it mean for a random assignment to stochastically dominate another?

It means that, for every agent , and every item that ordinally prefers over every item , the stochastically dominating random assignment has a higher or equal chance of drawing than .
In addition, there must be at least one item where there is a strictly higher chance of drawing than .
Source: Pycia, 2011

Let’s demonstrate this with an example. We have four agents, and four items:

a1, a2: a > b > c > d
a3, a4: b > a > d > c

RSD returns the following lottery (random assignment):

	a	b	c	d
a1	5/12	1/12	5/12	1/12
a2	5/12	1/12	5/12	1/12
a3	1/12	5/12	1/12	5/12
a4	1/12	5/12	1/12	5/12

However, every agent prefers the lottery returned by PS:

	a	b	c	d
a1	1/2	0	1/2	0
a2	1/2	0	1/2	0
a3	0	1/2	0	1/2
a4	0	1/2	0	1/2

The PS lottery stochastically dominates the RSD lottery!

Since PS returns the ordinal efficient random assignment, there is no other lottery that stochastically dominates this one.

Like ex-post, ordinal efficiency only requires ordinal preferences.

Ordinal efficiency is a stronger property than ex-post.

Ordinal efficiency is a weaker property than ex-ante.

Ex-ante implies ordinal efficiency.

Ordinal efficiency implies ex-post.

Here’s an example to show that ordinal efficiency is weaker than ex-ante:

a1, a2, a3: a > b > c

PS returns the following lottery for this setting:

	a	b	c
a1	1/3	1/3	1/3
a2	1/3	1/3	1/3
a3	1/3	1/3	1/3

Now, in terms of purely ordinal preferences, this is efficient.

But what if we got more specific and added cardinal preferences compatible with our ordinal ones?

u1 = (1, 0.8, 0)
u2 = (1, 0.2, 0)
u3 = (1, 0.2, 0)

According to these cardinal preferences, everyone’s expected utilities for the above lottery are (0.6, 0.4, 0.4).

However, I can make a new lottery with higher expected utilities:

	a	b	c
a1	0	1	0
a2	1/2	0	1/2
a3	1/2	0	1/2

With this new lottery, everyone’s expected utilities are (0.8, 0.5, 0.5). Higher than our ‘ordinally efficient’ lottery from before!

Secretary Problem

Throughout all these matching problems, we’ve assumed that all participants are present at the same time, and the game is single-shot (offline).
But what if new participants arrive at different times (online)?

Behold: The Secretary Problem!

We want to hire one new secretary, and there are applicants.
We’re going to interview them, one by one, and by doing that, we can see how good this secretary is compared to the ones we’ve already interviewed.
Once we interview an applicant, we have to either hire them, or decline them.
Our decision is irrevocable.
We don’t know how good the future applicants are until we interview them.

What is the best strategy to maximise the probability of hiring the best applicant?

How do we maximise our chances of hiring the best secretary? 🤔

Here’s a strategy we could employ:

Sequentially interview each candidate.
Reject the first candidates. Remember how good they were.
After this, accept the first candidate we meet who is better than all the first candidates we rejected.
If they’re all worse than the first candidates, then just accept the last one.

That’s a neat strategy, but which value of should we pick?
Say there’s 8 applicants. How many initial candidates should we reject before we consider hiring?
What value of maximises the chance of us hiring the best candidate?
Let’s apply a bit of maths here...

Suppose that we end up hiring candidate (where , obviously).
is the event that we accept the best candidate, which is .

This is the event where is the best candidate, and we end up picking them.

is the event that we accept the best candidate.

This is the event where:

was the best candidate, and we pick them or
was the best candidate, and we pick them or
...
was the best candidate, and we pick them or
was the best candidate, and we pick them

Remember, we want to maximise the chance of us getting the best candidate, .
We should calculate this probability!
However, depends on . We’re going to have to calculate that first.

Calculating P(Si)

When broken up, .
So, let’s calculate those two probabilities!

can be reduced to sub-problems:
How is this so?
Well, if the second best is in the first , and is the best, then nothing can be better between and .
Therefore, after we finish rejecting all , then we have no choice but to pick when we get to them, because nothing we’ll find along the way to is gonna be better than anything we found in .
The following graphic might help visualise this:

Each value is like that applicant’s ‘utility’.

We can substitute this and our goal will be:

Oops! How embarrassing... we’re repeating ourselves. is just , so:

So, what are these two probabilities?

, because if everyone has a uniform distribution of “utility”, everyone has an equal chance of being ‘the best’.

How about ?
Let’s reduce this problem from applicants to just :

What was once our second best is now our first best in this subproblem, because we’ve trimmed out the first best.
Now, we can convert into .
The probability of this is equal to:

The first candidate is the best in the subproblem or
The second candidate is the best in the subproblem or
...
The th candidate is the best in the subproblem.

Putting this into maths syntax, this is:
Therefore, .

We now have all the values to calculate our goal:

Calculating P(S)

So, we now know that .
Let’s calculate !

We use a special formula called the n-th harmonic number:

Basically, gets closer and closer to our approximated formula for as tends towards infinity.

If we take the derivative of this approximative formula, and set it to equal 0, we can find the value of that maximises .
I’m not going to do that here, but it shows that is the optimal value of to maximise .
When using the optimal value of , the probability tends to .

So, when using this strategy, the best you’re gonna get is a chance of 37% of getting the best secretary.

Optimal k For Smaller Values of n

The above solution is for when approaches infinity.
That means must be really big for those approximations to be accurate!
What if we have a small value of ? Can we easily find for those situations?

Well, there is a simple (I believe) solution that you could use.
Basically, iterate through all possible values of , calculate for each, and pick out the with the highest .
In other words, .
For example, if we had :

As you can see, yields the highest , so the optimal is 2.
This is because we’re iterating through all possible values of , and the number of possible values is , and in each of those iterations, we’re calculating , which involves summing fractions from to .

There exists a recursive algorithm that is compatible with dynamic programming.
It also iterates through all values of , but is a bit more involved with the maths.
Without DP, its time complexity is an exponential . Oof!
With DP, its time complexity is a linear . Ooh!
For a brief explanation of the algorithm in Excel, check out Tallys Yunes’ video.
For an explanation of the algorithm, pseudocode, and the time complexity explanation, check out the Appendix.

Generalised Second Price (GSP)

In this section, we’ll maximise revenue for search engines using pay-per-click.
But how do we maximise utility for advertisers?
Why not use the quasilinear utilities from back in VCG?

In other words, the thing.

If we’re using quasilinear utilities, why not just use VCG to auction ad space?
Because:

It’s too computationally expensive
Difficult to explain to customers
VCG isn’t good for revenue
Many other reasons...

Instead, we’ll use a ‘mutation’ of VCG called Generalised Second Price (GSP)!
In GSP:

Bidders with the highest prices each win a slot.
The ith highest price bidder gets the ith slot, and pays the (i+1)th highest bidding price.
The other bidders lose.

Agent	Bid	Allocation	Payment
1	£5	First slot	£4
2	£4	Second slot	£1
3	£1	N/A	N/A

A small example of GSP, selling two ad slots, with the

setting in yellow and the mechanism result in green.

In these examples, we assume everyone bids the same amount for all the slots.

What’s the difference between VCG and GSP?

In sponsored search, SEs do not give multiple slots to the same advertiser.
This is reflected in GSP, but in VCG, one agent could get multiple items in many-item auctions.
Also, VCG is a second price auction for single-item auctions.
However, for many-item auctions, VCG’s pricing isn’t necessarily the second price, as is demonstrated with the sandwich and wig example above.
However, GSP’s payment is hard-coded to always be second price.
Lastly, and most importantly, GSP is not truthful.

Google’s Weighted GSP

Google uses a weighted version of GSP, where each advertiser has a quality score based on:

Historic click frequency
Quality of the landing page
Some secret proprietary Google formula

Essentially, it’s an estimation of the advertiser’s click probability.

In Google’s auction:

The highest bidder gets the first slot, the second highest bidder gets the second slot, etc.
The highest bidder pays the second-highest effective price:

is the smallest weighted bid that our highest bidding advertiser could have bidded while still winning the first ad slot.
Bidder ’s expected utility =
Search engine’s expected revenue:

Here’s a truthful example, with two ad slots:

	Quality score	Bid (and true valuation)	Weighted bid	Allocation	Payment	Utility
Amazon	0.6	£10	£6	2	£3.33	4
eBay	0.9	£7	£6.30	1	£6.67	0.3
CEX	0.4	£5	£2	N/A	£0	0

Search engine’s expected revenue: £10.
Since GSP is not truthful, they can improve their utilities by lying, but they’ll be truthful for the sake of demonstration for now.

Click-Through-Rate

Click-through rate is the probability that a user will click on the ith slot when it is showing the jth advert.
By default, (higher-position slots are more likely to be clicked).

CTR can be separated into terms, like , where:

is the quality of slot and
is the quality of bidder .

Google uses this model, but Yahoo sets all beta values to 1.
So, with Yahoo, everyone is equal, but with Google, some are better than others. Hmm.

It is assumed that follows the Power Law.

That means, as the ad slots go from first to lower, the probability of a click radically goes down.
But as you go even lower, say from slot 10 to 11, the change in the probability of clicks becomes really small.

There’s also the cascade model, where the click probability of depends on whether or not the previous ad was clicked.

Generalised First Price (GFP)

There’s also a Generalised First Price (GFP):

The bidders with the highest prices each win a slot.
The kth highest-price bidder gets the kth slot and pays their bidding price.
The other bidders lose.

This was introduced by Overture, which is now a part of Yahoo.

It’s not very good for stability, so it’s not used by Google.

Nash Equilibria in GSP

Remember when I said GSP isn’t truthful?
How do you know? You’re just gonna take my word for it?
Let’s prove it, with this two-slot example:

First slot yields 20 clicks for an advertiser, and second slot yields 10 clicks.

Allocation	Agent	True valuation	Bid	Payment	Utility
1	Red	£5	£5	£4
2	Blue	£4	£4	£1
N/A	Green	£1	£1	£0	0

Now, what would happen if Red lies to get allocated the second slot instead of the first slot?

Allocation	Agent	True valuation	Bid	Payment	Utility
1	Blue	£4	£4	£2
2	Red	£5	£2	£1
N/A	Green	£1	£1	£0	0

Everyone weakly increased in utility!
Therefore, GSP is not DS truthful.

So, GSP isn’t truthful.
That means being truthful is not a dominant strategy.
So what is a dominant strategy? 🤔
It’s a bid profile where, for any bidder :

for any
for any

Uhh... huh?
What does that really mean?

Remember what Red did in our “GSP isn’t truthful” proof above?
Red lowered their bid so that they got the second slot, instead of the first.
In general, when advertisers lie to increase their utility, they can either:

Increase their bid to get a higher slot
Decrease their bid to get a lower slot

So, our two mathematical conditions for a Nash equilibrium means:

Our current utility must be better than if we got a lower slot
Our current utility must be better than if we got a higher slot

This means that, in a situation where these two conditions are true, no advertiser can lie about their valuation and get allocated a different slot without lowering their utility.

Is there an example of a Nash equilibrium?
You’ve just seen one!
The new situation in which Red bids £2 instead of £5 is a Nash equilibrium.
How? Let’s try all the cases and find out...

Target agent	Old allocation	New allocation	Change in utility
Red	2	1
Red	2	0
Blue	1	2
Blue	1	0
Green	0	2
Green	0	1

It seems no matter what advertiser switches to what slot, they will always get less utility! Therefore, this is a Nash equilibrium.

In practice, all this logic is done by automated bidders. I doubt actual workers in advertising care about this stuff.

Locally Envy-Free Equilibrium

It doesn’t end there!
Even in a Nash equilibrium, there’s still a way to increase your utility.
How?
Well, your payment is determined by the bid of the advertiser below you.
That also applies to the advertiser above you!
Their payment is determined by your bid.
So, when you increase your bid, their payment also increases.
What if you were to keep increasing your own bid, until the payment of the advertiser above you is so big, that it’s no longer viable for them to hold on to that slot, and it becomes better for them to lower their bid and give you that slot?
Advertisers that do this are called farsighted.

It’s possible to do this in the Nash equilibrium we had above:

If Red increases their bid, it’ll increase Blue’s payment, and by extension, decrease Blue’s utility.
If Red increases their bid to £2.55, that’ll decrease Blue’s utility to 29.
Blue then realises that if they lower their bid and get the second slot, their utility will be 30. 30 is higher than 29.
Notice how, after Red changes their bid, we are no longer in a Nash equilibrium! Blue can yield a higher utility with a different slot than their own.
Therefore, Blue lowers their bid and gets the second slot, meaning Red will get the first slot, bumping up Red’s utility to 50.

To help visualise it, here’s a gif I made:

Clever!
So, it seems Nash equilibria aren’t good enough for this problem.
We need a “stronger” version of Nash equilibria.

First, let’s introduce the concept of envy-freeness.

It means that nobody wants to exchange its allocation with anybody else’s.

Now, we can introduce the locally envy-free equilibrium!

It’s an equilibrium such that no bidder would like to exchange its bid with the bidder immediately above it.
In maths syntax, for any :

Nash equilibria is about one advertiser swapping slots, via changing their bid, to get a better utility.
Locally envy-free equilibria is about a bottom advertiser swapping bids with a top advertiser to get a better utility.

In the same way that we showed the £2 Red example was a Nash equilibrium, let’s show that it’s not a locally envy-free equilibrium:

Allocation	Agent	True valuation	Bid	Payment	Utility
1	Blue	£4	£4	£2
2	Red	£5	£2	£1
N/A	Green	£1	£1	£0	0

A copy of the table above

Bottom advertiser

Top advertiser

Change in utility

Red

Blue

Allo.	Agent	True	Bid	Pay	Utility
1	Red	£5	£4	£2
2	Blue	£4	£2	£1
N/A	Green	£1	£1	£0	0

Change in utility for Red

Change in utility for Blue

Green

Red

Allo.	Agent	True	Bid	Pay	Utility
1	Blue	£4	£4	£2
2	Green	£1	£2	£1
N/A	Red	£5	£1	£0	0

Change in utility for Green

Change in utility for Red

Red can get a positive change in utility by swapping their bid with Blue!
Therefore, this example is not a locally envy-free equilibrium.

Here’s an example that is a locally envy-free equilibrium:

Allocation	Agent	True valuation	Bid	Payment	Utility
1	Red	£5	£5	£2.50
2	Blue	£4	£2.50	£1
N/A	Green	£1	£1	£0	0

Bottom advertiser

Top advertiser

Change in utility

Blue

Red

Allo.	Agent	True	Bid	Pay	Utility
1	Blue	£4	£5	£2.50
2	Red	£5	£2.50	£1
N/A	Green	£1	£1	£0	0

Change in utility for Blue

Change in utility for Red

Green

Blue

Allo.	Agent	True	Bid	Pay	Utility
1	Red	£5	£5	£2.50
2	Green	£1	£2.50	£1
N/A	Blue	£4	£1	£0	0

Change in utility for Green

Change in utility for Blue

As you can see, there are no positive changes in utility for when anybody swaps bids.
Therefore, this is a locally envy-free equilibrium!

GSP vs VCG

GSP isn’t as good as VCG in theory.
It’s not even truthful!
But services use GSP because:

It’s simpler to explain
Re-engineering cost: if we already have GSP, it’d be expensive to replace it
Long term effect of switching from GSP to VCG is uncertain
VCG is time-consuming

However, the biggest reason is:

VCG makes less money!

Yep. Money. That’s it.

This is why we can’t just tell the truth

when we want to buy ad space.

There’s a theorem that states the revenue of GSP is greater than or equal to the revenue of VCG.
It holds true when GSP obtains a locally envy-free equilibrium.

Social Choice Theory (Voting)

We’ve all taken part in a vote before.
It’s an aggregate of everyone’s preferences into one collective decision.
A vote represents the will of the people.
But what makes a good voting system?
Does one exist?
That’s what we’re here to find out!

Here’s some properties we don’t want in a voting system:

Dictatorship: one person decides the outcome of the vote

e.g. whoever Enrico votes for is declared the winner.

Imposed Rule: the outcome is already decided before the vote even happens

e.g. no matter how anyone votes, Jie will be named the winner.

Here’s some properties we want in a voting system:

Anonymity: opposite to Dictatorship.

A voting system should treat all voters equally.
If any two voters swap ballots, the outcome should be the same.

Neutrality: opposite to Imposed Rule.

A voting system should treat all candidates the same.
If everyone switched their vote, the outcome would change accordingly.

Monotonicity:

It should be impossible for:

a winning candidate to lose by gaining votes and
a losing candidate to win by losing votes

Voting Algorithms

Let’s find some algorithms with these properties!
If you’ve done voting in Intelligent Systems, this part will be easy.

Majority Rule

Majority rule is for votes between two candidates.
It goes like this:

Each voter has a preference for one of the two candidates.
The candidate with the most votes wins.

We’ll assume there’s an odd number of voters, to avoid ties.

Super simple, right? We’ve all done votes like this before.
Its advantages are:

Anonymity
Neutrality
Monotone

Wow. It ticks all our boxes.
Is it possible to find another really good voting system like this?

Theorem

May’s Theorem (Kenneth May 1952)

Among all possible two-candidate voting systems that never result in a tie, majority rule is the only one that is:

Anonymous
Neutral
Monotone

Ah... I guess not.
That’s for two-candidate systems, though. What about more than two?

Condorcet’s Method

The Condorcet method goes as follows:

A candidate is a winner if they would defeat every other candidate in a one-on-one contest using majority rule.

It’s like a generalisation of majority rule for more than two candidates.
Here’s a brief example:

	Voter 1	Voter 2	Voter 3
1st	A	B	C
2nd	B	A	A
3rd	C	C	B

A vs B

A vs C

B vs C

	V1	V2	V3
1st	A	B	C
2nd	B	A	A
3rd	C	C	B

Winner: A

	V1	V2	V3
1st	A	B	C
2nd	B	A	A
3rd	C	C	B

Winner: A

	V1	V2	V3
1st	A	B	C
2nd	B	A	A
3rd	C	C	B

Winner: B

A is the winner of the Condorcet method!

The Condorcet method is also anonymous, neutral and monotone, but suffers from one problem.
It’s possible to yield no winners!

	Voter 1	Voter 2	Voter 3
1st	A	B	C
2nd	B	C	A
3rd	C	A	B

A vs B

A vs C

B vs C

	V1	V2	V3
1st	A	B	C
2nd	B	C	A
3rd	C	A	B

Winner: A

	V1	V2	V3
1st	A	B	C
2nd	B	C	A
3rd	C	A	B

Winner: C

	V1	V2	V3
1st	A	B	C
2nd	B	C	A
3rd	C	A	B

Winner: B

There is no ‘overall’ winner in these majority votes, so nobody wins.

Condorcet’s Method is also manipulable.
For example, if voter 2 doesn’t want A to win, they could switch C and A in their preferences and make nobody win.
I shouldn’t have to tell you why manipulation is bad:

Election results don’t reflect true will of society
Suboptimal choice in terms of maximising social welfare

Here’s a million different ways of saying the same thing:

The winner of the Condorcet method is called the Condorcet winner.
The phenomena of when a Condorcet winner doesn’t always exist is called the Condorcet paradox.
A voting system satisfies the Condorcet criterion, if it always chooses a Condorcet winner when one exists.
Voting systems satisfying this property are called Condorcet methods and are said to be Condorcet consistent.

Plurality Voting

Plurality Voting goes as follows:

The candidate with the most first place votes wins.

This is another generalisation of the majority rule.
Here’s another brief example:

	Voter 1	Voter 2	Voter 3
1st	A	B	C
2nd	B	A	A
3rd	C	C	B

There’s no winner in this example, because everyone has one first-place vote.
This also shows that Plurality voting is not Condorcet consistent, because the Condorcet winner here is A, but Plurality says there is no winner.

Plurality is manipulable, like in this example:

There are 19 voters, and 3 candidates, Red, Blue, and Green (they got sick of bidding over ad spaces and decided to start a government).

9 voters: R > B > G
8 voters: B > R > G
2 voters (Enrico and Bahar): G > B > R

Red has 9 first-place votes
B has 8 first-place votes
G has 2 first-place votes

Enrico and Bahar really don’t like R.
But they know that if they vote G, then R will just win over B, because G doesn’t have enough votes.
They’re “throwing their votes away”.
So, they may decide to vote for B and elect B instead, because they at least like B over R:

9 voters: R > B > G
10 voters (including Enrico and Bahar): B > R > G

Even one voter could manipulate!

Borda Count

Borda Count goes as follows:

Let’s say we have candidates.
Each first-place vote is worth points.
Each second-place vote is worth points, and so on.
The candidate with the most points is the winner.

Here’s a brief example:

	Voter 1	Voter 2	Voter 3	Voter 4	Voter 5
1st	A	A	A	B	B
2nd	B	B	B	C	C
3rd	C	C	C	A	A

A: 2+2+2+0+0=6

B: 1+1+1+2+2=7

C: 0+0+0+1+1=2

B is the winner!

It’s anonymous, neutral and monotone, which is nice.
But Borda Count does not satisfy “Independence of irrelevant alternatives (IIA)”:

The social preference between candidates A and B depend only on the individual preferences between A and B.

So, say if someone in a vote voted A > B > C, and A wins, it would break IIA if they changed their vote to A > C > B and C won, because in either case, they still preferred A to C.
Here’s a proper example, where voters 4 and 5 from the previous example swap B and C:

	Voter 1	Voter 2	Voter 3	Voter 4	Voter 5
1st	A	A	A	C	C
2nd	B	B	B	B	B
3rd	C	C	C	A	A

A: 2+2+2+0+0=6

B: 1+1+1+1+1=5

C: 0+0+0+2+2=4

A is the winner!

No IIA here! 4 and 5 still preferred B to A in both cases, so why does A win over B?

Sequential Pairwise Voting

Sequential Pairwise Voting goes like this:

Put the candidates in an ordered list
Replace the first and second candidates with the winner of the two by a majority rule.
Keep going until the list has only one candidate left. That candidate is the winner!

Here’s a brief example:

4 voters pick A > B > D > C
3 voters pick C > A > B > D
3 voters pick B > D > C > A
First round: A vs B vs C vs D
Pairwise A vs B = A wins
Second round: A vs C vs D
Pairwise A vs C: C wins
Third round: C vs D
Pairwise C vs D: D wins
Final list: D
D is the winner!

Sequential Pairwise Voting is not unanimous:

If everyone prefers one candidate A over another candidate C, then C should not be the winner.

If you look at the example above, D wins, even though every voter prefers B over D.
Why should D win over B?

In addition, the winner depends on the order of the comparison.
The outcome should only depend on the votes, and not any setting of the algorithm.

The Hare System

There’s lots of versions of the Hare System.
Here’s the one from the slides:

Repeatedly delete candidates that are “least preferred”; in other words, is ranked first the least in everyone’s preference lists.
Keep deleting until a single candidate remains. That candidate is the winner.

Here’s a brief example:

First round:

5 voters pick A > B > C
4 voters pick C > B > A
3 voters pick B > C > A
2 voters pick B > A > C

Number of first places for A: 5
Number of first places for B: 5
Number of first places for C: 4
C is eliminated
Second round:

5 voters pick A > B
4 voters pick B > A
3 voters pick B > A
2 voters pick B > A

Number of first places for A: 5
Number of first places for B: 9
B is eliminated
A is the only one left: A is the winner!

No 🤦‍♂️ not that kind of hare

Here’s another example, which shows what happens when we get draws:

First round:

5 voters pick A > B > C
4 voters pick C > B > A
3 voters pick B > C > A
1 voter picks B > A > C

Number of first places for A: 5
Number of first places for B: 4
Number of first places for C: 4
B and C are both eliminated
A is the only one left: A is the winner!

The Hare System is not monotone.
Why?

Let’s say that, in the previous example, the one voter who voted for B > A > C has now changed their vote to A > B > C.
The algorithm now goes like this:

First round:

6 voters pick A > B > C
4 voters pick C > B > A
3 voters pick B > C > A

Number of first places for A: 6
Number of first places for B: 3
Number of first places for C: 4
B is eliminated
Second round:

6 voters pick A > C
4 voters pick C > A
3 voters pick C > A

Number of first places for A: 6
Number of first places for C: 7
A is eliminated
C is the only one left: C is the winner!

Now C wins instead of A.
But that voter was moving ‘A’ higher.
Moving A higher and causing A to lose shows that the Hare System is not monotone.

Approval Voting

Approval Voting isn’t a social choice function, but a whole different way of voting.
In it, everyone selects candidates they find acceptable.
There’s no preferences, it’s all binary: people either vote for someone or they don’t.

Simple enough, right? They do that in real life, too.
The winner is the candidate who receives the most approval votes.

Approval Voting can be manipulated, in the same way Plurality voting can.

Arrow’s Impossibility Theorem

Here’s a summary of all the voting systems so far:

Voting System	Cons
Condorcet’s Method	Voting cycle
Plurality Voting	Fails to satisfy the Condorcet Criterion, manipulable
Borda Count	Fails to satisfy the Independence of Irrelevant Alternatives (IIA) property
Sequential Pairwise Voting	Fails to satisfy Pareto efficiency (unanimity)
Hare System	Fails to satisfy monotonicity
Approval Voting	Manipulable

They’re cool and all, but they all have cons.
Can’t we find one without any cons?
According to Arrow, no:

Arrow’s Impossibility Theorem

With three or more candidates and any number of voters, there does not

exist - and there will never exist - a voting system that satisfies:

If every voter prefers alternative A over alternative B, then the group prefers A over B. -- unanimity
If every voter’s preference between A and B remains unchanged, then the group’s preference between A and b will also remain unchanged (even if voters’ preferences between other pairs like A and C, B and C, or D and E change). -- independence of irrelevant alternatives
No single voter possesses the power to always determine the group’s preference -- non-dictatorship

Sorry for the inconsistency! I just felt like copying the slides for a second there.
Here’s a summary:

Theorem

Arrow’s Impossibility Theorem

When you have three or more candidates, you can’t have a voting system that is:

Unanimous
Independent of Irrelevant Alternatives
Non-dictatorial

There’s another theorem:

Theorem

Gibbard-Satterthwaite, 1971

Every non-dictatorial aggregation rule with >=3 candidates is manipulable.

Basically, there is no perfect mechanism.
So why do we have so many?
If we can’t make manipulation impossible, we can at least make it really hard.
So, the question is really this: can we design a universal method to turn any voting protocol into a hard-to-manipulate one?

This is all for worst-case analysis.
Making an average-case hardness is pretty hard to achieve, ironically.

Here’s another question: what is the maximum fraction of manipulators can we tolerate?
How can we achieve that? 🤔

... what? You wanted me to answer that question?
The slides sort of just end there, unfortunately...
So, I don’t know. I just write the notes.
If you really want to know the answer, try doing a PhD with Jie.

Now, this is the end of the examinable stuff of Algorithmic Game Theory.
There’s one extra section: Fair Division.
However, it’s not examinable.
If you want to check that out, go to the Appendix.

Appendix

Normal / Extensive Form Games

Normal Form Games

Extensive Form Games

Intro to Algorithmic Mechanism Design

Auctions as Mechanisms

Intro to Bayesian Games

Basics of Mechanisms

Bayesian Game Settings

Bayes-Nash Equilibrium

Dominant-Strategy Truthfulness

Vickrey-Clarke Groves (VCG) Mechanism

Direct Quasilinear Mechanisms

Groves Mechanisms

Vickrey-Clarke-Groves (VCG) Mechanisms

VCG Examples

Properties of VCG

Truthfulness

Individual Rationality

Budget Balance

Revelation Principle

Revelation Principle

Characterisation of DS Truthful Mechanisms

Dominant Strategy Implementation

Implementable Choice Rules

The Case for Unrestricted Quasilinear Settings

The Case for Single Dimensional Settings

Optimal Auctions

Example of an Optimal Auction

Myerson’s Theorem

Stable Matching (SM)

Gale-Shapley Algorithm

GS Returns Stable Matching Proof

Extensions of Stable Matching

Optimal Stable Matchings

DS Truthfulness

Hospital Residents Problem (HR)

Resident-oriented GS (RGS)

Hospital-oriented GS (HGS)

Theorems

Hospital Residents Problem with Couples (HRC)

Stable Roommates (SR)

Irving’s Algorithm

Phase 1

Phase 2

One-Sided Matching

With Initial Ownership

Top Trading Cycle (TTC)

You Request My House - I Get Your Turn (YRMH-IGYT)

Without Initial Ownership

Random Serial Dictatorship

Probabilistic Serial Mechanism

Secretary Problem

Calculating P(Si)

Calculating P(S)

Optimal k For Smaller Values of n

Sponsored Search Auctions

Greedy vs Offline Optimal

Generalised Second Price (GSP)

Google’s Weighted GSP

Click-Through-Rate

Generalised First Price (GFP)

Nash Equilibria in GSP

Locally Envy-Free Equilibrium

GSP vs VCG

Social Choice Theory (Voting)

Voting Algorithms

Majority Rule

Condorcet’s Method

Plurality Voting

Borda Count

Sequential Pairwise Voting

The Hare System

Approval Voting

Arrow’s Impossibility Theorem