Professor Ben Polak: All right, so last time we did something I
think substantially harder than anything we’ve done in the class
so far. We looked at mixed strategies,
and in particular, we looked at mixed-strategy
equilibria. There was a big idea last time.
The big idea was if a player is playing a mixed strategy in
equilibrium, then every pure strategy in the mix–that’s to
say every pure strategy on which they place some positive
weight–must also be a best response to what the other side
is doing. Then we used that trick.
We used it in this game here, to help us find Nash Equilibria
and the way it allowed us to find the Nash Equilibria is we
knew that if, in this case,
Venus Williams is mixing between left and right,
it must be the case that her payoff to left is equal to that of right
and we used that to find Serena’s mix.
Conversely, since we knew that Serena is mixing again between l
and r, we knew she must be indifferent between l and r and
we used that to find Venus’ mix. So I want to go back to this
example just for a few moments just to make one more point and
then we’ll move on, but we’ll still be talking
about mixed strategies throughout today.
So this was the mix that we found before we changed the
payoffs, we found that Venus’ equilibrium mix was .7,
.3 and Serena’s equilibrium mix was .6, .4.
And a reasonable question at this point would be,
how do we know that’s really an equilibrium?
We kind of found it but we didn’t kind of go back and
check. So what I want to do now is
actually do that, do that missing step.
We rushed it a bit last time because we wanted to get through
all the material. Let’s actually check that in
fact P* is a best response to Q*.
So what I want to do is I want to check that Venus’ mix P* is a
best response for Venus against Serena’s mix Q*.
The way I’m going to do that is I’m going to look at payoffs
that Venus gets now she knows – or rather now we know
she’s playing against Q*. So let’s look at Venus’
payoffs. I’m going to figure out her
payoffs for L, her payoffs for R,
and also her payoff for what she’s actually doing P*.
So Venus’ payoffs, if she chooses L against Q*
then she gets–very similar to what we had on the board last
week, but now I’m going to put in
what Q* is explicitly–she gets 50 times .6.
[This is Q* and this is 1-Q*.] So she gets 50 times .6 plus 80
times 1 minus .6, which is .4, so 80 times .4.
We can work this out, and I worked it out at home,
but if somebody has a calculator they can please check
me. I think this comes to .62.
Somebody should just check that. If Venus chose R–remember R
here means shooting to Serena’s right, to Serena’s forehand–if
she chose R then her payoffs are 90 Q*.
So 90(.6) plus 20(1-Q*) so 20(.4), so 90(.6) plus 20(.4),
and again I worked that out at home, and fortunately that also
comes out at .62. So what’s Venus’ payoff for P*?
We’ve got her payoff for both her pure strategies,
so her payoff from actually choosing P* is what?
Well, P* is .7, so .7 of the time she will
actually be playing L and when she plays L,
she’ll get a payoff of .62, and .3 of the time she’ll be
playing R, and once again, she’ll be getting a payoff of
.62 and–do I have a calculator? Sorry, thank you.
So P* is .7, yes, you’re absolutely right,
so this is P* and 1-P*. So let’s make that clearer.
I’ll show you what the equilibrium is but P* itself is
.7. So when Venus plays L with
probability of .7, then .7 of the time she’ll get
the expected payoff of .62 and .3 of the time she’ll get a
payoff again of .62 and that’s the kind of math I don’t have to
do at home, that’s going to come out at .62.
Again, assuming my math is correct.
So all I’ve really done here is confirm what we did already last
time. We knew–we in fact chose
Serena’s mix Q to make Venus indifferent between L and R.
And that’s exactly what we found here, going left it’s .62,
going right it gets .62 and hence P* gets .62.
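The arithmetic on the board can be checked with a short script. This is only a sketch, not something from the lecture: the name `expected_payoff` is made up for illustration, and the payoffs 50, 80, 90 and 20 are read as percentages, so the raw result 62 corresponds to the .62 quoted above.

```python
from fractions import Fraction

# Venus' payoffs (in percent) as read off the lecture's arithmetic:
# keys are Venus' pure strategies, values are (payoff vs l, payoff vs r).
VENUS_PAYOFF = {"L": (50, 80), "R": (90, 20)}

Q_STAR = Fraction(6, 10)  # Serena plays l with probability .6
P_STAR = Fraction(7, 10)  # Venus plays L with probability .7


def expected_payoff(row, q):
    """Expected payoff of a pure strategy against the mix (q, 1 - q)."""
    vs_l, vs_r = VENUS_PAYOFF[row]
    return vs_l * q + vs_r * (1 - q)


payoff_L = expected_payoff("L", Q_STAR)  # 50(.6) + 80(.4)
payoff_R = expected_payoff("R", Q_STAR)  # 90(.6) + 20(.4)

# P* is a weighted average of the two pure-strategy payoffs.
payoff_mix = P_STAR * payoff_L + (1 - P_STAR) * payoff_R

print(payoff_L, payoff_R, payoff_mix)  # prints: 62 62 62
```

Exact fractions are used instead of floats so that the three payoffs compare as exactly equal.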
But I claim we can now see something a little bit more.
We can now ask the question, is P* in fact the best
response? Well, for it not to be a best
response, for this not to be an equilibrium, there would have to
be some deviation that Venus could make that would make her
strictly better off. Let me repeat that.
If this were not an equilibrium, there would have to
be some deviation for Venus, that would make her strictly
better off. By playing P* she’s getting a
return of .62. So one thing she could deviate
to, is playing L all the time. If she deviates to playing L
all the time, her payoff is still .62 so
she’s not strictly better off. That’s not a strictly
profitable deviation. Another thing she could deviate
to, is she could deviate to playing R.
If she deviates to playing R, her payoff will be .62.
Once again, she’s not strictly better off: she’s the same as
she was before, so that’s not a strictly
profitable deviation. So what have I shown so far?
I’ve shown that P* is as good as playing L,
and P* is as good as playing R. In fact that’s how we
constructed it. So deviating to L is not a
strictly profitable deviation and deviating to R is not a
strictly profitable deviation. But at this point,
somebody might ask and say, okay, you’ve shown me that
there’s no way to deviate to a pure strategy in a strictly
profitable way, but how about deviating to
another mixed strategy? So, so far we’ve shown–we’ve
shown just up here–we can see that Venus has no strictly
profitable pure-strategy deviation.
She has no strictly profitable pure-strategy deviation because
each of her pure strategies yields the same payoff as did
her mixed strategy, yields the same as P*.
But how do we know that she doesn’t have a mixed strategy
that would be strictly better? How do we know that?
Anybody? No hands going up;
oh, there was a hand up, good. Student: Any mix between
left and right will still yield .62.
Professor Ben Polak: Good, so any mix that Venus
deviates to, will be a mix between L and R,
and any mix between L and R will be a mix between .62 and
.62 and hence will yield .62. So we’re going to use again,
this fact we developed last week.
The fact we developed last week was that any mixed strategy
yields a payoff that is a weighted average of the pure
strategy payoffs, the payoffs to the pure
strategies in the mix. Any mixed strategy yields a
payoff that is a weighted average of the payoff to the
pure strategies in the mix. That was our key fact last week.
So here if we’ve shown that there’s no pure-strategy
deviation that’s strictly profitable,
then there can’t be any mixed strategy deviation that’s
strictly profitable. Why?
Because the mixed strategy deviations must yield payoffs
that lie among the pure strategy deviations.
So this is a great fact for us. What’s the lesson here?
The lesson is we only ever have to check for strictly profitable
pure-strategy deviations. That’s a good job.
Why? Because if we had to check for
mixed strategy deviations one by one, we’d be here all night,
because there’s an infinite number of possible
mixed-strategy deviations. But there aren’t so many pure
strategy deviations we have to check.
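The weighted-average argument can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than a proof: it samples a grid of mixes, the helper names `mixed_payoff` and `best_pure` are hypothetical, and the second set of payoffs (70 and 40) is invented purely to show the unequal case.

```python
from fractions import Fraction

# Venus' expected payoffs (in percent) from L and R against Q*,
# as computed in the lecture: both equal 62.
PURE_PAYOFFS = {"L": Fraction(62), "R": Fraction(62)}


def mixed_payoff(p, pure=PURE_PAYOFFS):
    """Payoff from playing L with probability p: a weighted average."""
    return p * pure["L"] + (1 - p) * pure["R"]


def best_pure(pure=PURE_PAYOFFS):
    """Highest payoff available from any pure strategy."""
    return max(pure.values())


# A grid only samples the infinitely many mixes; the weighted-average
# argument is what covers every p between 0 and 1.
grid = [Fraction(k, 100) for k in range(101)]
assert all(mixed_payoff(p) <= best_pure() for p in grid)

# Same check with hypothetical unequal payoffs (NOT from the lecture).
other = {"L": Fraction(70), "R": Fraction(40)}
assert all(mixed_payoff(p, other) <= best_pure(other) for p in grid)
```

In both cases no mix beats the best pure strategy, which is why checking the pure-strategy deviations is enough.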
Let’s just repeat the idea. Suppose there isn’t any
pure-strategy deviation that’s profitable, then there can’t be
any mixed strategy deviation that’s profitable,
because the highest expected return you could ever get from a
mixed strategy, is one of the pure strategies
in the mix, and you’ve already checked that none of those are
profitable. So this simple idea,
the simple idea we developed last time, not only helps us to
find Nash Equilibria, but also to check for Nash
Equilibria. Now a lot of people I gathered
from feedback from sections were left pretty confused last time.
It’s a hard idea. Actually I looked at the tape
over the weekend, I could see where it could be
confusing. But actually,
I think what’s really confusing here–it
wasn’t so much that I could have been clearer, though I’m
sure I could have been. It’s that this is really a hard
idea, this idea of mixed strategies.
So we’re going to work on it again today, but I think one of
the ideas that gets people confused, is the following idea.
They say, look we found Venus’ equilibrium mix by choosing a P
and a 1-P to make Serena indifferent.
We found Serena’s equilibrium mix by finding a Q and a 1-Q to
make Venus indifferent and a natural question you hear people
ask then is, why is Venus “trying to make
Serena indifferent?” Why is Serena “trying to make
Venus indifferent?” That’s not really the point
here. It isn’t that Venus is trying
to make Serena indifferent. It’s that in equilibrium,
she is going to make Serena indifferent.
It isn’t her goal in life to make Serena indifferent between
l and r, and it isn’t Serena’s goal in life to make Venus
indifferent between L and R, but in equilibrium it ends up
that they make each other indifferent.
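A small sketch of why indifference falls out of equilibrium rather than being anyone's goal. It rests on a loudly stated assumption: this part of the lecture only gives Venus' numbers (50, 80, 90, 20), so Serena's payoffs are taken here to be 100 minus Venus' in each cell. The function name `serena_best_response` is hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: Serena's payoff is 100 minus Venus' payoff in each cell.
# Keys are (Venus' shot, Serena's lean).
SERENA_PAYOFF = {("L", "l"): 50, ("L", "r"): 20,
                 ("R", "l"): 10, ("R", "r"): 80}


def serena_best_response(p):
    """Serena's best reply when Venus plays L with probability p."""
    lean_l = SERENA_PAYOFF[("L", "l")] * p + SERENA_PAYOFF[("R", "l")] * (1 - p)
    lean_r = SERENA_PAYOFF[("L", "r")] * p + SERENA_PAYOFF[("R", "r")] * (1 - p)
    if lean_l > lean_r:
        return "l"
    if lean_r > lean_l:
        return "r"
    return "indifferent"


for p in (Fraction(6, 10), Fraction(7, 10), Fraction(8, 10)):
    print(p, serena_best_response(p))
```

Under that assumption, any weight on L above .7 sends Serena to l all the time, any weight below .7 sends her to r, and only at exactly .7 is she willing to mix.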
The way that we can see that is that if Venus puts–we said this last
time, but it’s worth repeating–if Venus puts too much weight,
more than .7 on L, then Serena just cheats to the
left all the time, and that can’t possibly be an
equilibrium. And if Venus puts too much
weight on R, then Serena cheats to the right all the time and
that can’t be an equilibrium. So it has to be that what Venus
is doing is going to make Serena exactly indifferent and vice
versa. Now let’s see that idea in some
other applications. Let’s talk about this a bit
before we move on. So it turns out that some very
natural applications for mixed-strategy equilibria arise
in games, in sport. So let’s talk about a few now.
Can anybody suggest some other places where we see
randomization or at least mixed-strategy equilibria in
sporting events? Let me actually grab the mike
myself. Anybody here play football for
example, and we’re talking American football now,
the gridiron game, not the civilized type.
Anyone play? Yes, so some of you play
football. So where is the mixing involved
in playing football? Where in equilibrium would we
expect to see mixed strategies? There’s somebody down there can
we go on and get them. So shout it out.
Student: Running game and passing game.
Professor Ben Polak: All right, the running game and the
passing game. So a very simple idea whether
to run or whether to pass when you have the ball is likely to
end up as a mixed-strategy equilibrium.
The defense is also randomizing between, for example,
rushing the passer or playing a run defense.
Is that right–this is not exactly a game I know a lot
about, but I’m hoping I’m getting close enough here.
It couldn’t possibly be a pure-strategy equilibrium,
other than very extreme parts of the game,
like at the end of the game perhaps, but for most of the
game, it’s very unlikely to end up as a pure-strategy
equilibrium. Much more likely that the
offense is mixing between passing and running,
and for that matter between going to the left,
going to the right and going to the center, and the defense is
also mixing between–over its types of defense.
So we see that–for those people who were watching
yesterday–we see that in football games.
Where else do we see it in sport, in some other sports?
I can’t have a room full of non-sports fans.
How many of you ever watch any sports?
Let’s raise some hands here–some of you do.
So this is baseball playoff season.
How many of you have been watching the baseball playoffs?
Raise your hands if you’ve been watching the baseball playoffs.
I’ll let you off, I know you should have been
doing my homework. How many of you have been
watching the playoffs instead? How many watched the Yankees
game last night? Quite a few of you.
So they haven’t been very exciting yet but we’re hoping
that it’s going to get more exciting.
So when you’re watching baseball what kind of things do
you see where you just know that there must be mixed strategies
involved? There must be randomization
involved. Now I’ve got a few more hands
out. Good, so you sir.
Student: Choosing how to pitch the ball.
Professor Ben Polak: Choosing how to pitch the ball.
Enlarge a little bit more, say a bit more.
Student: Fast ball versus slider,
versus change up, all sorts of different things.
Professor Ben Polak: All right, so there’s different ways
of throwing the ball, and there’s going to be
randomization from the pitcher, or at least it’s going to look
like there’s randomization by the pitcher over whether to
throw a fast ball or a curve ball or whatever.
How is the hitter randomizing there?
How is the hitter randomizing? Is the hitter randomizing at
all? What’s the hitter doing while
this is going on? Anybody?
Yeah. Student: He’s choosing
whether to swing or not to swing.
Professor Ben Polak: Okay, he’s choosing whether to
swing or not to swing, although presumably he can do
that just after the ball’s thrown.
So you sometimes hear the commentator say that that hitter
was looking for a fast ball, is that right?
Or looking for a curve ball. The hitter is trying to
anticipate the pitch, is that right?
This is not a game I played a lot of either–I played a little
bit. You’re trying to anticipate
where the ball is going to be thrown.
So the type of ball you throw in baseball and the way in which
the pitch is being anticipated by the hitter are likely to be
mixed strategies. What else is likely to be a
mixed strategy in baseball? What else?
Anybody here on the Yale baseball team?
Okay, I’ve got one volunteer here.
So what else, stand up for a second.
Let’s have a Yale baseball team member, what’s your name?
Student: Chris. Professor Ben Polak:
Where do you play? Student: I’m a pitcher.
Professor Ben Polak: You’re a pitcher,
okay. So he’s not going to get on
base now, so he’s not going to answer this.
Suppose you did get on base, pitchers don’t often get on
base. Let’s assume that happens,
what might you randomize? There you are,
you’re standing on base, what might you randomize about?
Student: Whether to steal second or not.
Professor Ben Polak: Right, whether to steal or not,
whether to try and steal or not.
Stay up a second. So the decision whether to try
and steal or not is likely to end up being random.
If you’re the pitcher, what can you do in response to
that? Student: You can either
choose to try to pick them off or not.
Professor Ben Polak: What else?
So one thing is you can try and pick him off.
What else? Student: You can be
quicker to the plate. Professor Ben Polak:
Quicker to the plate, what else?
Student: You can pitch out.
Professor Ben Polak: You can pitch out,
what else? At least those three things,
right? Student: Yeah.
Professor Ben Polak: At least those three things okay,
thank you. I have an expert here,
I’m glad I had an expert. So in this case we can see
there’s randomization going on from the runner whether he
attempts to steal the base or not,
and by the pitcher on whether he pitches out or
whether he tries to get to the plate faster.
So we see this in sport. We don’t see it well
anticipated by sports commentators.
Let me put this down a second. So in baseball,
for example, you’ll sometimes see quite
sophisticated statistical analyses of baseball in which
somebody will have looked at base stealers across the major
leagues and they’ll look at all the instances in which a player
was on first base and in the position where you think they
might steal, and they’ll look at what
happened on every attempt to steal, whether they were in fact
caught stealing or not, and they’ll try and measure the
value of these things and they’ll see, the conclusion
they’ll come to is something like this.
They’ll conclude that whether the guy stole or not,
whether the guy attempted to steal or not,
sorry, or whether he just sat on first base doesn’t seem to
make much difference, they’ll say.
They’ll say that, even for great base stealers, the payoff from
attempting to steal, when you take into account the
pick offs, versus just staying put turns out, in
terms of the impact on the game, to be roughly equal,
and then they’ll draw– these analysts will then draw the
following conclusion. They’ll say,
oh look, speed or the ability to steal bases is therefore
overrated in baseball. How have they made a mistake?
What’s the mistake they made there?
So the premise was, let’s give them the premise,
the premise was that when a base stealer is attempting to
steal or not the expected return in terms of outcome of the game
is roughly equal, whether they attempt to steal
or don’t attempt to steal. The conclusion is,
therefore stealing doesn’t seem such a big deal.
What’s the mistake they’ve made? Yeah, let me borrow it again,
sorry. Student: The pitcher has
to react differently in pitching when he knows that there’s a
fast guy on base. Professor Ben Polak:
Good, so our pitcher has to react differently.
Let’s talk to our pitcher again, so one thing our pitcher
said was he wants to get to the plate faster.
What does that mean getting to the plate faster?
–Shout out so people can hear you.
Student: It means just getting the ball to the catcher
as fast as possible so he has the best chance to throw out the
runner. Professor Ben Polak:
Right, so you’re going to pitch from, you’re not going to do
that funny windup thing, you’re not, thank you,
you’re going to pitch from the stretch, I knew there was a term
there somewhere. I’m learning American by being
here. And you’re more likely to throw
a fast ball, there’s some advantage in throwing a fast
ball rather than a curve ball. Both of those actions,
both having to move more towards fast balls and pitching
from the stretch, are actually costly for the pitcher.
But we’ll get there in a second, let’s just back up a
second, so that was good, that’s right.
But let’s just back up a second. The premise of these
commentators was what? It was that the return to
stealing, attempting to steal, seems to be roughly a wash.
It seems to be that the expected return when this great
base runner attempts to steal a base is roughly the same as the
return when they don’t attempt to steal the base.
But I claim we knew that was going to be the case.
We didn’t have to go and look at the data.
Why did we know that was going to be the case?
How did we know that we were bound to find a return in that
analysis that finds those things roughly equal?
Yeah. Student: If he is
randomizing that means that the returns will be equal.
If they weren’t equal he would just do one or the other all the
time. Professor Ben Polak:
Good, excellent. Since we’re in a mixed strategy
equilibrium, since he’s randomizing, it must be the case
that the returns are equal. That’s the big idea here,
that’s the thing we learned last time.
If the player, and these are professional
baseball players doing this, they’ve been very well trained,
a lot of money has been spent on getting the tactics right.
There’s people sitting there who are paid to get the tactics
right. If it was the case that the
return to base stealing wasn’t roughly equal when you attempt
to steal or didn’t attempt to steal,
then you shouldn’t be randomizing.
Since you are randomizing it must be the case that the
returns are roughly equal. So that’s the first thing to
observe and the second thing to observe is what we just pointed
out. In fact, the value of having a
fast base stealer on the team doesn’t show up in the expected
return on the occasions on which he attempts to steal,
or which he does not attempt to steal.
It shows up where? It shows up in the fact that
the pitching team changes their behavior to make it harder for
this guy to steal by going faster to the plate,
or throwing more fast balls. Where will that show up in the
statistics? If you’re just a statistician
like me, you just look at the data, where will that show up?
I mean suppose I can’t keep track of every single pitch,
I can’t actually observe all these fast balls,
where will I see the effect of all these extra fast balls and
pitching from the stretch, in the data?
Somebody? It’s going to show up in the
batting average of the guy who’s hitting behind the base stealer.
The guy hitting behind the base stealer is going to have a
higher batting average because he’s going to get more pitches
which are fast balls to hit, and more pitches out of the
stretch. So if you ignore that effect,
you’re going to be in trouble. But we know,
if we analyze this properly using Game Theory,
we know we’re in a mixed strategy equilibrium.
We know, in fact, the pitching team must be
reacting to it. We know there must be a cost in
doing that, and the cost turns up in the hitter behind.
So when you’re watching the playoffs in the last– now I’m
giving you permission to watch a bit of TV at night,
after you’ve done my homework assignment, but before anyone
else’s homework assignment–you can have a look at these
baseball games and have a go at being a little bit better than
the commentators who are working on them.
So one application for mixed strategies is in sports,
but not the only application. Let’s just talk about another
application, a slightly more scary application.
So after 9/11 there was a lot of talk in the U.S.
about the placement of baggage checking machines at airports.
Actually there’s still quite a lot of talk, but there was a lot
of talk then about the placement of machines to search the
luggage that goes onboard. The hand luggage was being
searched anyway, but this was to search luggage going
into the hold. It was pointed out at the time,
this has changed since, there weren’t actually enough
machines in the U.S., on the day after 9/11,
to search every single bag that went into the hold.
You’d hear discussions of the following type.
You’d hear these experts on Nightline or whatever and they’d
say: look there’s no point trying to do this,
because if we put all our baggage searching machines at
Logan Airport in Boston, for example,
then the terrorists will simply move their attack to O’Hare and
if we put them at O’Hare, then they’ll move their attack
to Logan. If we have enough to do both
Logan and O’Hare then they’ll move their attack to some third
airport. So there was a sense of doom in
the air. It was kind of a depressing
time anyway. There was a sense of doom in
the air saying that if you put your baggage searching machines
somewhere, all you do is cause the
attempted terrorists, terrorists attempting to blow
up the planes, to go elsewhere.
And you hear the same things today about searching
individuals as they go on the plane.
For example, you’ll hear a discussion that
says, if we only search men traveling alone,
let’s say, then you’ll quickly end up with all the people
carrying bombs being couples or women.
Again, there’s this sense of doom, this sense that says it’s
hopeless. Whatever we do we’re just going
to force the terrorists to do something else but we won’t have
gained anything. So once again that’s wrong.
What’s wrong about that one? What should we be doing in that
setting? Let me come down again.
What should they have done–in fact they did do–with those
luggage/baggage searching machines when they were in short
supply after 9/11? What do they do with searching
people as they got on planes? What do they do?
Well here’s what they didn’t do, they didn’t just put them at
certain airports and announce they’re just at these airports.
That would have been a crazy thing to do.
That would have been hopeless–not entirely
hopeless–but not wise. What should they have done?
What did they do? Anybody want to guess?
Yeah. Student: In name they
randomized who they were checking.
Professor Ben Polak: Right, so when they’re checking
passengers, they’re going to randomly check passengers.
When they’re checking, when they think about the
baggage machines, a sensible thing to do is to
put a big metal box at every single airport and say:
we’re not going to tell you which of these boxes actually
have baggage checking machines, which effectively is
randomizing. From the point of view of the
terrorists, they’re not going to know where the baggage checks
are going on. That’s worth doing.
It doesn’t–It isn’t going to perfectly eliminate,
well unfortunately it isn’t probably going to perfectly
eliminate all terrorist attacks, but it does make it harder for
the terrorists. So randomization there–whether
it’s literally randomizing over who is checked,
or whether it’s “as it were” randomizing, by concealing where
in fact you have placed those machines–can be very effective.
The hard thing, both in sports and in these
military examples, is really mimicking
randomization. It’s very hard for us as humans
to do it, and there’s a famous story about a military
commander, actually an English military
commander during an insurgent war in, I think it was Malaysia
after World War II, where again he had to worry
about randomizing which convoys to protect.
He figured out
that randomizing was the right thing to do to try and protect
these convoys as well as he could with small numbers of
troops. And the way in which he
randomized was he literally randomized.
Every morning he put a bit of paper in his hand and he had
somebody, had one of his sergeants, pick which hand the
paper was in. So we do actually see these
random strategies used. The reason we have to literally
randomize is that it’s very difficult to mimic randomness unless you’re
a professional sports player. Okay, but it turns out that
mixed strategy equilibria, and mixed strategies in
general, are relevant beyond just these
contexts in which you think of people literally randomizing.
I want to look at a different context now.
So I want to go back to a game we started a few weeks ago.
This isn’t the same game. It’s a sequel.
It’s a follow up in our exciting adventure of our dating
couple in the classroom. Who were our dating couple,
do we still have them here? That was the guy,
who is the–yeah, there they are.
They’re even sitting closer. What a success here.
Can we get the camera on them a second?
Stand up a second, thank you. Your name was?
Student: David. Professor Ben Polak:
David. And look at this.
Is this romantic or what? David and your name is?
Student: Nina. Professor Ben Polak:
Nina and David, okay.
I think we pretty much figured out last time that Nina’s Player
I and David’s Player II, is that right?
As we remember last time, I’ll pick on you in a second,
you can sit down a second. So we figured out last time
that they were going to try and go on a date and they had
arranged to go to the movies. They picked out two,
in fact three movies, but two that remained viable,
and the problem was being typical Economics majors who
are, are you both Economics majors,
I think we figured that out? They are, look at that,
so being typical Economics majors who are just hopeless at
dating they had forgotten to tell each other which movie
they’re going to. So that, I don’t know if that
worked out well or not, but now that life has moved on,
they’re going to try it again, but this time taking advantage
of fall in New England, rather than go to a movie,
they’ve decided on some new activities.
So they might either go apple picking or they might go to the
Yale Rep and see a play. And so apple picking has its
advantages: the fall weather, its local flavor,
it has certain undertones about the Garden of Eden or something.
I don’t know if you can use the term flavor, local or otherwise,
for American apples but never mind.
And the Yale Rep, Yale Rep is a good thing to do
in New Haven, go to a play,
I think it’s Richard II is showing now, is that it?
Probably not a great “date play” but Economists are trying
to show they have culture, so there it goes.
And let’s assume the payoffs are like this.
Much as they were before, by which we mean that Nina wants
to meet David but she would, given the choice she would
rather meet David in the apple fields.
And David who’s a dark personality, likes the sort of
darker side of Shakespeare. And he also wants to meet Nina
but he would rather meet at the Yale Rep.
If that’s backwards I apologize to their preferences.
But once again, because they’re still
incompetent Economics majors, they’ve again forgotten to tell
each other where they’re going. So let’s analyze this game
again, we’ve figured out this was a coordination game last
time or several weeks ago. And we know in this game,
we know what the pure strategy Nash Equilibria are,
so no prizes to be able to spot them.
One of the Nash Equilibria in pure strategies,
let’s put this in pure strategies,
so one of the pure strategy Nash Equilibria is for them both
to go apple picking and meet up in Bishop’s Orchard or whatever,
and another pure strategy equilibrium is for them both to
choose the Rep. We’d figured out that if they
were able to communicate, there’s really a pretty good
chance of them managing to coordinate at one of these
equilibria but we suspect, I think, that this is not all
that’s going on here. It looks quite likely that come
your next Saturday afternoon, when we send these guys out on
their date, they’re going to fail to meet,
it’s at least plausible. To test the plausibility of
that, let’s ask them, have you been,
have you managed to meet on a date yet?
No, haven’t managed to meet on a date.
See, so I’m proving the point that in fact they haven’t
managed to at least coordinate an equilibrium yet.
So it seems at least plausible that they’re going to fail to
coordinate. It’s plausible they’re going to
fail to coordinate. We’d like to sort of capture
that idea, and the way we’re going to capture that idea
is–let’s see if there’s another equilibrium in this game.
Well, there certainly isn’t another pure strategy
equilibrium in this game is there?
We know that. So if there’s another
equilibrium it better be mixed. So let’s try and find a mixed
Nash Equilibrium in this game, and remember this game is
called Battle of the Sexes, it’s a famous game.
This is Battle of the Sexes revisited.
So how are we going to go about finding this mixed Nash
Equilibrium in the game? We’ll interpret it later but
let’s just work on finding it. So, in particular,
I’m going to postulate the idea that Nina is going to mix P,
1 – P and David is going to mix Q, 1 – Q.
So how do we go about finding David’s equilibrium mix Q,
1 – Q? What’s our trick from last week?
Should be able to cold call at this point, but let’s not have
to. How am I going to find Q,
the equilibrium Q? Somebody?
Thank you, they can use Venus’ payoffs, good.
So to find–it isn’t Venus’ payoffs–it’s Nina’s payoffs.
Fair enough, sorry. So to find the Nash Equilibrium
Q, to find the mix that David’s using we use Nina’s payoffs.
So let’s do that. So, in particular,
for Nina, if she goes apple picking then her payoff is 2
with probability Q if she meets David and 0 otherwise.
If she goes to the Rep then her payoff is 1 if she meets David,
sorry, need to be careful, let’s do it again.
If she goes to the Rep her payoff is 0 if David goes apple
picking with probability Q, and her payoff is 1 if she
meets David at the Rep, which happens with probability
1 – Q, is that correct?
So this is her payoff from apple picking and this is her
payoff from seeing Richard II. And what do we know if Nina is
indeed mixing, what do we know about these two
payoffs? They must be equal.
If Nina is in fact mixing, then these two things must be
equal. And that means:
what we’re saying is 2Q equals 1(1-Q) or Q equals 2/3,
I guess it is. No it’s 1/3 sorry.
Is that right? Q is 1/3.
Okay, so our guess is that if there’s a mixed strategy
equilibrium it must be the case that David is assigning a
probability 1/3 to going apple picking,
which means he’s assigning probability 2/3 to his more
favored activity which is going to see Richard II.
What about, I’m going to pull these both down,
okay, how do we find Nina’s mix?
So to find the Nash Equilibrium P, to find Nina’s mix what do we
do? What’s the trick?
Somebody? Use David’s payoffs.
So David’s payoffs, if he goes apple picking then
he gets a payoff of 1 if he meets Nina there and 0 otherwise
and if he goes to the Rep he gets a payoff of 0 if Nina’s
gone apple picking, and he gets a payoff of 2 if he
meets Nina at the Rep. Once again, if David is
indifferent it must be that these are equal.
So if these are– if David is in fact mixing between apple
picking and going to the Rep–it must be that these two are equal
and if we set this out carefully we’ll get,
let’s just see, we’ll get 1(P) equals 2(1-P),
which is P equals 2/3 and 1-P equals 1/3.
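The indifference trick used for both players can be sketched in a few lines of Python. This is only an illustration: the helper name indifference_mix is made up here, and the payoff numbers are the ones read off the board above.

```python
from fractions import Fraction

def indifference_mix(a, b, c, d):
    """Probability x on the opponent's first action that makes a player indifferent.

    The player's first pure strategy pays a*x + b*(1 - x) and the second
    pays c*x + d*(1 - x). Setting the two equal gives
    x = (d - b) / (a - b - c + d).
    (Hypothetical helper, not something defined in the lecture.)
    """
    return Fraction(d - b, a - b - c + d)

# Nina: apple picking pays 2 if David picks apples, 0 otherwise;
# the Rep pays 0 if David picks apples, 1 otherwise.
q = indifference_mix(2, 0, 0, 1)  # David's probability of apple picking

# David: apple picking pays 1 if Nina picks apples, 0 otherwise;
# the Rep pays 0 if Nina picks apples, 2 otherwise.
p = indifference_mix(1, 0, 0, 2)  # Nina's probability of apple picking

print(q, p)  # prints 1/3 2/3
```

Note that David's mix comes out of Nina's payoffs and Nina's mix comes out of David's payoffs, which is the whole point of the trick.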
So here we have Nina assigning 2/3 to going apple picking,
which in fact is her more favored thing and 1/3 to going
to the Rep. Okay, so we just used the same
trick as last time, let’s check that this is in
fact an equilibrium. So, in particular,
let’s check that it is in fact an equilibrium for Nina to
choose 2/3,1/3. Let’s check.
So check that P equals 2/3 is in fact the best response for
Nina. Let’s go back to Nina’s payoffs.
For Nina, if she chose to go apple picking,
her payoff now is 2 times Q but Q is equal to 1/3 plus 0(1-Q)
and if she chooses to go to the Rep then her payoff is 0 with
probability 1/3 and 1 with probability now 2/3.
All I’ve done is I’ve taken the lines I had before and
substituted in now what we know must be the correct Q and 1-Q
and this gives her a payoff of 2/3 in either case.
If she chooses P, her payoff to P will be 2/3 of
the time she’ll get the payoff from apple picking which is 2/3
and 1/3 of the time she’ll get the payoff from going to the Rep
which is 2/3 for a total of 2/3. So Nina’s payoff from either of
her pure strategies is 2/3. Her payoff from our claimed
equilibrium mixed strategy is 2/3, so neither of her possible
pure strategy deviations were profitable.
They didn’t lose her anything either, but they weren’t
profitable, and by the lesson we started the class with,
that means there cannot be any strictly profitable mixed
deviation either, so indeed,
for Nina, P is a best response to Q.
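The check just done on the board can be reproduced with exact fractions. A minimal sketch, assuming the equilibrium values P = 2/3 and Q = 1/3 found above; the variable names are only illustrative.

```python
from fractions import Fraction

q = Fraction(1, 3)  # David's probability of apple picking
p = Fraction(2, 3)  # Nina's probability of apple picking

# Nina's expected payoff from each pure strategy against David's mix.
payoff_apple = 2 * q + 0 * (1 - q)
payoff_rep = 0 * q + 1 * (1 - q)

# Her payoff from the claimed equilibrium mix is a weighted average of the two.
payoff_mix = p * payoff_apple + (1 - p) * payoff_rep

print(payoff_apple, payoff_rep, payoff_mix)  # prints 2/3 2/3 2/3
```

Since neither pure strategy beats 2/3, no mixture of them can beat 2/3 either.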
We can do the same for David but let’s not bother,
it’s symmetric. So in this game we found
another equilibrium. The other equilibrium,
the new equilibrium is Nina mixed 2/3,1/3 and David mixed
1/3,2/3 and we also know the payoff from this equilibrium.
The payoff from this equilibrium, for both players,
was 2/3. There are three equilibria in
this game. They managed to meet at apple
picking in which case the payoffs are 2 and 1.
They managed to meet at the Rep, that’s the second pure
strategy equilibrium, in which case the payoffs are 1
and 2, or they mixed, both of them mixed in this way,
and their payoffs are 2/3,2/3. Why is the payoff so bad in
this mixed strategy equilibrium? Does everyone agree,
this is a pretty lousy payoff? The other equilibrium payoffs
the worst you got was 1 and you sometimes got 2,
but now here you are playing a different equilibrium and at
this different equilibrium you’re only getting 2/3.
Why are you only getting–what happened?
Why have these payoffs got pushed down so far?
What’s happening to our poor hapless couple?
Or not hapless I don’t know. What’s happening to our couple?
Student: Sometimes they don’t meet.
Professor Ben Polak: Yeah, they’re failing to meet.
The reason, what’s forcing these payoffs down is they’re
not meeting very often. How often are they actually
meeting? How often are they meeting?
Let’s have a look. Let’s go back to the previous
board. Here it is. So they meet when
they end up in this box or this box, is that right?
So what’s the probability of them ending in those boxes?
Well ending up in this box is probability 2/3 times 1/3 and ending
up in this box is probability 1/3 times 2/3, is that right?
You end up meeting apple picking, the 2/3 of the time
when Nina goes there times the 1/3 of the time when David goes
there. And you end up meeting at the
Rep the 1/3 of the time Nina goes there times the 2/3 of the
time that David goes there. So this is the total
probability of meeting and it’s equal to 4/9,
is that right? So 4/9 of the time they’re
meeting, but 5/9 of the time–more than half the
time–they’re screwing up and failing to meet.
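The arithmetic for the meeting probability, as a short sketch. It assumes the two players randomize independently with the equilibrium mixes above.

```python
from fractions import Fraction

p = Fraction(2, 3)  # Nina goes apple picking
q = Fraction(1, 3)  # David goes apple picking

# They meet if both pick apples or both go to the Rep.
meet_apple = p * q
meet_rep = (1 - p) * (1 - q)
prob_meet = meet_apple + meet_rep
prob_miss = 1 - prob_meet

print(prob_meet, prob_miss)  # prints 4/9 5/9
```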
This is why I call them a hapless dating couple.
So this is a very bad equilibrium, but it captures
something which is true about the game.
What is surely true about this game is that if they just played
this game, they wouldn’t meet all the time.
In fact what we’re arguing here is they’d meet less than half of
the time. But certainly this idea that
we’re given from the pure strategy equilibria,
that they would magically always manage to meet seems very
unlikely, so this does seem to add a
little bit of realism to this analysis of the game.
However, it leads to a bit of an interpretation problem.
You might ask the question why on Earth are they randomizing in
this way. Why are they doing this?
It’s bad for everybody. Why are they doing this?
This leads us to think about a second interpretation for what
we think mixed strategy equilibria are.
Rather than thinking of them literally as randomizing,
it’s probably better in this case to think about the
following idea. We need to think about David’s
mixture as being a statement about what Nina believes David’s
going to do. David may not be literally
randomizing. But his mixture Q,
1–Q, we could think of as Nina’s belief about what David’s
going to do. Conversely, Nina may not
literally be randomizing. But her P, 1 – P,
we could think of as David’s belief about what Nina’s going
to do. And what we’ve done is we’ve
found the beliefs such that these players are exactly
indifferent over what they do. We found the beliefs for David
over what Nina’s going to do, such that David doesn’t really
quite know what to do. And we found the beliefs that
Nina holds about what David’s going to do such that Nina
doesn’t quite know what to do. That make sense?
So it’s probably better here to think about this not as people
literally randomizing but these mixed strategies being a
statement about what people believe in equilibrium.
We’ll come back and look at this game some more later on,
so our couple I’m afraid are not quite out of the woods yet.
But I want to spend the rest of today looking at yet another
interpretation of mixed strategy equilibria.
So, so far we have two, we have people are literally
randomizing. We have thinking of these as
expressions about what people believe in equilibrium rather
than what they’re literally doing.
And now I’m going to give you a third interpretation.
So for now we can get rid of the Venus and Serena game.
So to motivate this third idea I want to think about tax
audits. So none of you here have ever,
probably ever, had to fill out a tax form,
except for the fact that there seems to be a lot of parents in
the room today, is it parents weekend,
is that what’s going on? So where are the parents in the
room? Wave your arms in the air if
you’re a parent here. So at least these guys at
probably some point in their life filled out a tax form.
So come tax day, the parents in the room face a
choice, and the choice is are they going to honestly fill out
their taxes, or are they going to cheat?
I’m not going to ask them what they, well maybe I will,
but for now I won’t ask them what they did.
So they can choose one of two things.
They can choose to pay their taxes honestly–we’ll call that
H–or to cheat. This is the tax payer,
the parent. And at the same time the audit
office, the auditor, has to make a choice,
and the auditor’s choice is whether to audit you or not and
it’s not literally true because literally the auditor can wait
until your tax return comes in and then decide whether to audit
you. But for now let’s think of
these choices being made simultaneously,
and we’ll see why that makes it more interesting.
So let me put down some payoffs here and then I’ll explain them.
So (2,0), (4,-10), (4,0) and (0,4). So how do we interpret this?
Let’s look at the auditor’s payoffs first of all.
So the auditor is very happy not having to audit your parents
and having your parents pay taxes, so we’ll give that a
payoff of 4. It’ll turn out,
in this game, we’ve decided in the payoffs,
that the auditor is equally happy if she actually audits
your parents in the year that they cheated.
We’ll say that makes the auditor equally happy.
Now the auditor is not so happy if she audits your parents when
they’re honest because audits are costly.
The auditor is really unhappy if she fails to audit when the
parents cheated. Let’s look at the–I keep
wanting to call them parents–I should stop calling them
parents, let’s call them taxpayers.
So for the taxpayers, what are their payoffs?
Well we’ll normalize things, so if they’re honest we’ll give
them a payoff of 0. That means they correctly fill
in their tax form and pay what they’re supposed to pay,
but if they can conceal some of their income,
they pretend to have whatever it is,
a third child, then they might be in trouble
if they’re audited. If they’re audited they’re
going to have to pay a big fine, maybe even go to jail,
so that’s -10. Of course if they’re not
audited they get to keep a chunk of money so we’ll call that 4.
Everyone understand the basic idea of this game?
In reality, we could add more complications,
we could think of different ways to cheat on your taxes,
but I don’t want to give tutorials on how to cheat on
your taxes here. So it’s not going to take long
staring at this game to figure out that there are no pure
strategy equilibria in this game.
Let’s just do that, so from the taxpayer’s point of
view, if they’re going to be audited,
then they’d rather pay their taxes than not,
and if they’re not going to be audited then according to these
payoffs they’d rather cheat. From the auditor’s point of
view, if they knew everyone was going to pay taxes,
then they wouldn’t bother auditing and if they knew
everyone was going to cheat, then they’d of course audit.
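The best-response argument above can also be checked by brute force over the four boxes. A rough sketch: the dictionary layout and the function name is_pure_nash are choices made for this illustration, with payoffs written as (auditor, taxpayer) as on the board.

```python
# Payoffs as (auditor, taxpayer), copied from the matrix in the lecture.
payoffs = {
    ("audit", "honest"): (2, 0),
    ("audit", "cheat"): (4, -10),
    ("no_audit", "honest"): (4, 0),
    ("no_audit", "cheat"): (0, 4),
}
auditor_actions = ["audit", "no_audit"]
taxpayer_actions = ["honest", "cheat"]

def is_pure_nash(a, t):
    """True if neither player gains by deviating alone from the box (a, t)."""
    auditor_ok = all(payoffs[(a, t)][0] >= payoffs[(other, t)][0]
                     for other in auditor_actions)
    taxpayer_ok = all(payoffs[(a, t)][1] >= payoffs[(a, other)][1]
                      for other in taxpayer_actions)
    return auditor_ok and taxpayer_ok

pure_equilibria = [(a, t) for a in auditor_actions for t in taxpayer_actions
                   if is_pure_nash(a, t)]
print(pure_equilibria)  # prints []
```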
So you can quickly see that there’s no box in which the best
responses coincide, there’s no pure strategy Nash
Equilibria. For those people who are
thinking this is seeming other worldly, you will have to pay
taxes in a couple of years, and trust me your parents are
paying taxes now. So what we want to do here is
we’re going to solve out and find a mixed strategy
equilibrium, but we’re going to give it a
different interpretation to the equilibria we found so far.
But the basic, initial exercise is what?
We’re going to find–we’re going to try and find the
equilibrium here. So to find the Nash Equilibrium
here we know it’s going to be mixed.
So to find the probability with which taxpayers pay their
taxes–and let me already start getting ahead of myself and just
say to find the proportion of taxpayers who are going to pay
their taxes–what do we do? What must be true of that
equilibrium proportion Q of taxpayers who pay their taxes?
How am I going to find that Q? Shout it out somebody.
Yeah look at the auditor’s payoffs.
So from the auditor’s point of view, if the auditor audits,
their payoff is 2Q plus 4(1-Q) and if they don’t audit their
payoff is 4Q plus 0(1-Q). Everyone see how I do this,
this is 2Q plus 4(1-Q) and this is 4Q plus 0(1-Q).
And if indeed the auditor is mixing, then these must be
equal. And if they’re equal,
let’s just do a little bit of algebra here and we’ll find that
2Q equals 4(1-Q) so Q equals 2/3, is that right?
So our claim is to make the auditor exactly indifferent
between whether to audit or not, it must be the case that 2/3 of
the parents of the kids in the room, are going to be paying
their taxes honestly, which means 1/3 aren’t,
which is kind of worrying, but never mind.
Let’s have a look at the taxpayer.
To find, sorry. We found the taxpayer,
we found the proportion of taxpayers who are paying their
taxes, now I want to find out the probability of being
audited. How do I figure out the
equilibrium probability of being audited in this model?
How do I work out the equilibrium probability of being
audited? Shout it out.
So for the equilibrium probability of being audited we’re going to
use P and 1-P, so P is going to be the
probability of being audited, how do I find P?
Yeah, I’m going to look at the taxpayer’s payoffs.
So from the taxpayer’s point of view, if the taxpayer pays their
taxes, their payoff is just 0, and if they cheat their
payoff is -10P plus 4(1-P). And if indeed the taxpayers are
mixing–or in other words, we are saying that not all
taxpayers are cheating and not all taxpayers are honestly
paying their taxes–then these must be equal.
So if these are equal I’m going to get 4P equals 14–no it
didn’t– I’m going to get 4 equals 14P,
let’s try again, 4 equals 14P,
that was a bit worrying, 4 equals 14P,
which is the same as saying P equals 2/7.
If somebody can just check my algebra I think that’s right.
So my claim is that the equilibrium here is for 2/3 of
the taxpayers to pay their taxes and for the audits,
the auditor, to audit 2/7 of the time.
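For anyone who wants to check the algebra, here is a small sketch that solves both indifference conditions exactly. The helper solve_indifference is hypothetical and simply rearranges "payoff of first strategy equals payoff of second strategy".

```python
from fractions import Fraction

def solve_indifference(a, b, c, d):
    """Solve a*x + b*(1 - x) == c*x + d*(1 - x) for x (hypothetical helper)."""
    return Fraction(d - b, a - b - c + d)

# Auditor's payoffs pin down the taxpayers' mix Q (share paying honestly):
# audit pays 2Q + 4(1 - Q), not auditing pays 4Q + 0(1 - Q).
Q = solve_indifference(2, 4, 4, 0)

# Taxpayer's payoffs pin down the audit probability P:
# honest pays 0P + 0(1 - P), cheating pays -10P + 4(1 - P).
P = solve_indifference(0, 0, -10, 4)

print(Q, P)  # prints 2/3 2/7
```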
Now we could go back in here and we could check,
I could do what I did before, I could plug the Ps and Qs in
here and check that in fact this is an equilibrium,
but trust me that I’ve done that, trust me that it’s okay.
So here we have an equilibrium, let’s just write down what it
is. From the auditor’s point of
view it is that they audit 2/7 of the time, or 2/7 of the
population, and from the taxpayers’ point
of view, it’s that they pay their taxes honestly 2/3 of the
time and not otherwise. Now without focusing too much
on these exact numbers for a second, I want to focus first
for a minute on how do we interpret this mixed strategy
equilibrium. So from the point of view of
the auditor we’re really back where we were before with the
base stealer or the person who’s searching baggage at the
airport. We could think of the auditor
literally as randomizing. In fact, there’s some truth to
that. It actually is the case that by
law the auditors literally have to randomize.
So this 2/7,5/7 this has the same interpretation as we had
before. This is really a randomization.
But this 2/3,1/3 has a different interpretation and a
potentially exciting interpretation.
It isn’t that we think that your parents get to tax day,
work out what their taxes would be and then toss a coin.
They may be doing that, I’m looking at the parents and
I don’t think that’s what they’re doing.
The interpretation here is that the parents, some parents are
paying their taxes and some parents aren’t paying their
taxes. There’s a lot of parents out
there, a lot of potential taxpayers, and in the
population, in equilibrium,
if these numbers were true, 2/3, of parents would be paying
their taxes and 1/3 would be cheating.
So this is a randomization by a player, and this is a mixture in
the population. The new interpretation here is,
we could think of the mixed strategy not as players
randomizing, but as a mix in a large
population of which some people are doing one thing and the
other group are doing the other. It’s a proportion of people
paying taxes. So I don’t know if this 2/3,1/3
is an accurate number for the U.S.
It’s probably not very far off actually.
For Italy I’m ashamed to say the number of people who pay
taxes is more like 40%, maybe even lower now,
and there are countries I think where it gets as high as 90%.
I think the U.S. rate when they end up auditing
is a little higher than this but not much.
So again, we’re going to think of this not as randomization but
as a prediction of the proportion of American taxpayers
who are going to pay their taxes.
Now, I want to use this example in the time we have left,
to actually think about a policy experiment.
So let’s put this up somewhere we can see it.
Let’s think about a new tax policy.
So suppose that Congress gets fed up with all these newspaper
reports about how 1/3 of Americans don’t pay their taxes
or whatever the true proportion is,
I think it’s actually a little higher than that but never mind.
They get fed up with all these reports and they say,
this isn’t fair, we should make people pay their
taxes so we’re going to change the law and instead of
paying–instead of being in jail for ten years,
or the equivalent of a fine of -10 if you’re caught cheating,
we’re going to raise the fine or the time in jail so that it’s
now -20. So the policy experiment is
let’s raise the fine–to fine the cheating–to -20 and the aim
of this policy is try to deter cheating, right?
It seems a plausible thing for a government to want to do.
Let’s redraw the matrix, so here’s the game: (2,0), (4,-20),
(4,0), (0,4), with audit and not audit for the auditor, and pay
honestly or cheat for the taxpayer. So here’s our new payoffs and
let’s ask the question, with this new fine in place,
now we’ve raised the fine, to being caught not paying your
taxes, in the long run once things have worked their way
back into equilibrium again, after a few years,
do we expect American taxpaying compliance to go up or to go
down, or what do we expect?
What do we think is going to happen?
So who thinks it’s going to go up?
Who thinks it’s going to go down?
Who thinks it’s going to stay the same?
Who’s abstaining here? I notice the parents are
abstaining. You’re not really meant to
abstain. You have to vote here.
Well how are we going to figure this out?
How are we going to figure out what’s going to happen to
compliance? What happens to tax compliance?
Tax compliance, remember that was our P–no it
wasn’t, sorry, it was our Q.
The only way we’re going to figure this out is to work out,
so let’s work out the new Q in equilibrium.
Let’s do this, so to find out the new Q in
equilibrium, once again, we’re going to have to look at
the auditor’s payoffs, and the auditor’s payoffs if
they audit, they’re going to get 2Q plus 4(1-Q),
and if they don’t audit they’re going to get 4Q plus 0(1-Q),
and if the auditor is indifferent,
if they’re mixing, it must still be the case that
these are equal. And now I want to ask you a
question, where have you seen that equation before?
Yeah, it’s still there right, I didn’t delete it.
It’s the same equation that sits up there.
Is that right? From the auditor’s point of
view, given the payoffs to the auditors nothing has changed,
so the tax compliance rate that makes the auditor exactly
indifferent between auditing your parents and not auditing
your parents, is still exactly the same as it
was before at 2/3. In equilibrium,
tax compliance hasn’t changed at all.
Let me say that again, the policy was we’re going to
double the fines for being caught cheating and in
equilibrium it made absolutely no difference whatsoever to the
equilibrium tax compliance rate. Now why did it make no
difference? Well let’s have a techie answer
and then a better, a more intuitive answer.
The techie answer is this, what determines the equilibrium
tax compliance rate, what determines the equilibrium
mix for the column player is what?
Is the row’s payoffs. What determines the equilibrium
mix for the column player are the row’s payoffs–row player’s
payoffs. We didn’t change the row
player’s payoffs, so we’re not going to change
the equilibrium mix for the column player.
Say again, we changed one of the payoffs for the column
player but the column player’s equilibrium mix depends on the
row player’s payoffs and we haven’t changed the row player’s
payoffs, so we won’t change the
equilibrium compliance rate, the equilibrium mix by the
column player. What will have changed here?
What will have changed in the new equilibrium?
So we’ve pretty much established that people are
cheating as much as they were before in equilibrium.
Rahul, can I get Henry here? Student: Probability has
changed. Professor Ben Polak: Say
again. Student: The probability
of audit would have changed. Professor Ben Polak: The
probability of audit will have changed.
What’s going to change is not the Q but the P,
the probability with which you’re audited is going to
change in this model. Let’s just check it,
to find the new P, I need to look at the
taxpayer’s payoffs and the taxpayer’s payoffs are now 0,
–sorry, if they pay their taxes
honestly then they get 0, and if they cheat they get -20
with probability P and 4 with probability 1-P.
If they’re mixing, if some of them are paying and
some of them are not, this must be the same,
and I’m being more careful than I was last time I hope,
this gives me 24 P is equal to 4 or P equals 1/6.
So the audit rate has gone down from 2/7 to 1/6.
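The policy experiment can be rerun for any size of fine with a short sketch. The function audit_game_equilibrium and its parameters fine and gain are invented for this illustration, and the auditor's payoffs are held fixed at the values on the board.

```python
from fractions import Fraction

def audit_game_equilibrium(fine, gain=4):
    """Return (Q, P) for the toy audit game (hypothetical helper).

    Q is the share of taxpayers paying honestly, P the audit probability.
    fine is the size of the penalty for being caught (10 or 20 in the lecture),
    gain is the payoff from cheating and not being audited (4 in the lecture).
    """
    # Auditor indifferent: 2Q + 4(1 - Q) = 4Q, so 4 = 6Q. The fine never appears.
    q = Fraction(4, 6)
    # Taxpayer indifferent: 0 = -fine * P + gain * (1 - P).
    p = Fraction(gain, gain + fine)
    return q, p

before = audit_game_equilibrium(fine=10)
after = audit_game_equilibrium(fine=20)
print(before, after)
```

In this sketch the compliance rate is 2/3 under both fines, while the audit probability moves from 2/7 to 1/6.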
I’m guessing that probably wasn’t the goal of the policy
although it isn’t necessarily a bad thing.
There is some benefit for society here,
because audits are costly, both to do for the auditor and
they’re unpleasant to be audited,
so the fact that we’ve managed to lower the audit rate from 2/7
to 1/6 is a good thing, but we didn’t manage to raise
the compliance rate. So I don’t want to take this
model too literally, because it’s just a toy model,
but nevertheless, let’s try and draw out some
lessons from this model. So here what we did was we
changed the payoff to cheating, we made it worse.
But a different kind of change is we could have changed,
sorry, we changed the payoff negatively to being caught
cheating. But a different change we could
have done is we could have left the -10 in place and we could
have raised the payoff to cheating and not getting caught.
We could have left this -10 in place and changed this 4 let’s
say to a 6 or an 8. We’ve increased the benefits to
cheating if you’re not caught. What would that have done in
equilibrium? So I claim, once again,
that would have done nothing in equilibrium to the probability
of people paying their taxes, but that would have done what
to the audit rate? The audit rate would have gone
up, the equilibrium audit rate would have gone up.
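To see the size of that effect, here is a rough sketch that keeps the fine at 10 and raises the payoff from cheating undetected from 4 to 6 and then 8, the values floated above. The names FINE and audit_rates are made up for the illustration.

```python
from fractions import Fraction

FINE = 10  # penalty for being caught cheating, left at its original level

# Taxpayer indifferent between honest (0) and cheating:
# 0 = -FINE * P + gain * (1 - P), so P = gain / (gain + FINE).
audit_rates = {gain: Fraction(gain, gain + FINE) for gain in (4, 6, 8)}

for gain, rate in audit_rates.items():
    print(gain, rate)
```

The audit probability rises with the gain, from 2/7 to 3/8 to 4/9, while the compliance rate stays at 2/3 because the auditor's payoffs have not moved.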
Let’s tell that story a second. So rich people,
people who are well paid, have a little bit more to gain
from cheating on their taxes if they’re not caught,
there’s more money at stake. So my colleagues who are
finance professors in the business school have more money
on their tax returns than I do, so in principle,
they gain more if they cheat. Does that mean that they cheat
more than me in equilibrium? No, it doesn’t mean that they
cheat more than me in equilibrium.
What does it mean? It means they get audited more
often. In equilibrium,
richer people aren’t necessarily going to cheat more,
but they are going to get audited more,
and that’s true. The federal audit rates are
designed so they audit the rich more than they audit the poor.
Again, it’s not because they think the rich are inherently
less honest, or the poor are inherently more honest,
or anything like that, it’s simply that the gains to
cheating and not getting caught are bigger if you’re rich,
so you need to audit more to push back into equilibrium.
Now, suppose we did in fact want to use the policy of
raising fines to push down, to push up the compliance rate,
to push down cheating. How would we change the law?
Suppose we want to raise the fines for cheating,
we don’t like people cheating so we raise the fines,
but we’re worried about this result that didn’t push up
compliance rates, how could we change the law or
change the incentives in the game so that it actually would
change compliance rates? What could we do?
Yeah. Student: If we changed
the payoffs of auditing to 4, from 2 to 4.
Professor Ben Polak: Good, if we want to change the
compliance rates we should change the payoffs to the
auditor. The problem with the way the
auditor is paid here is that the auditor is paid more if they
manage to catch people, but audits are costly.
The problem with that is when you raise the fine on the other
side, all that happens is the auditors audit less often in
equilibrium. So if you want to get a higher
compliance rate, one thing you could do is
change the payoffs to the auditor to make auditing less
costly for them, or making catching people nicer
for them, give them a reward, or you could simply take it out
of Game Theory altogether. You could enforce,
you could have congressional law that sets the audit rates
outside of equilibrium, and that’s been much discussed
in Congress over the last five years.
Somebody setting audit rates, as it were, exogenously by
Congress. Why might that not be a great
idea? Leaving aside Economic Theory
for a second, leaving aside Game Theory,
why might it not be a great idea to have Congress set the
audit rates rather than some office?
Student: Most members in Congress have a lot of money so
they’re going to lower the audit rates so that they don’t get
audited. Professor Ben Polak: So
the lady in the front is saying that a lot of Congressmen are
rather rich, so maybe they have particular
incentives here. I don’t want to take a
particular political stance here, but it could be whatever
side of the political spectrum you guys sit on,
it could be that you might not trust Congress to get this
right. You might think there are going
to be political considerations going on in Congress other than
just having an efficient tax system.
Okay, so what do I want to draw out as lessons here?
The big lessons from this class are there are three different
ways to think about randomization in equilibrium or
out of equilibrium. One is it’s genuinely
randomization, another is it could be
something about people’s beliefs,
and a third way and a very important way is it could be
telling us something about the proportion of people who are
doing something in society, in this case the proportion of
people who are paying tax. A second important lesson I
want to draw out here, beyond just finding equilibria,
two other things we drew out today, one lesson was when
you’re checking equilibria, checking mixed strategy
equilibria, you only have to check for pure strategy
deviations. Be careful, you have to check
for all possible pure strategy deviations, not just the pure
strategies that were involved in the mix.
If the guy has seven strategies and is only mixing on two,
you have to remember to check the other five.
The third lesson I want to draw out today is because of the way
equilibria works, mixed strategy equilibria work,
if I change the column player’s payoffs it changes the row
player’s equilibrium mix, and if I change the row
player’s payoffs, it changes the column player’s
equilibrium mix. Next time, we’re going to pick
up this idea that mixed strategies can be about
proportions of people playing things and take it to a totally
different setting, namely evolution.
So on Wednesday we’ll start talking about evolution.