The weight adjustment for any particular subcase is based on the
threshold hand value needed for the observed action. The threshold
is in turn based on the observed frequency of folding, calling and
raising of the player in question (or the default
frequencies when we are using generic opponent modeling
or have insufficient data). From these frequencies we deduce
the average ($\mu$, representing the mean hand) and spread
($\sigma$, to account for uncertainty) of the threshold needed
for the observed action.
The values $\mu$ and $\sigma$ define an expected distribution of
hands to be played, realizing a transformation
function for re-weighting. We take some ranked distribution
of all possible hands held (IR for the pre-flop and EHS' for
the post-flop rounds, meaning we presume our opponents rank hands
like we do) and each hand h is compared to $\mu$ and $\sigma$ and
re-weighted accordingly (see Figures 7.1 and 7.2).
When the value for h is equal to $\mu$, the re-weighting value is 0.5;
when it is more than $\sigma$ below $\mu$, it is 0.01; when it is more
than $\sigma$ above $\mu$, it is 1 (the weight is unchanged); and when
it is within $\sigma$ of $\mu$ it
is linearly interpolated between 0.01 and 1 (we use a simple linear
interpolation because it is easy and we do not know what the
distribution should look like).
Since we do not want to completely rule out any legal subcase we do
not allow the weight to go below 0.01.
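The transformation described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the names
reweight_factor, reweight and MIN_WEIGHT are hypothetical, the hand
values are assumed to be on the same scale as $\mu$ and $\sigma$, and the
floor is applied to the resulting weight as one reading of the text. Note
that a straight line from 0.01 at $\mu - \sigma$ to 1 at $\mu + \sigma$
passes through 0.505 at $\mu$, which is the "0.5" quoted above up to rounding.

```python
MIN_WEIGHT = 0.01  # floor, so that no legal subcase is ruled out entirely


def reweight_factor(value, mu, sigma):
    """Re-weighting factor for a hand whose ranked value is `value`.

    0.01 at or below mu - sigma, 1.0 at or above mu + sigma,
    and a linear ramp in between.
    """
    if sigma <= 0:
        # Assumption: with no spread, fall back to a hard step at mu.
        return 1.0 if value >= mu else MIN_WEIGHT
    if value <= mu - sigma:
        return MIN_WEIGHT
    if value >= mu + sigma:
        return 1.0
    t = (value - (mu - sigma)) / (2.0 * sigma)  # position on the ramp, 0..1
    return MIN_WEIGHT + t * (1.0 - MIN_WEIGHT)


def reweight(weights, values, mu, sigma):
    """Return a new weight list; `values[i]` is the ranked value of hand i."""
    return [
        max(MIN_WEIGHT, w * reweight_factor(v, mu, sigma))
        for w, v in zip(weights, values)
    ]
```

For instance, reweight([1.0, 1.0], [0.5, 0.95], 0.8, 0.1) leaves the
strong hand at weight 1.0 and reduces the weak hand to 0.01.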
The value $\mu$ is based on the threshold needed for the observed
action, which is in turn based on the frequency.
For example, consider that player p has been observed
100 times in the pre-flop with 2 bets to call: 20 times it was
raised, 70 times called and 10 times folded. If we wanted to re-weight
based on a raise we compute the frequency of raising:
$20/100 = 0.20$, meaning the threshold is $\mu = 1 - 0.20 = 0.80$
(player p raises with the top 20% of hands).
However, if we wanted to re-weight based on
a call (frequency 0.70) we must note that the threshold for a call is not
$\mu = 1 - 0.70 = 0.30$ but is rather 0.70 below the raise threshold:
$\mu = 0.80 - 0.70 = 0.10$
(player p calls with the middle 70% of hands).
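The arithmetic of this example can be written out as a short Python
sketch. The function name action_thresholds and its dictionary result
are hypothetical, and the sketch covers only the percentile case
described here, not the round-specific adjustments (such as IR values).

```python
def action_thresholds(n_fold, n_call, n_raise):
    """Derive threshold means (mu) for raise and call from observed counts.

    Assumes hands are ranked on a 0..1 percentile scale, so a player who
    raises with frequency f is taken to raise with the top f of hands.
    """
    total = n_fold + n_call + n_raise
    if total == 0:
        raise ValueError("no observations; use default frequencies instead")
    raise_freq = n_raise / total
    call_freq = n_call / total
    mu_raise = 1.0 - raise_freq      # top `raise_freq` of hands
    mu_call = mu_raise - call_freq   # the band just below the raise threshold
    return {"raise": mu_raise, "call": mu_call}


# Player p: 10 folds, 70 calls, 20 raises out of 100 observations.
thresholds = action_thresholds(n_fold=10, n_call=70, n_raise=20)
```

With these counts the raise threshold comes out at 0.80 and the call
threshold at 0.10 (up to floating-point rounding), matching the figures above.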
There are other intricacies
dependent on the actual round of play, such as determining
$\mu$ and $\sigma$
using IR values (which are non-percentile), which will be
discussed in the following sections.
One immediately noticeable source of error is that this system presumes
a proper distribution over the hand rankings (i.e. the hands above
$\mu = 0.80$ represent 20% of all hands).
This is not true for the post-flop rounds, because EHS' is
optimistically increased with PPOT. However, the relative ranking
of hands is still correct.
As a final note, to prevent the weight array from being distorted
by automatic or false actions, we only perform one re-weighting per model
per round. We store a copy of the weight array at the beginning
of the round and each time a new action is witnessed requiring
a higher threshold, the saved weight array is used in the re-weighting.
For example, if we witness opponent p calling a bet, we
may re-weight using some $\mu$, say 0.5.
If, later in the betting round, we see that opponent raise,
re-weighting will be done with the higher value of $\mu$,
and is based on the original weight array.
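One way to picture this bookkeeping is the following Python sketch. The
class RoundReweighter and its methods are hypothetical names, the ramp
function repeats the linear interpolation described earlier, and
ignoring an action whose threshold is not higher than the one already
applied is an assumption drawn from the wording above.

```python
MIN_WEIGHT = 0.01  # weights are never allowed below this floor


def ramp(value, mu, sigma):
    """Linear re-weighting factor: 0.01 below mu - sigma, 1.0 above mu + sigma."""
    if value <= mu - sigma:
        return MIN_WEIGHT
    if value >= mu + sigma:
        return 1.0
    t = (value - (mu - sigma)) / (2.0 * sigma)
    return MIN_WEIGHT + t * (1.0 - MIN_WEIGHT)


class RoundReweighter:
    """Applies at most one effective re-weighting per betting round."""

    def __init__(self, weights):
        self.weights = list(weights)
        self._saved = None   # copy of the weights at the start of the round
        self._mu = None      # highest threshold applied so far this round

    def start_round(self):
        self._saved = list(self.weights)
        self._mu = None

    def observe(self, values, mu, sigma):
        """Re-weight from the saved array if `mu` exceeds the previous threshold."""
        if self._saved is None:
            raise RuntimeError("call start_round() before observe()")
        if self._mu is not None and mu <= self._mu:
            return  # assumption: lower or equal thresholds change nothing
        self._mu = mu
        self.weights = [
            max(MIN_WEIGHT, w * ramp(v, mu, sigma))
            for w, v in zip(self._saved, values)
        ]
```

Because each call to observe starts again from the saved array, a call
followed by a raise in the same round has the same effect as the raise
alone; the two adjustments do not compound.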