
## 7.2.1 Re-Weighting System

The weight adjustment for any particular subcase is based on the threshold hand value needed for the observed action. The threshold is in turn based on the observed frequency of folding, calling and raising of the player in question (or the default frequencies when we are using generic opponent modeling or have insufficient data). From these frequencies we deduce the average ($\mu$, representing the mean hand) and spread ($\sigma$, to account for uncertainty) of the threshold needed for the observed action.

The values $\mu$ and $\sigma$ define an expected distribution of hands to be played, realizing a transformation function for re-weighting. We take some ranked distribution of all possible hands held (IR for the pre-flop and EHS' for the post-flop rounds, meaning we presume our opponents rank hands like we do) and each hand h is compared to $\mu$ and $\sigma$ and re-weighted accordingly (see Figures 7.1 and 7.2). When the value for h is equal to $\mu$, the re-weighting value is 0.5; when it is more than $\sigma$ below $\mu$, it is 0.01; when it is more than $\sigma$ above $\mu$, it is 1 (the weight is unchanged); and when it is within $\sigma$ of $\mu$ it is linearly interpolated between 0.01 and 1 (we use a simple linear interpolation because it is easy and we do not know what the distribution should look like). Since we do not want to completely rule out any legal subcase, we do not allow the weight to go below 0.01.
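The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from Figure 7.1: the names `reweight_factor`, `reweight`, `LOW_WT` and `HIGH_WT` are hypothetical, hand values are assumed to be plain floats on the same scale as $\mu$, and the ramp is assumed to run linearly from 0 at $\mu - \sigma$ to 1 at $\mu + \sigma$ before being clamped, which gives exactly 0.5 at $\mu$. Applying the 0.01 floor to the resulting weight as well as to the factor is one reading of the last sentence above.

```python
LOW_WT = 0.01   # floor, so that no legal subcase is ever ruled out completely
HIGH_WT = 1.00  # ceiling: the weight is left unchanged


def reweight_factor(value, mu, sigma):
    """Re-weighting value for one hand (assumed clamped linear ramp)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # 0 at mu - sigma, 0.5 at mu, 1 at mu + sigma
    factor = (value - mu + sigma) / (2.0 * sigma)
    return min(HIGH_WT, max(LOW_WT, factor))


def reweight(weights, hand_values, mu, sigma):
    """Return a new weight table; both arguments are dicts keyed by hand."""
    return {
        hand: max(LOW_WT, w * reweight_factor(hand_values[hand], mu, sigma))
        for hand, w in weights.items()
    }
```

With $\mu = 0.5$ and $\sigma = 0.1$, for instance, a hand valued at 0.3 receives the floor factor, a hand valued at 0.5 is halved, and a hand valued at 0.9 keeps its weight.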

The value $\mu$ is based on the threshold needed for the observed action, which is in turn based on the frequency. For example, consider that player p has been observed 100 times in the pre-flop with 2 bets to call: 20 times it was raised, 70 times called and 10 times folded. If we wanted to re-weight based on a raise we compute the frequency of raising: $20/100 = 0.20$, meaning the threshold is $\mu = 1 - 0.20 = 0.80$ (player p raises with the top 20% of hands). However, if we wanted to re-weight based on a call (frequency 0.70) we must note that the threshold for a call is not $\mu = 1 - 0.70 = 0.30$ but is rather 0.70 below the raise threshold: $\mu = 0.80 - 0.70 = 0.10$ (player p calls with the middle 70% of hands). There are other intricacies dependent on the actual round of play, such as determining $\sigma$ and using IR values (which are non-percentile), which will be discussed in the following sections. One immediately noticeable source of error is that this system presumes a proper distribution over the hand rankings (i.e. the hands above $\mu = 0.80$ represent 20% of all hands). This is not true for the post-flop rounds, because EHS' is optimistically increased with PPOT. However, the relative ranking of hands is still correct.
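The arithmetic of this example can be written out as a short Python sketch. The function name `action_thresholds` and its return format are hypothetical, and the sketch covers only the percentile case described here (it ignores the conversion to IR values and the choice of spread, which are treated in later sections).

```python
def action_thresholds(raises, calls, folds):
    """Percentile thresholds for a raise and a call from observed counts."""
    total = raises + calls + folds
    if total == 0:
        raise ValueError("no observations; fall back to default frequencies")
    raise_freq = raises / total
    call_freq = calls / total
    # A raise is made with the top raise_freq fraction of hands.
    mu_raise = 1.0 - raise_freq
    # A call is made with the band of hands just below the raise threshold.
    mu_call = mu_raise - call_freq
    return {"raise": mu_raise, "call": mu_call}


# The example from the text: 20 raises, 70 calls, 10 folds.
thresholds = action_thresholds(20, 70, 10)
print(round(thresholds["raise"], 2), round(thresholds["call"], 2))
```

For the counts in the example this gives a raise threshold of 0.80 and a call threshold of 0.10, up to floating-point rounding.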

As a final note, to prevent the weight array from being distorted by automatic or false actions, we only perform one re-weighting per model per round. We store a copy of the weight array at the beginning of the round and each time a new action is witnessed requiring a higher threshold, the saved weight array is used in the re-weighting. For example, if we witness opponent p calling a bet, we may re-weight using some $\mu$, say 0.5. If, later in the betting round, we see that opponent raise, re-weighting will be done with the higher value of $\mu$, and is based on the original weight array.
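One way to picture this bookkeeping is the following Python sketch. The class `RoundReweighter` and its methods are hypothetical, the clamped linear ramp is an assumed form of the re-weighting function, and ignoring any action whose threshold is not higher than the one already applied is one reading of the rule above.

```python
LOW_WT = 0.01
HIGH_WT = 1.00


def reweight_factor(value, mu, sigma):
    """Assumed clamped linear ramp: 0.5 at mu, 1 at or above mu + sigma."""
    factor = (value - mu + sigma) / (2.0 * sigma)
    return min(HIGH_WT, max(LOW_WT, factor))


class RoundReweighter:
    """Holds one opponent's weights for a single betting round."""

    def __init__(self, weights, hand_values):
        self.saved = dict(weights)    # copy taken at the start of the round
        self.current = dict(weights)  # weights after the latest re-weighting
        self.hand_values = hand_values
        self.applied_mu = None        # highest threshold applied so far

    def observe(self, mu, sigma):
        """Re-weight from the saved copy if this action needs a higher mu."""
        if self.applied_mu is not None and mu <= self.applied_mu:
            return self.current       # no new information; keep weights
        self.applied_mu = mu
        # Always start from the saved array, never from self.current,
        # so a call followed by a raise is not counted twice.
        self.current = {
            h: max(LOW_WT, w * reweight_factor(self.hand_values[h], mu, sigma))
            for h, w in self.saved.items()
        }
        return self.current
```

A call followed by a raise would then be two calls to `observe`, the second with the larger threshold, and the result depends only on the saved array and the raise threshold.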

Denis Papp
1998-11-30