Due Nov 4, 11:55pm.
Submit via eClass submission link on the main eClass course page.
Late submission (with 20% deduction) Nov 6, 11:55pm.
Submit via separate eClass link for late submissions.
In this assignment, you develop a probabilistic playout
policy
for Binary Game.
You add your code to our starter code a3.py
,
a random Binary Game player similar to the starter code
in Assignment 2.
assignment3
contains:
a3.py
, similar starter code as for assignment 2.
This is based on a sample solution to assignment 1,
and implements a random Binary Game player.
nopattern.txt
, twopattern.txt
, fullpattern.txt
is provided.
These contain different weights
to use in your probabilistic playout policy (see below).
assignment3-public-tests.txt
with some sample test cases for your program.
a3test.py
.
Use this code to test your solution with
python3 a3test.py a3.py assignment3-public-tests.txt
a3.py
- this is what our test process will call.
Go3
player
Implement a probabilistic playout policy for Binary Game defined by simple pattern weights stored in a file. Use a default weight if no pattern applies in a row or column.
In probabilistic simulations, different legal moves may have different probabilities of being chosen in simulation. For example, in class we look at Remi Coulom's MM algorithm to find good move weights in the case of Go.
In this assignment we use simpler patterns and weights for Binary Game. These are already given to you in a file. Your job is to implement the probabilistic simulation policy that uses these patterns and weights.
In this assignment, each move has both a row weight and a column weight. Both weights are defined by 5x1 patterns, with a default value when no pattern matches. The weight of each pattern/digit combination is stored in a text file.
A pattern is a sequence of five points.
Each point is either 0, or 1, or empty (indicated by .), or off the board
(indicated by X).
The center point of each pattern must be empty - this is the location of the move that the pattern is for.
An example of a pattern that is completely on the board is 10..1
.
An example of a pattern for a move on the first line is XX..1
. Note the two XX which indicate that those locations are outside the board.
A pattern is said to match a 5x1 area around a point on the board if the sequence of
points is the same, with each X matching a location outside the board.
For example, on the board
..1.. 10..1
the pattern 10..1
matches horizontally in the middle
of the second row.
The pattern XX..1
matches horizontally at the
top left corner.
Patterns can also be rotated by 90, 180, or 270 degrees, to match columns instead of rows, and to match a mirror image of the pattern on the board.
Example:
the pattern X0.1.
matches in eight places on the board -
all the neighbors of a corner point:
0.1.0 ..... 1...1 ..... 0.1.0
A pattern file, such as nopattern.txt
and twopattern.txt
,
contains a (possibly empty) list of patterns, moves and weights.
When a pattern file is read, do the following:
Each line in a pattern file has the following format:
pattern move weight
For example, the content of twopattern.txt
is:
10..1 0 20 10..1 1 12
By the first line, playing a 0 in the middle of the pattern
has a weight of 20.
(Comment: If that move is played, then afterwards there is a different pattern 100.1
at this location.)
By the second line, playing a 1 in the same pattern has a weight of 12.
(The result would be 101.1
).
If there is no matching pattern for a move, then the default weight for that row or column is 10.
We use the same patterns for both rows and columns in all four rotations. The total weight of a move is the sum of its row weight and its column weight. Note that one or both of these weights can be the default (10), if there is no matching pattern in that direction.
Our patterns have been designed in a way such that for each location/digit combination:
In a probabilistic policy, each move is selected in proportion to its weight. To compute the move probabilities, we need to first compute the sum W of all move weights, then divide the weight of each move by W.
Example: assume we have three legal moves with weights 5, 12 and 3. Then W = 5+12+3 = 20, and the probabilities of choosing these moves are 5/20 = 0.25, 12/20 = 0.6 and 3/20 = 0.15.
You implement two text commands:
loadpatterns filenameFirst, delete all existing patterns in your program. Then, load all the new patterns from the file given by
filename
,
and use all these patterns for your policy from now on.
When your program first starts up, there should be no patterns loaded.
policy_moves
This command should print a sorted list of all legal moves and their probabilities, rounded to at most three digits after the decimal point, for the current simulation policy in the format
= move1 prob1 move2 prob2 ...moven probnSee Details and Examples for
policy_moves
All other existing text commands should be left as-is. Especially,
game
and play
are needed to set up test cases.
python3 a3test.py a3.py assignment3-public-tests.txt
assignment3.tgz
which can be uncompressed
with
tar xvfz assignment3.tgz
a3.py
in the assignment3
directory.
Follow the same general steps as in assignment 1 to
create your
presubmission.log
file and your
submission,
but (of course) using your assignment3
directory,
assignment3.tgz
as file name,
and assignment3-public-tests.txt
as test.
Remember to include your new
presubmission.log
and readme.txt
.
random.choices
can deal with both uniform random and weight-based probabilistic selection.
See prob_select.py for an example.
round(p, 3)
.
00.11
, then there will not be a 11.00
pattern in the file.
Instead, you try to match the given pattern in all four rotations:
00.11
11.00
0 0 . 1 1and
1 1 . 0 0
policy_moves
Moves should be numerically sorted in increasing order by x-coordinate first, by y-coordinate second, and by digit last. So for example, 0 0 0 before 0 0 1 before 0 1 0 before 1 0 0, and 9 0 0 before 10 0 0.
Example:
We make a 2x2 board and play a move: game 2 2 play 1 0 1 Then we call policy_moves: policy_moves 0 0 0 0.25 0 1 0 0.25 0 1 1 0.25 1 1 0 0.25 We then play another move. This time fewer moves are left: play 1 1 0 policy_moves 0 0 0 0.5 0 1 1 0.5
policy_moves
Here are some explanations for the first few examples
in assignment3-public-tests.txt
.
policy_moves
command.
..1.. 10..1There are still no patterns loaded. All 8 legal moves have a probability of 1/8 = 0.125.
loadpatterns twopattern.txt
and run policy_moves
again.
The probabilities are now affected by the given patterns. loadpatterns nopattern.txt
to remove the previous patterns.
The probabilities in policy_moves
are back to uniform.loadpatterns fourpattern.txt
to consider a pattern that is closer to the edge.loadpatterns sixpattern.txt
to illustrate how rotations are also important to generate the policy moves.loadpatterns fullpattern.txt
and tests it on an empty 10x10 board.
Compare the result to the third test.There will again be 5 marks for this assignment.
presubmission.log
which shows
the log of your correct test execution, and your readme.txt
file
(as in Assignment 1).Last modified: Oct 11, 2024 by Abbas Masoumzadeh and Martin Müller