Contents

Cmput 455 Assignment 3

Due Nov 4, 11:55pm. Submit via eClass submission link on the main eClass course page.
Late submission (with 20% deduction) Nov 6, 11:55pm. Submit via separate eClass link for late submissions.

In this assignment, you develop a probabilistic playout policy for Binary Game. You add your code to our starter code a3.py, a random Binary Game player similar to the starter code in Assignment 2.

Setup

  1. First, as in previous assignments, make sure you have your Python 3 and NumPy set up. You can review the procedures under Lecture 3 activities.
  2. Download assignment3.tgz and expand it. The directory assignment3 contains:
  3. Modify/add code in directory assignment3 only to implement your solution. Your main program needs to remain in a3.py - this is what our test process will call.
  4. You do not need to write a complete player in this assignment. Just implement the commands below.
  5. For more information about concepts related to this assignment, see the following resources:

Goals of the Assignment

Implement a probabilistic playout policy for Binary Game defined by simple pattern weights stored in a file. Use a default weight if no pattern applies in a row or column.

Pattern Weights for Probabilistic Simulations

In probabilistic simulations, different legal moves may have different probabilities of being chosen in simulation. For example, in class we look at Remi Coulom's MM algorithm to find good move weights in the case of Go.

In this assignment we use simpler patterns and weights for Binary Game. These are already given to you in a file. Your job is to implement the probabilistic simulation policy that uses these patterns and weights.

Move patterns and weights

In this assignment, each move has both a row weight and a column weight. Both weights are defined by 5x1 patterns, with a default value when no pattern matches. The weight of each pattern/digit combination is stored in a text file.

A pattern is a sequence of five points. Each point is either 0, or 1, or empty (indicated by .), or off the board (indicated by X). The center point of each pattern must be empty - this is the location of the move that the pattern is for. An example of a pattern that is completely on the board is 10..1. An example of a pattern for a move on the first line is XX..1. Note the two XX which indicate that those locations are outside the board. A pattern is said to match a 5x1 area around a point on the board if the sequence of points is the same, with each X matching a location outside the board. For example, on the board

..1..
10..1

the pattern 10..1 matches horizontally in the middle of the second row. The pattern XX..1 matches horizontally at the top left corner.

Patterns can also be rotated by 90, 180, or 270 degrees, to match columns instead of rows, and to match a mirror image of the pattern on the board.

Example: the pattern X0.1. matches in eight places on the board - all the neighbors of a corner point:

0.1.0
.....
1...1
.....
0.1.0

A pattern file, such as nopattern.txt and twopattern.txt, contains a (possibly empty) list of patterns, moves and weights. When a pattern file is read, do the following:

  1. First, remove all previous patterns (if any) from your program.
  2. Read all the new patterns, line by line, from a file and add them to your program.
  3. Lines starting with a hash mark # are comments and should be discarded when reading.

Each line in a pattern file has the following format:

pattern move weight

For example, the content of twopattern.txt is:

10..1 0 20
10..1 1 12

By the first line, playing a 0 in the middle of the pattern has a weight of 20. (Comment: If that move is played, then afterwards there is a different pattern 100.1 at this location.) By the second line, playing a 1 in the same pattern has a weight of 12. (The result would be 101.1 ).

If there is no matching pattern for a move, then the default weight for that row or column is 10.

We use the same patterns for both rows and columns in all four rotations. The total weight of a move is the sum of its row weight and its column weight. Note that one or both of these weights can be the default (10), if there is no matching pattern in that direction.

Our patterns have been designed in a way such that for each location/digit combination:

  1. In the row, there will be either no pattern, or exactly one pattern, that matches.
  2. In the column as well, there will be either no pattern, or exactly one pattern, that matches.

From Weights to Probabilities

In a probabilistic policy, each move is selected in proportion to its weight. To compute the move probabilities, we need to first compute the sum W of all move weights, then divide the weight of each move by W.

Example: assume we have three legal moves with weights 5, 12 and 3. Then W = 5+12+3 = 20, and the probabilities of choosing these moves are 5/20 = 0.25, 12/20 = 0.6 and 3/20 = 0.15.

Text Commands

You implement two text commands:

Testing Procedure

Test your solution with
python3 a3test.py a3.py assignment3-public-tests.txt

The Usual Warnings - Read them All

Pre-submission Test and Submission

Follow the same general steps as in assignment 1 to create your presubmission.log file and your submission, but (of course) using your assignment3 directory, assignment3.tgz as file name, and assignment3-public-tests.txt as test. Remember to include your new presubmission.log and readme.txt.

Hints and Details - Read them All

Details and Examples for policy_moves

Moves should be numerically sorted in increasing order by x-coordinate first, by y-coordinate second, and by digit last. So for example, 0 0 0 before 0 0 1 before 0 1 0 before 1 0 0, and 9 0 0 before 10 0 0.

Example:

We make a 2x2 board and play a move:
game 2 2
play 1 0 1
Then we call policy_moves:
policy_moves
0 0 0 0.25 0 1 0 0.25 0 1 1 0.25 1 1 0 0.25
We then play another move. This time fewer moves are left:
play 1 1 0
policy_moves
0 0 0 0.5 0 1 1 0.5

Examples for policy_moves

Here are some explanations for the first few examples in assignment3-public-tests.txt .

  1. The first test is on a 1x1 board, with no patterns. There are two moves, both with probability 0.5. Each move has a row weight and a column weight of 10, which is the default value when no pattern matches. So both moves have a weight of 20, and their probability is 20/(20+20) = 0.5. After a move is played, there is no move left to print with the policy_moves command.
  2. The second test is on a 2x1 board, still with no patterns. There are four moves, all with probability 0.25.
  3. The third test is on an empty 10x10 board. There are 200 moves, all with probability 0.005.
  4. For the next few tests we setup the following 5x2 board with play commands:
    ..1..
    10..1
    
    There are still no patterns loaded. All 8 legal moves have a probability of 1/8 = 0.125.
  5. Now we run loadpatterns twopattern.txt and run policy_moves again. The probabilities are now affected by the given patterns.
  6. Next, we run loadpatterns nopattern.txt to remove the previous patterns. The probabilities in policy_moves are back to uniform.
  7. Next we run loadpatterns fourpattern.txt to consider a pattern that is closer to the edge.
  8. This time we run loadpatterns sixpattern.txt to illustrate how rotations are also important to generate the policy moves.
  9. The last test runs loadpatterns fullpattern.txt and tests it on an empty 10x10 board. Compare the result to the third test.

Marking

There will again be 5 marks for this assignment.


Last modified: Oct 11, 2024 by Abbas Masoumzadeh and Martin Müller