Assignment Overview
Setup
Goals of the Assignment
- Probabilistic simulations
- Text Commands
Testing Procedure
The Usual Warnings - Read them All
Pre-submission Test and Submission
Hints and Details - Read them All
Marking

Cmput 455 Assignment 3

Due Nov 4, 11:55pm. Submit via eClass submission link on the main eClass course page.
Late submission (with 20% deduction) Nov 6, 11:55pm. Submit via separate eClass link for late submissions.

In this assignment, you develop a probabilistic playout policy for Binary Game. You add your code to our starter code a3.py, a random Binary Game player similar to the starter code in Assignment 2.

Setup

First, as in previous assignments, make sure you have your Python 3 and NumPy set up. You can review the procedures under Lecture 3 activities.
Download assignment3.tgz and expand it. The directory assignment3 contains:
- File a3.py, similar starter code as for assignment 2. This is based on a sample solution to assignment 1, and implements a random Binary Game player.
- A number of sample pattern files with names such as nopattern.txt, twopattern.txt, fullpattern.txt is provided. These contain different weights to use in your probabilistic playout policy (see below).
- A file assignment3-public-tests.txt with some sample test cases for your program.
- A program a3test.py. Use this code to test your solution with
```
python3 a3test.py a3.py assignment3-public-tests.txt
```
Modify/add code in directory assignment3 only to implement your solution. Your main program needs to remain in a3.py - this is what our test process will call.
You do not need to write a complete player in this assignment. Just implement the commands below.
For more information about concepts related to this assignment, see the following resources:
- Assignments 1 and 2: Binary Game
- Lecture 12 - random simulations, simulations in TicTacToe, basics of Go3 player
- (Lecture 13 - Improving simulations, filtering rules and pattern-based playouts)
- Lecture 14 - Probabilistic Simulation Policies
- Lecture 12 and 14 python codes which include simulation-based players for TicTacToe.
- The Go3 and Go4 programs, and the Go3 documentation.

Goals of the Assignment

Implement a probabilistic playout policy for Binary Game defined by simple pattern weights stored in a file. Use a default weight if no pattern applies in a row or column.

Pattern Weights for Probabilistic Simulations

In probabilistic simulations, different legal moves may have different probabilities of being chosen in simulation. For example, in class we look at Remi Coulom's MM algorithm to find good move weights in the case of Go.

In this assignment we use simpler patterns and weights for Binary Game. These are already given to you in a file. Your job is to implement the probabilistic simulation policy that uses these patterns and weights.

Move patterns and weights

In this assignment, each move has both a row weight and a column weight. Both weights are defined by 5x1 patterns, with a default value when no pattern matches. The weight of each pattern/digit combination is stored in a text file.

A pattern is a sequence of five points. Each point is either 0, or 1, or empty (indicated by .), or off the board (indicated by X). The center point of each pattern must be empty - this is the location of the move that the pattern is for. An example of a pattern that is completely on the board is 10..1. An example of a pattern for a move on the first line is XX..1. Note the two XX which indicate that those locations are outside the board. A pattern is said to match a 5x1 area around a point on the board if the sequence of points is the same, with each X matching a location outside the board. For example, on the board

..1..
10..1

the pattern 10..1 matches horizontally in the middle of the second row. The pattern XX..1 matches horizontally at the top left corner.

Patterns can also be rotated by 90, 180, or 270 degrees, to match columns instead of rows, and to match a mirror image of the pattern on the board.

Example: the pattern X0.1. matches in eight places on the board - all the neighbors of a corner point:

0.1.0
.....
1...1
.....
0.1.0

A pattern file, such as nopattern.txt and twopattern.txt, contains a (possibly empty) list of patterns, moves and weights. When a pattern file is read, do the following:

First, remove all previous patterns (if any) from your program.
Read all the new patterns, line by line, from a file and add them to your program.
Lines starting with a hash mark # are comments and should be discarded when reading.

Each line in a pattern file has the following format:

pattern move weight

For example, the content of twopattern.txt is:

10..1 0 20
10..1 1 12

By the first line, playing a 0 in the middle of the pattern has a weight of 20. (Comment: If that move is played, then afterwards there is a different pattern 100.1 at this location.) By the second line, playing a 1 in the same pattern has a weight of 12. (The result would be 101.1 ).

If there is no matching pattern for a move, then the default weight for that row or column is 10.

We use the same patterns for both rows and columns in all four rotations. The total weight of a move is the sum of its row weight and its column weight. Note that one or both of these weights can be the default (10), if there is no matching pattern in that direction.

Our patterns have been designed in a way such that for each location/digit combination:

In the row, there will be either no pattern, or exactly one pattern, that matches.
In the column as well, there will be either no pattern, or exactly one pattern, that matches.

From Weights to Probabilities

In a probabilistic policy, each move is selected in proportion to its weight. To compute the move probabilities, we need to first compute the sum W of all move weights, then divide the weight of each move by W.

Example: assume we have three legal moves with weights 5, 12 and 3. Then W = 5+12+3 = 20, and the probabilities of choosing these moves are 5/20 = 0.25, 12/20 = 0.6 and 3/20 = 0.15.

Text Commands

You implement two text commands:

```
loadpatterns filename
```
First, delete all existing patterns in your program. Then, load all the new patterns from the file given by filename, and use all these patterns for your policy from now on. When your program first starts up, there should be no patterns loaded.
```
policy_moves
```
This command should print a sorted list of all legal moves and their probabilities, rounded to at most three digits after the decimal point, for the current simulation policy in the format
```
= move1 prob1 move2 prob2 ...moven probn
```
See Details and Examples for policy_moves
All other existing text commands should be left as-is. Especially, game and play are needed to set up test cases.

Testing Procedure

Test your solution with

python3 a3test.py a3.py assignment3-public-tests.txt

The Usual Warnings - Read them All

Please make sure that these details are correct in your assignment 3 submission.
Your file must be a valid tgz file named assignment3.tgz which can be uncompressed with tar xvfz assignment3.tgz
Keep your main file name as a3.py in the assignment3 directory.
Do not introduce extra levels of directories or different directory names. Keep the same simple structure as in the starter code.
You may add extra python files within the same directory.

Pre-submission Test and Submission

Follow the same general steps as in assignment 1 to create your presubmission.log file and your submission, but (of course) using your assignment3 directory, assignment3.tgz as file name, and assignment3-public-tests.txt as test. Remember to include your new presubmission.log and readme.txt.

Hints and Details - Read them All

The Python function random.choices can deal with both uniform random and weight-based probabilistic selection. See prob_select.py for an example.
Use the built-in python round function to round your probabilities to 3 digits of precision, as in round(p, 3).
Weights in our pattern files are integers in the range 1..20
All patterns are written as rows. For a pattern and its mirror image, at most one of them is present in a file. For example, if there is a pattern 00.11, then there will not be a 11.00 pattern in the file. Instead, you try to match the given pattern in all four rotations:
- Original, in the row: 00.11
- 180 degree rotation, in the row: 11.00
- 90 and 270 degree rotations in the column:
```
0
0
.
1
1
```
  and
```
1
1
.
0
0
```
No special error handling code is required in this assignment. We will only test with valid pattern files, using valid commands which setup and test legal Binary Game positions.
Time management: the test script is set to a time limit of 1 second per command. (Our simple sample solution stays far below that limit even on large boards.)
If your team changes from assignment 2, submit the new information using the web form by Oct 28.

Details and Examples for `policy_moves`

Moves should be numerically sorted in increasing order by x-coordinate first, by y-coordinate second, and by digit last. So for example, 0 0 0 before 0 0 1 before 0 1 0 before 1 0 0, and 9 0 0 before 10 0 0.

Example:

We make a 2x2 board and play a move:
game 2 2
play 1 0 1
Then we call policy_moves:
policy_moves
0 0 0 0.25 0 1 0 0.25 0 1 1 0.25 1 1 0 0.25
We then play another move. This time fewer moves are left:
play 1 1 0
policy_moves
0 0 0 0.5 0 1 1 0.5

Examples for `policy_moves`

Here are some explanations for the first few examples in assignment3-public-tests.txt .

The first test is on a 1x1 board, with no patterns. There are two moves, both with probability 0.5. Each move has a row weight and a column weight of 10, which is the default value when no pattern matches. So both moves have a weight of 20, and their probability is 20/(20+20) = 0.5. After a move is played, there is no move left to print with the policy_moves command.
The second test is on a 2x1 board, still with no patterns. There are four moves, all with probability 0.25.
The third test is on an empty 10x10 board. There are 200 moves, all with probability 0.005.
For the next few tests we setup the following 5x2 board with play commands:
```
..1..
10..1
```
There are still no patterns loaded. All 8 legal moves have a probability of 1/8 = 0.125.
Now we run loadpatterns twopattern.txt and run policy_moves again. The probabilities are now affected by the given patterns.
Next, we run loadpatterns nopattern.txt to remove the previous patterns. The probabilities in policy_moves are back to uniform.
Next we run loadpatterns fourpattern.txt to consider a pattern that is closer to the edge.
This time we run loadpatterns sixpattern.txt to illustrate how rotations are also important to generate the policy moves.
The last test runs loadpatterns fullpattern.txt and tests it on an empty 10x10 board. Compare the result to the third test.

Marking

There will again be 5 marks for this assignment.

1 mark for your file presubmission.log which shows the log of your correct test execution, and your readme.txt file (as in Assignment 1).
2 marks for passing our public test cases
2 marks for passing our private test cases

Last modified: Oct 11, 2024 by Abbas Masoumzadeh and Martin Müller

Contents