All papers/articles are on eClass. See the button "Readings page - local copies and links for papers" near the top of the main eClass page.
Read around Lecture 1, before Quiz 2
Read around Lecture 1
Do around Lecture 1, and later review as needed.
Review before Lecture 2
python3 go2d.py
Do around Lecture 2
Download go.tgz from the Python code page and unpack it. See First Go programs - Go0 and Go1 for instructions and explanations.
Do around Lecture 2
Get familiar with the Go0 and Go1 program code which you downloaded in Activity 2b. Study the data structure for the Go board, and the algorithms for playing a move, checking legality, and capturing stones.
Do around Lecture 2, finish before Lecture 3
Download and install the GoGui user interface. You can use it for Go and other games that use similar game pieces.
Do after Lecture 2
In class we discussed the code in board.py which implements a floodfill (DFS) to find blocks. Is this efficient? Can you think of a faster way?
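One direction to consider (my suggestion, not from the lecture): blocks can be maintained incrementally with a union-find (disjoint set) structure instead of being recomputed by floodfill after every move. A minimal sketch:

    class UnionFind:
        def __init__(self, n):
            self.parent = list(range(n))

        def find(self, x):
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]  # path halving
                x = self.parent[x]
            return x

        def union(self, x, y):
            self.parent[self.find(x)] = self.find(y)

When a stone is played, union its point with each same-colored neighbor; finding a stone's block representative is then nearly constant time. Note that captures complicate this idea, since union-find does not support efficient splitting.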
Read before Lecture 3: Heingartner, Maybe We Should Leave That Up to the Computer
(Activities 3a and 3b are not used this term)
Do after Lecture 2. See Assignment 1 for details.
Do before Lecture 4:
Play a game of Go against a colleague, against the Go1 program, or against an online opponent. See resources - Go.
Do before Lecture 4:
Analyze the algorithm to identify captured stones, implemented in go/board.py in the functions _detect_and_process_capture and _has_liberty. Is it efficient? Can it be improved?
Watch before Lecture 4.
Do around Lecture 4:
Get familiar with the gogui-regress tool for automated testing. See Regression testing with gogui-regress.
Watch around Lecture 4.
Read around Lecture 5: O'Neil, How algorithms rule our working lives.
Do around Lecture 5:
Given a game tree with constant branching factor b=10 and constant depth d=8. What is the total size of the tree? First, use the exact formula. Then, use the simple approximation b^d. How large is the approximation error, in percent of the exact value?
Do around Lecture 5:
Same as activity 5a, but with b=8 and d=10. Which tree is bigger, the one in 5a or this one? How much bigger?
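To check your answers for 5a and 5b, both quantities can be computed directly; a small sketch:

    def tree_size(b, d):
        # Exact total: 1 + b + b^2 + ... + b^d = (b^(d+1) - 1) / (b - 1)
        return (b ** (d + 1) - 1) // (b - 1)

    for b, d in [(10, 8), (8, 10)]:
        exact = tree_size(b, d)
        approx = b ** d
        print(f"b={b}, d={d}: exact={exact}, b^d={approx}, "
              f"error={100 * (exact - approx) / exact:.2f}%")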
Do around Lecture 5:
Given a DAG with constant branching factor b=10 and constant depth d=8. Let the root be at level 0. Assume that for nodes n at levels 2 and higher, there are two different ways to reach n from the previous level. This means that two different parent nodes p1 and p2 on the previous level both have a move that leads to n.
What is the number of nodes on level 8 of the DAG?
What is the total size of this DAG? How many times smaller is it than the tree from 5a?
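The level counts of this DAG can also be generated programmatically; a sketch that follows the description above:

    b, d = 10, 8
    levels = [1, b]  # levels 0 and 1 are as in the tree
    for k in range(2, d + 1):
        # Two parents per node: each level has b/2 times as many
        # nodes as the previous one.
        levels.append(levels[-1] * b // 2)
    print("nodes on level", d, ":", levels[d])
    print("total DAG size:", sum(levels))
    print("tree / DAG ratio:", ((b ** (d + 1) - 1) // (b - 1)) / sum(levels))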
Do after Lecture 5:
See the description of Activity 5d in the Lecture 5 slides.
Do after Lecture 5:
Compute the effective branching factor b_d, as explained in the Lecture 5 slides, for all levels of the Tic-Tac-Toe DAG from Activity 5d.
Do around Lecture 6:
Use profile_Go1.py in your go0and1 folder to profile Go1. Try out some changes/optimizations and profile it again.
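If you want to profile other code in the same way, Python's built-in cProfile and pstats modules are enough; a minimal sketch, where the profiled function is just a placeholder:

    import cProfile
    import pstats

    def work():  # placeholder for the code you want to profile
        sum(i * i for i in range(10 ** 6))

    cProfile.run("work()", "profile.out")
    stats = pstats.Stats("profile.out")
    stats.sort_stats("cumulative").print_stats(10)  # ten most expensive calls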
Do after Lecture 6:
Using Activity 6a as a starting point, now profile your own program from the assignment. Note that this activity is not required for Assignment 1. However, it will help you prepare for the upcoming Assignment 2.
Read around Lecture 7: Greenemeier, 20 Years after Deep Blue. Scientific American, 2017.
This is an optional activity and extends beyond the material of the course. For example, you can review the slides on Blind Search Extras in the resources.
Do around Lecture 7:
Re-run the sample code blind_search_on_tree.py and change the parameters b and d. Under which conditions does random sampling do well, or poorly?
Do around Lecture 7:
The sample code blind_search_on_tree.py uses 1000 tries. Change this value and re-run. How does the number of successful runs change with this limit?
This is an optional activity and is more challenging than the usual activities.
Analyze the expected number of steps of random sampling as discussed in the slides. Read and follow the hints there.
This is an optional activity and is more challenging than the usual activities.
Implement the version of random sampling without repetition as discussed in the slides. Measure the performance of this new code in the framework from blind_search_on_tree.py.
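A minimal sketch of one way to sample without repetition, assuming the leaves are simply numbered 0 .. b^d - 1 (the names and interface here are illustrative, not those of blind_search_on_tree.py):

    import random

    def sample_without_repetition(num_leaves, is_goal, limit):
        # Test leaves in random order, never testing the same leaf twice.
        order = random.sample(range(num_leaves), min(limit, num_leaves))
        for tries, leaf in enumerate(order, start=1):
            if is_goal(leaf):
                return tries  # number of tries until success
        return None  # no goal found within the limit

    b, d = 10, 3
    goal = random.randrange(b ** d)  # a single random goal leaf
    print(sample_without_repetition(b ** d, lambda leaf: leaf == goal, 1000))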
Read around Lecture 8: Schaeffer et al, Checkers is solved. Science, 2007.
Do after Lecture 8:
Given a uniform (b,d) tree, where the root is an OR node. Assume we can win. Find a closed formula for the total number of nodes in the proof tree in the best case. Hint 1: the node counts per level are 1, 1, b, b, b^2, b^2, b^3, b^3, ... Hint 2: consider the difference between even and odd depths d.
Do after Lecture 8:
In the same setting as Activity 8a, now find the best-case size of the disproof tree if the root is a loss. Hint 1: the node counts per level are now 1, b, b, b^2, b^2, b^3, b^3, ... Hint 2: there is a very simple relation between the solution of the previous problem and this one. Find and use it.
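To check your closed formulas for 8a and 8b numerically, you can sum the per-level counts from the hints directly; a small sketch:

    def proof_tree_size(b, d):
        # Level counts 1, 1, b, b, b^2, b^2, ...: b^(k//2) at level k.
        return sum(b ** (k // 2) for k in range(d + 1))

    def disproof_tree_size(b, d):
        # Level counts 1, b, b, b^2, b^2, ...: b^((k+1)//2) at level k.
        return sum(b ** ((k + 1) // 2) for k in range(d + 1))

    for b, d in [(2, 4), (3, 5), (10, 8)]:
        print(b, d, proof_tree_size(b, d), disproof_tree_size(b, d))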
Do after Lecture 8:
Run the solvers in boolean_minimax_test_tictactoe.py and boolean_negamax_test_tictactoe.py. Set up other opening positions and solve them.
Do after Lecture 9:
Given a board of size n times n, a komi k, and a player color. What is the minimum number of points that the player needs to win the game?
Do after Lecture 9:
Implement your solution from 9a in a Python function.
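A possible sketch for 9b, under assumptions that may differ from the rules used in the lecture: area scoring, all n^2 points divided between the two players, komi added to White's score, and a strictly positive score difference needed to win.

    import math

    def min_points_to_win(n, komi, color):
        # Sketch assuming area scoring: all n*n points are split between
        # the players, and komi is added to White's score. A player wins
        # if own_points - opponent_points (+/- komi) > 0.
        total = n * n
        if color == "black":
            threshold = (total + komi) / 2  # black > (n^2 + komi) / 2
        else:
            threshold = (total - komi) / 2  # white > (n^2 - komi) / 2
        return math.floor(threshold) + 1  # smallest integer above threshold

    print(min_points_to_win(3, 6.5, "black"))  # 8 on a 3x3 board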
Do after Lecture 10:
See discussion in Lecture 10 slides.
Do after Lecture 10:
Add a case in alphabeta_tictactoe_test.py to solve TicTacToe with your new heuristic. What is the speed difference?
Do after Lecture 10:
Modify naive_negamax.py and alphabeta.py to return the number of nodes searched as well as the search result. Then run the experiments in alphabeta_tictactoe_test.py with your new instrumented code. What is the tradeoff of adding the heuristic from 10b, in terms of nodes searched vs. time per node?
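The counting pattern itself can be sketched independently of the course files; here is a toy negamax on a uniform tree with random leaf values that returns both value and node count (illustration only, not the code from naive_negamax.py or alphabeta.py):

    import random

    def negamax_counted(depth, b):
        # Toy negamax on a uniform (b, depth) tree with random leaf values.
        # It returns (value, nodes_searched); instrumenting the course code
        # follows the same pattern.
        if depth == 0:
            return random.choice([-1, 1]), 1
        best, nodes = float("-inf"), 1
        for _ in range(b):
            value, child_nodes = negamax_counted(depth - 1, b)
            nodes += child_nodes
            best = max(best, -value)
        return best, nodes

    print(negamax_counted(4, 3))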
Read before Lecture 11: Van der Werf et al, Solving Go on small boards.
Read after Lecture 12: Computing Elo Ratings of Move Patterns in the Game of Go. Rémi Coulom, Computer Games Workshop 2007, Amsterdam.
Do after Lecture 12:
Run estimate_pi.py to estimate the value of pi using simulation. Increase the number of simulations and see if the value becomes more accurate.
Implement a shape different from a quarter circle, for example a triangle or a quarter ellipse, and re-run the code. Compare with the exact area computed using math.
Then, estimate the volume of a unit ball in 3d centered at (0,0,0) (formula x^2 + y^2 + z^2 < 1) and compare with the exact volume.
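A sketch of how such estimates can look (it assumes nothing about the structure of estimate_pi.py):

    import random
    import math

    def estimate_area_2d(inside, n=10 ** 6):
        # Fraction of random points in the unit square inside the shape.
        hits = sum(inside(random.random(), random.random()) for _ in range(n))
        return hits / n

    # Triangle below the diagonal y < x; exact area is 1/2.
    print(estimate_area_2d(lambda x, y: y < x), "vs", 0.5)

    def estimate_ball_volume(n=10 ** 6):
        # Sample one octant of the unit ball and multiply by 8.
        hits = sum(random.random() ** 2 + random.random() ** 2
                   + random.random() ** 2 < 1 for _ in range(n))
        return 8 * hits / n

    print(estimate_ball_volume(), "vs", 4 / 3 * math.pi)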
Do after Lecture 12:
Re-run the experiment Sim(1000) vs Perfect TicTacToe player with more games. Does Sim(1000) lose any games?
Do after Lecture 12:
Read the simulation code in tic_tac_toe4.py and tic_tac_toe_simulation_test.py. Run tic_tac_toe_simulation_test.py and vary the number of simulations in the call to randomTTT at the end.
Do after Lecture 12:
Read the simulation code for Go in Go3 and Go4, especially the function play_game in board_util.py and the move generation functions called by it.
Write a perfect player with simulation-based tiebreaking, as discussed in the Lecture 12 slides. Run the experiments from those slides with this player to see if it performs better against a random player.
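For reference, here is one self-contained way such a player could look for TicTacToe, with the board as a string of "x", "o" and "."; this is a sketch of the idea, not the interfaces of Go3/Go4 or tic_tac_toe4.py:

    import random

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] != "." and board[a] == board[b] == board[c]:
                return board[a]
        return None

    def legal_moves(board):
        return [i for i, p in enumerate(board) if p == "."]

    def negamax(board, to_play):
        # Exact value for to_play: 1 win, 0 draw, -1 loss.
        if winner(board) is not None:
            return -1  # the previous player just completed a line
        moves = legal_moves(board)
        if not moves:
            return 0
        other = "o" if to_play == "x" else "x"
        return max(-negamax(board[:i] + to_play + board[i + 1:], other)
                   for i in moves)

    def simulate(board, to_play, me):
        # One uniformly random playout; 1.0 win for me, 0.5 draw, 0.0 loss.
        while True:
            w = winner(board)
            if w is not None:
                return 1.0 if w == me else 0.0
            moves = legal_moves(board)
            if not moves:
                return 0.5
            i = random.choice(moves)
            board = board[:i] + to_play + board[i + 1:]
            to_play = "o" if to_play == "x" else "x"

    def genmove(board, to_play, n_sim=50):
        # Solve each move exactly; break ties by simulation winrate.
        other = "o" if to_play == "x" else "x"
        scored = []
        for i in legal_moves(board):
            child = board[:i] + to_play + board[i + 1:]
            value = -negamax(child, other)
            rate = sum(simulate(child, other, to_play)
                       for _ in range(n_sim)) / n_sim
            scored.append((value, rate, i))
        return max(scored)[2]

    print(genmove("." * 9, "x"))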
No Readings or Activities.
Do after Lecture 14
Using the code in bernoulli.py, experiment with different settings for p and limit. Study how many simulations you need to get different levels of accuracy.
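The underlying question can be illustrated independently of bernoulli.py's exact interface; a minimal sketch:

    import random

    # How accurately does the empirical winrate estimate p for
    # different numbers of simulations?
    p = 0.4
    for limit in (10, 100, 1000, 10000, 100000):
        estimate = sum(random.random() < p for _ in range(limit)) / limit
        print(f"{limit:>6} simulations: estimate {estimate:.4f}, "
              f"error {abs(estimate - p):.4f}")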
Do after Lecture 14
Play several rounds of mystery-bernoulli.py. How close is your guess? How many simulations did you choose?
Program your own simulation-based game. It works similarly to Go3, but you choose the winrates of the different moves randomly.
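A minimal sketch of such a game: each move gets a hidden, randomly chosen winrate, and one simulation of a move is a single Bernoulli trial with that winrate (names here are illustrative):

    import random

    num_moves = 5
    winrates = [random.random() for _ in range(num_moves)]  # hidden

    def simulate(move):
        # One simulation of 'move': win with the move's hidden probability.
        return random.random() < winrates[move]

    def choose_move(n_sim=100):
        estimates = [sum(simulate(m) for _ in range(n_sim)) / n_sim
                     for m in range(num_moves)]
        return max(range(num_moves), key=lambda m: estimates[m])

    best = max(range(num_moves), key=lambda m: winrates[m])
    print("chosen:", choose_move(), "best:", best)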
Read before Lecture 15:
Parts of Browne et al, A Survey of Monte Carlo Tree Search Methods, as follows:
No Readings or Activities.
Read before Lecture 17: A Few Useful Things to Know about Machine Learning. Pedro Domingos, Communications of the ACM 55 (10), October 2012, Pages 78-87.
No Readings or Activities.
Watch before Lecture 19:
Watch this series of animations about neural networks by Grant Sanderson of 3Blue1Brown. This is a little longer than the usual activities (10-21 minutes per video), but I believe that these videos are an excellent resource.
Do around Lecture 19:
Explore the demos (briefly shown in class) on Michael Nielsen's page A visual proof that neural nets can compute any function. Understand in detail the intuition behind the "visual proof".
Read before Lecture 20:
Maddison, Huang, Sutskever and Silver, Move Evaluation in Go using Deep Convolutional Neural Networks. CoRR abs/1412.6564.
Optional additional reading: Clark and Storkey, Training Deep Convolutional Neural Networks to Play Go. ICML 2015.
Watch before Lecture 21:
Rich Sutton, RL tutorial
Read before Lecture 22:
Read Silver, Huang et al, Mastering the game of Go with deep neural networks and tree search. Nature 529, 484-489 (28 January 2016). The "first AlphaGo" paper.
Read around Lecture 23:
Silver, Schrittwieser et al, Mastering the game of Go without human knowledge. Nature 550, 354-359. The "AlphaGo Zero" paper.
Silver, Hubert et al, A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. The "Alpha Zero" paper.
Optional reading: Supplementary Materials for this paper.
Optional reading for Lecture 24: Schrittwieser, Antonoglou, Hubert et al., Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588, 604-609 (2020).
Last reviewed: Sep 4, 2023, Martin Müller