Gokaku version 2.1

Gokaku is a computer Shogi program developed using deep reinforcement learning from randomly generated game records. The deep learning model of Gokaku incorporates large-kernel depthwise convolutions and multi-head attention, enabling it to efficiently grasp the overall state of the board. Its reinforcement learning procedure is inspired by approaches used in KataGo and Gumbel AlphaZero, allowing it to learn a wide range of patterns efficiently. This page shows the improvement in playing strength through reinforcement learning and the results of self-play matches. The rates of opening tactics are shown here (in Japanese).
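As a rough illustration of the block structure described above, the following is a minimal sketch, not Gokaku's actual code, of a network block that combines a large-kernel depthwise convolution with multi-head self-attention over the 9x9 board. It is written in PyTorch; the channel count, kernel size, and number of attention heads are illustrative assumptions.

    # Minimal sketch (assumed, not Gokaku's implementation) of one block
    # combining a large-kernel depthwise convolution with multi-head attention.
    import torch
    import torch.nn as nn

    class ConvAttentionBlock(nn.Module):
        def __init__(self, channels: int = 256, kernel_size: int = 7, num_heads: int = 8):
            super().__init__()
            # Large-kernel depthwise convolution: each channel is filtered
            # independently, capturing wide spatial context cheaply.
            self.depthwise = nn.Conv2d(
                channels, channels, kernel_size,
                padding=kernel_size // 2, groups=channels,
            )
            self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)
            self.norm1 = nn.LayerNorm(channels)
            # Multi-head self-attention over the 81 board squares.
            self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
            self.norm2 = nn.LayerNorm(channels)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, channels, 9, 9)
            x = x + self.pointwise(self.depthwise(x))
            b, c, h, w = x.shape
            # Flatten the board into a sequence of squares for attention.
            seq = self.norm1(x.flatten(2).transpose(1, 2))   # (batch, 81, channels)
            attn_out, _ = self.attn(seq, seq, seq)
            seq = self.norm2(seq + attn_out)
            return seq.transpose(1, 2).reshape(b, c, h, w)

    # Example: a stack matching the "4 blocks and 256 channels" configuration
    # mentioned in the history below.
    blocks = nn.Sequential(*[ConvAttentionBlock(256) for _ in range(4)])
    out = blocks(torch.randn(1, 256, 9, 9))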

Gokaku is a sibling program of the computer Go program Maru. Gokaku shares the same deep learning model architecture, search algorithm, and reinforcement learning methodology as Maru.

This page shows the changes in playing strength during the reinforcement learning process of the latest version, Gokaku version 2.1. The Elo rating is calculated by playing matches against 60 models of nearby generations. The Elo ratings of other baseline programs are calculated based on their match results against Gokaku (these are not objective indicators of each program’s actual strength).
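The page does not specify the exact fitting procedure, but the sketch below shows one simple way such Elo ratings can be estimated from pairwise match results, assuming the standard logistic Elo model in which the expected score of A against B is 1 / (1 + 10^((R_B - R_A) / 400)). The function and player names are hypothetical.

    # Minimal sketch (assumed, not the actual tooling) of fitting Elo ratings
    # from match results under the standard logistic Elo model.
    def expected(r_a: float, r_b: float) -> float:
        """Expected score of player a against player b."""
        return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

    def fit_elo(results, players, iterations=200, k=8.0):
        """results: list of (a, b, score_a) with score_a in {0, 0.5, 1}."""
        ratings = {p: 0.0 for p in players}
        for _ in range(iterations):
            for a, b, score_a in results:
                e = expected(ratings[a], ratings[b])
                ratings[a] += k * (score_a - e)
                ratings[b] -= k * (score_a - e)
        # Anchor the scale so the first listed player sits at 0 Elo.
        anchor = ratings[players[0]]
        return {p: r - anchor for p, r in ratings.items()}

    # Hypothetical example: model_10 beats model_9 twice and draws once.
    games = [("model_10", "model_9", 1.0), ("model_10", "model_9", 1.0),
             ("model_10", "model_9", 0.5)]
    print(fit_elo(games, ["model_9", "model_10"]))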

The progress of reinforcement learning for the previous version, Gokaku version 2.0, can be found here. If you have any questions, please contact Atsushi Takeda.

Ratings

Self-Play Games

History

2025/08/16: Reinforcement learning of Gokaku 2.1 has started from randomly generated game records.

2025/08/16: Training of the 1st model, which consists of 4 blocks and 256 channels, has started.

2025/08/27: Training of the 1st model was stopped when the number of generated game records reached 2.5M.

2025/08/27: Training of the 2nd model, which consists of 8 blocks and 384 channels, has started.

2025/09/16: Training of the 2nd model was stopped when the number of generated game records reached 4M.

2025/09/16: Training of the 3rd model, which consists of 12 blocks and 512 channels, has started.

2025/10/24: Training of the 3rd model was stopped when the number of generated game records reached 5.5M.

2025/10/24: Training of the 4th model, which consists of 20 blocks and 768 channels, has started.

Copyright (C) 2025 Atsushi TAKEDA