Maru version 8.0

Maru is a computer Go program developed using deep reinforcement learning from randomly generated game records. Maru's deep learning model incorporates large-kernel depthwise convolutions and multi-head attention, enabling it to efficiently grasp the overall state of the board. Its reinforcement learning procedure is inspired by approaches used in KataGo and Gumbel AlphaZero, allowing it to learn a wide range of patterns efficiently. This page shows the improvement in playing strength through reinforcement learning and the results of self-play matches.

Maru is a sibling program of the computer Shogi program Gokaku. Maru shares the same deep learning model architecture, search algorithm, and reinforcement learning methodology as Gokaku.

If you have any questions, please contact Atsushi Takeda.

Ratings

Self-Play Games

History

2024/09/28: Reinforcement learning of Maru 8.0 has started from game records with random moves.

2024/09/28: Training of the 1st model, which consists of 6 blocks and 96 channels (b5n1c96), has started.

2024/10/09: Training of the 1st model was stopped when the number of generated game records reached 1M.

2024/10/09: Training of the 2nd model, which consists of 12 blocks and 128 channels (b10n2c128), has started.

2024/11/02: Training of the 2nd model was stopped when the number of generated game records reached 2.3M.

2024/11/02: Training of the 3rd model, which consists of 17 blocks and 192 channels (b15n2c192), has started.

2024/12/14: The CGOS rating of the 1st model at 500 visits reached 2464.

2024/12/28: Training of the 3rd model was stopped when the number of generated game records reached 3.6M.

2024/12/28: Training of the 4th model, which consists of 18 blocks and 192 channels (b15n3c192), has started.

2025/01/07: The CGOS rating of the 2nd model at 500 visits reached 2934.

2025/01/07: Training of the 4th model was stopped when the number of generated game records reached 4M.

2025/01/11: Training of the 5th model, which consists of 20 blocks and 256 channels (b16n4c256), has started.

2025/02/14: Training of the 5th model was stopped when the number of generated game records reached 4.8M.

2025/02/18: The CGOS rating of the 3rd model at 500 visits reached 3157.

2025/02/18: The CGOS rating of the 4th model at 500 visits reached 3302.

2025/03/06: Training of the 6th model, which consists of 24 blocks and 256 channels (b20n4c256), has started.

2025/03/21: Training of the 6th model was stopped when the number of generated game records reached 5M.

2025/03/25: The CGOS rating of the 5th model at 500 visits reached 3442.

2025/03/28: Training of the 7th model, which consists of 25 blocks and 320 channels (b20n5c320), has started.
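
The model labels in the history above appear to follow the pattern b<X>n<Y>c<Z>, where X + Y equals the stated block count and Z equals the stated channel count. This reading is inferred from the entries themselves and is not documented on this page; what the b and n groups stand for individually is not stated. The Python sketch below is only an illustration under that assumption: it checks the pattern against every entry and lists the CGOS rating gain between consecutive rated models. The names MODELS, parse_label, label_matches, and rating_gains are hypothetical helpers, not part of Maru.

```python
import re

# Entries copied from the history above:
# (label, stated blocks, stated channels, CGOS rating at 500 visits).
# The rating is None where the history lists no rating for that model.
MODELS = [
    ("b5n1c96", 6, 96, 2464),
    ("b10n2c128", 12, 128, 2934),
    ("b15n2c192", 17, 192, 3157),
    ("b15n3c192", 18, 192, 3302),
    ("b16n4c256", 20, 256, 3442),
    ("b20n4c256", 24, 256, None),
    ("b20n5c320", 25, 320, None),
]

LABEL = re.compile(r"b(\d+)n(\d+)c(\d+)")


def parse_label(label):
    """Split a label such as 'b10n2c128' into its three integers (b, n, c)."""
    match = LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"unexpected label: {label!r}")
    b, n, c = (int(group) for group in match.groups())
    return b, n, c


def label_matches(label, blocks, channels):
    """Assumed reading: b + n is the block count and c is the channel count."""
    b, n, c = parse_label(label)
    return b + n == blocks and c == channels


def rating_gains(models):
    """Rating difference between consecutive models that both have a rating."""
    rated = [rating for _, _, _, rating in models if rating is not None]
    return [later - earlier for earlier, later in zip(rated, rated[1:])]


if __name__ == "__main__":
    for label, blocks, channels, _ in MODELS:
        print(label, label_matches(label, blocks, channels))
    print(rating_gains(MODELS))
```

Under this reading, all seven labels agree with the block and channel counts stated in the history, and each successive rated model is stronger than the previous one at 500 visits.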

Copyright (C) 2024 Atsushi TAKEDA