OpenSpiel Versions

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

v1.4

5 months ago

This release adds new games, bug fixes, and process + support changes.

Support and Process changes

Added (partial) Python 3.12 support: building & testing of wheels only (also distributed on PyPI)
Added building and availability of Apple Silicon (arm64) binary wheels on PyPI
Removed support for Python 3.7 due to EOL June 2023
Upgraded versions of supported extra packages (JAX, PyTorch, TF, etc.)
Added ml-collections as a required dependency

Games

Added Dots & Boxes
Added Chat Game (python)
Added MFG crowd avoidance game
Added MFG periodic aversion game
Modified Predator-Prey MFG to take in initial values
Added a partial implementation of Yacht (not yet complete)

Algorithms

Removed PyTorch NFSP (see https://github.com/google-deepmind/open_spiel/issues/1008)
Removed unnecessary policy reload in outcome sampling MCCFR (see #1115)
Rewrote Stackelberg equilibrium solver using cvxpy (see #1123)

Examples

Training TD n-tuple networks on 2048

Improvements and other additions

Added a build_state_from_history_string helper function for debugging
GAMUT generator: expand the set of games provided by the wrapper
Add exclude list in game simulation tests for games that are partially complete
Refactored all games into individual directories
Changed 2048 to exclude moves that don't change the board from legal actions
Introduced number of tricks and changed the order of information in bridge observations (see #1118)
Added missing functions for C++-wrapped TabularPolicy to pybind11
Added missing functions for CorrDevBuilder and C(C)E*Dist to pybind11
Added more examples to help debug game implementations
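
The 2048 change in the list above (moves that leave the board unchanged are no longer legal) can be illustrated with a small, self-contained Python sketch. This is not OpenSpiel's implementation: slide_row_left, apply_move, and legal_actions are hypothetical helper names, and the board is assumed to be a plain list of lists with 0 marking an empty cell.

```python
# Illustrative sketch only; not OpenSpiel's 2048 implementation.
# All helper names below are hypothetical.

def slide_row_left(row):
    """Slide and merge one row to the left, 2048-style."""
    tiles = [t for t in row if t != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)  # a tile merges at most once per move
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


def apply_move(board, move):
    """Return a new board after sliding 'left', 'right', 'up' or 'down'."""
    if move == "left":
        return [slide_row_left(r) for r in board]
    if move == "right":
        return [slide_row_left(r[::-1])[::-1] for r in board]
    # Vertical moves: transpose, slide horizontally, transpose back.
    transposed = [list(col) for col in zip(*board)]
    horizontal = "left" if move == "up" else "right"
    moved = apply_move(transposed, horizontal)
    return [list(col) for col in zip(*moved)]


def legal_actions(board):
    """Keep only the moves that actually change the board."""
    moves = ("up", "right", "down", "left")
    return [m for m in moves if apply_move(board, m) != board]


board = [
    [2, 0, 0, 0],
    [2, 0, 0, 0],
    [4, 0, 0, 0],
    [8, 0, 0, 0],
]
# Every tile already sits against the left edge, so "left" is filtered out.
print(legal_actions(board))
```

Filtering on "does the board change" rather than on geometry keeps the rule independent of board size and merge details.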

Fixes

Backgammon: added the dice to the end of the observation vector
Fixed uses of functions deprecated in NumPy 1.25
Fixed float comparisons in playthroughs to default to 6 decimal places
Fixed bug in entropy schedule in R-NaD (see #1076)
Fixed bug in rho value (see #968)
Fixed the actions in the game definition of Liar's poker (see #1127)
Fixed castling bug in chess (see #1125)
Corrected include statements for efg_game (causing C++ DQN to not build)

Several other miscellaneous fixes and improvements.

Acknowledgments

Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.3

10 months ago

This release adds several games and algorithms, improvements, bug fixes, and documentation updates.

Support and Process changes

Added Python 3.11 support
Added Roshambo bot population to wheels
Removed Python 3.6 support
Upgraded versions of supported extra packages (OR-Tools, abseil, JAX, TF, PyTorch, etc.)

Games

Bach or Stravinsky matrix game
Block Dominoes (python)
Crazy Eights
Dou Dizhu
Liar's poker (python)
MAEDN (Mensch Ärgere Dich Nicht)
Nine Men's Morris

Game Transforms

Add Noisy utility to leaves game transform
Add Zero-sum game transform
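
As a rough illustration of what a zero-sum transform does, one common approach is to subtract the mean of the players' utilities from each player's utility, so that the transformed utilities always sum to zero. The sketch below assumes that approach; it is plain Python, not the OpenSpiel transform itself, and zero_sum_utilities is a hypothetical name.

```python
# Illustrative sketch only; assumes the transform subtracts the mean utility.
# zero_sum_utilities is a hypothetical helper, not an OpenSpiel function.

def zero_sum_utilities(utilities):
    """Shift per-player utilities so that they sum to zero."""
    mean = sum(utilities) / len(utilities)
    return [u - mean for u in utilities]


# Two players with general-sum payoffs 3 and 1: the mean is 2.
print(zero_sum_utilities([3.0, 1.0]))  # [1.0, -1.0]
```

Subtracting the same constant from every player preserves the differences between players' utilities at each outcome.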

Other environments

Arcade Learning Environment (ALE)

Algorithms

Boltzmann Policy Iteration (for mean-field games)
Correlated Q-learning
Information State MCTS, Cowling et al. '12 (Python)
LOLA and LOLA-DiCE (Foerster, Chen, Al-Shedivat, et al. '18) and Opponent Shaping (JAX)
MIP Nash solver (Sandholm, Gilpin, and Conitzer '05)
Proximal Policy Optimization (PPO); adapted from CleanRL. Supports single-agent use case, tested on ALE.
Regret-matching (Hart & Mas-Colell '00) for normal-form games and as a PSROv2 meta-solver
Regularized Nash Dynamics (R-NaD), Perolat & de Vylder et al. '22, Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
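
For readers unfamiliar with regret matching from the list above, here is a minimal self-play sketch on rock-paper-scissors. It is a sampled, standard-library-only illustration of the general technique, not the OpenSpiel implementation; PAYOFF, strategy_from_regrets, and regret_matching are hypothetical names chosen for this example.

```python
import random

# Illustrative sketch only; not OpenSpiel's regret-matching code.
# PAYOFF[a][b] is the payoff for playing a against b (0=rock, 1=paper,
# 2=scissors). The game is symmetric, so both players can use this table.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play in proportion to positive cumulative regret, else uniformly."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def regret_matching(iterations, seed=0):
    """Run sampled self-play and return each player's average strategy."""
    rng = random.Random(seed)
    n = len(PAYOFF)
    regrets = [[0.0] * n for _ in range(2)]
    strategy_sums = [[0.0] * n for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        actions = [rng.choices(range(n), weights=s)[0] for s in strategies]
        for player in range(2):
            opponent_action = actions[1 - player]
            realized = PAYOFF[actions[player]][opponent_action]
            for a in range(n):
                # Regret: what action a would have earned minus what we got.
                regrets[player][a] += PAYOFF[a][opponent_action] - realized
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = regret_matching(20000)
print(average_strategies)
```

In two-player zero-sum games the average strategies (not the final ones) are the quantities expected to approach equilibrium play, which is why the sketch accumulates strategy_sums.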

Bots

Simple heuristic Gin Rummy bot
Roshambo bot population (see python/examples/roshambo_bot_population.py)

Examples

Opponent shaping on iterated matrix games example
Roshambo population example
Using Nash bargaining solution for negotiation example

Improvements and other additions

Add Bot::Clone() method for cloning bots
Avoid relying on C++ exceptions for playthrough tests
Add support for the Agent-vs-Task case in Nash averaging
Add scoring variants to the game Oh Hell
Add eligibility traces in C++ Q-learning and SARSA
Allow creation of per-player random policies
Support simultaneous move games in policy aggregator and exploitability
Support UCIBot via pybind11
Add single_tensor observer for all games
Add used_indices for non-marginal solvers in PSROv2
Add Flat Dirichlet random policy sampling
Add several options to bargaining game: probabilistic ending, max turns, discounted utilities
Add lambda returns support to JAX policy gradient
Several improvements to Gambit EFG parser / support
Add support for softmax policies in fictitious play
Add temperature parameter to fixed point MFG algorithms
Add information state tensor to battleship
Add option to tabular BR to return maximum entropy BR
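
The flat Dirichlet policy sampling mentioned in the list above amounts to drawing each state's action probabilities uniformly from the probability simplex. A minimal standard-library sketch follows; it relies on the fact that normalized i.i.d. Exponential(1) draws are distributed as Dirichlet(1, ..., 1). The names sample_flat_dirichlet and random_policy, and the dictionary-based policy format, are assumptions made for this example and not OpenSpiel's API.

```python
import random

# Illustrative sketch only; helper names and the policy format are hypothetical.

def sample_flat_dirichlet(num_actions, rng):
    """Draw one point uniformly from the simplex, i.e. from Dirichlet(1,...,1)."""
    draws = [rng.expovariate(1.0) for _ in range(num_actions)]
    total = sum(draws)
    return [d / total for d in draws]


def random_policy(legal_actions_by_state, seed=0):
    """Map each state to a randomly sampled distribution over its legal actions."""
    rng = random.Random(seed)
    policy = {}
    for state, actions in legal_actions_by_state.items():
        probs = sample_flat_dirichlet(len(actions), rng)
        policy[state] = dict(zip(actions, probs))
    return policy


print(random_policy({"s0": [0, 1, 2], "s1": [4, 7]}))
```

Sampling this way covers the whole simplex evenly, unlike normalizing independent uniform draws, which concentrates mass near the center.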

Fixes

Fix UCIBot compilation in Windows
Misc fixes to Nash averaging
RNaD: fix MLP torso in final layer
Dark hex observation (max length)
Fix max game length in abstracted poker games
Fix legal moves in some ACPC (poker) game cases
Fix joint policy aggregator
Fix non-uniform chance outcome sampling in Deep CFR (TF2 & PyTorch)
Fix randomization bug in alpha_zero_torch

Several other miscellaneous fixes and improvements.

Known issues

There are a few known issues that will be fixed in the coming months.

Collision with pybind11 and version in C++ LibTorch AlphaZero. See #966.
PyTorch NFSP convergence issue. See #1008.

Acknowledgments

Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.2

1 year ago

This release adds several games and algorithms, improvements, bug fixes, and documentation updates.

Support and Process changes

Upgrade support for newer versions of dependencies
Add dependency to pybind11_abseil

Games

2048
Checkers
Dynamic routing game
Euchre
Mancala
Nim
Phantom Go

Algorithms

Asymmetric Q-learning
Magnetic Mirror Descent (MMD)
NeuRD (PyTorch)
Policy gradients (JAX)
Sample-based NeuRD loss (PyTorch)
Stackelberg solver
WoLF-PHC

Improvements and other additions

Blackjack: add observation tensor
C++ DQN: in-memory target net, saving + loading of model
Core API reference
Remove hard-coded inclusion of Hanabi and ACPC in setup.py

Fixes

Colored Trails: fix max utility
MCTS handling of chance nodes: properly handle them not just at the root
Nash averaging optimization fix
Othello: fix the max game length
Policy aggregator: shallow copy -> deep copy
pybind11: change game references to shared pointers

Several other miscellaneous fixes and improvements.

Acknowledgments

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.1.1

1 year ago

This release adds several algorithms and games, and several process changes.

Support, APIs, and Process changes

Removed support for Python 3.6
Added support for Python 3.10
Upgrade support for newer versions of dependencies
Rust API: add support for loading bots
CI tests: added MacOS-11, MacOS-12, and Ubuntu 22.04. Removed CI tests for Ubuntu 18.04.

Games

Colored Trails
Dynamic Routing Game: added Sioux Falls network
Mancala (Kalah)
Multi-issue Bargaining
Repeated game transform: add info state strings & tensors, utility sum, finite recall
Sheriff: add info state tensor

Algorithms

Boltzmann DQN
Boltzmann Q-learning
Correlated Q-learning (Greenwald & Hall)
Deep average network for FP (mean-field games)
Nash averaging (Balduzzi et al.)
Nash Q-learning (Hu & Wellman)
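
The "Boltzmann" variants in the list above differ from epsilon-greedy methods in how they explore: actions are sampled from a softmax over Q-values, controlled by a temperature. The sketch below shows only that action-selection step in plain Python. It is an illustration under that assumption, not the OpenSpiel agents, and boltzmann_probabilities and sample_action are hypothetical names.

```python
import math
import random

# Illustrative sketch only; not the OpenSpiel Boltzmann DQN / Q-learning code.

def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values; lower temperature means greedier behavior."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    max_q = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs)[0]


q = [1.0, 2.0, 3.0]
print(boltzmann_probabilities(q, temperature=1.0))
print(sample_action(q, temperature=0.1, rng=random.Random(0)))
```

A high temperature pushes the distribution toward uniform, while a temperature near zero approaches greedy selection.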

Improvements and other additions

Example: support mean-field games
File wrapper: expose to Python and add WriteContents
Nash bargaining score example

Fixes

VR-MCCFR with nonzero baselines
PyTorch policy gradient clipping
Promote pawn to queen in RBC
PyTorch and LibTorch DQN: fix for illegal moves

Many other fixes to docs and code quality.

Acknowledgments

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.1.0

2 years ago

This release adds some major functionality: new games, new algorithms, several fixes and new features.

Support and APIs

Windows: native build via Microsoft Visual Studio (experimental)
Rust API

Games

Amazons
Morpion Solitaire
Gridworld pathfinding (single-agent and multiagent)
Linear-Quadratic games (mean-field game)
Pig: Piglet variant added
Quoridor: 3-player and 4-player support added
Ultimate Tic-Tac-Toe

Algorithms

AlphaZero support for games with chance nodes (Python and C++)
ADIDAS approximate Nash equilibrium solver by Gemp et al. '21
Boltzmann DQN
Deep Online Mirror Descent (for mean-field games)
Expectiminimax (C++)
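
As background for the Expectiminimax entry above: the algorithm extends minimax to games with chance nodes by taking a probability-weighted average at chance nodes, the maximum at the maximizing player's nodes, and the minimum at the minimizing player's nodes. The sketch below is a plain-Python illustration on a hand-built tree. The dictionary-based node format is an assumption made for this example and is unrelated to the C++ implementation.

```python
# Illustrative sketch only; the node format is hypothetical.

def expectiminimax(node):
    """Return the game value of a tree with max, min, and chance nodes."""
    kind = node["type"]
    if kind == "terminal":
        return node["value"]
    if kind == "chance":
        # Expected value over (probability, child) pairs.
        return sum(p * expectiminimax(child) for p, child in node["outcomes"])
    values = [expectiminimax(child) for child in node["children"]]
    return max(values) if kind == "max" else min(values)


def leaf(value):
    return {"type": "terminal", "value": value}


tree = {
    "type": "max",
    "children": [
        {
            "type": "chance",
            "outcomes": [
                (0.5, leaf(4.0)),
                (0.5, {"type": "min", "children": [leaf(2.0), leaf(10.0)]}),
            ],
        },
        leaf(2.5),
    ],
}

# Chance node: 0.5 * 4.0 + 0.5 * min(2.0, 10.0) = 3.0, which beats 2.5.
print(expectiminimax(tree))  # 3.0
```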

Mean-field Games

Deep Online Mirror Descent
Best response value function (instead of only exact)
Allow specifying learning rate in fictitious play
Routing game experiment data
Softmax policy

Bots

WBridge5 external bot
Roshambo bots: expose to Python

Fixes

Chess SAN notation
get_all_states: support added for games with loops
Hex and DarkHex bug fixes for even-sized boards
MCTS sampling from the prior when 0-1 visits specified (Python and C++)
Pig: 2D observation tensor, ActionString, MaxChanceNodesInHistory
Stones n' Gems serialization fix

Miscellaneous

Added SpielFatalErrorWithStateInfo debug helper
Refactored policies computed by RL into a shared JointRLAgentPolicy
Custom info state resampling function for IS-MCTS
Hidden Information Games Competition tournament code: make optional dependency
Upgrade versions of abseil and OR-Tools and versions in python extra deps
Python dependency on scipy
Poker chump policies

Acknowledgments

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.0.2

2 years ago

This is a minor release: mainly for bug fixes, and also some new functionality and updates to core functionality.

New games and modifications

Dynamic routing game: change to explicit stochastic (Python MFG), or deterministic (Python)
New garnet game (randomized MDPs) for mean-field games (C++)

New algorithms and other functionality

Restricted Nash Response (C++), Johanson et al. '08
Update mean-field game algorithms to use value functions
Enable Python best response to work for simultaneous-move games

Bug fixes

Allow observation tensors for turn-based simultaneous move games
Fixes to HIGC tournament code, add synchronous mode, explicit calls to bots
Fix game type in built-in observer
Fix information type for iterated prisoner's dilemma
Fix to wheels CI testing: always use python3

Misc

Add missing algorithms and games to algorithms page
Add patch to our version of absl to compile with newer compilers (Ubuntu 21.10)
Add python games to API test (now fully supported alongside all C++ games)
Enable noisy_policy to work for simultaneous move games
Added Common Loop Utils (CLU) to python extra deps

Acknowledgments

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.0.1

2 years ago

This is a minor release: mainly for bug fixes, and also some new functionality and updates to core functionality.

New game

Dynamic routing (python game and its mean-field limit game)

New functionality

Allow TabularBestResponseMDP to be computed for a specific player
Add Hidden Information Game Competition (HIGC) tournament code
Add expected game score for simultaneous move games

Bug fixes

Fix to blackjack to use standard policy for dealer
Several fixes to Reconnaissance Blind Chess (see #695 #696 and #697)
Update dependency to newer version of Hanabi
Fix imperfect recall state string in Phantom Tic-Tac-Toe and Dark Hex
Fix noisy policy (see https://github.com/deepmind/open_spiel/commit/2703b208068169fb45ebc5bee25dafc0bcb76cfc)
Fix UndoAction for a number of games, add test for it (also remove UndoAction from some games)

Acknowledgments

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v1.0.0

2 years ago

This is our first major stable release and fully-supported entry into pip/PyPI (binary distribution wheels and build from source).

New functionality since v0.3.1

Games

Dark Chess
Dark Hex
Imperfect recall variants of:
- Dark Hex
- Liar's Dice
- Phantom Tic-Tac-Toe
Kriegspiel
Mean-field games:
- Crowd modelling (C++ and Python)
- Crowd modelling 2D (C++)
- Predator prey (Python)
Python games:
- Iterated prisoner's dilemma
- Tic-Tac-Toe
Reconnaissance Blind Chess

Algorithms:

Deep CFR (JAX)
DQN (C++, via Libtorch)
DQN (JAX)
Fixed Strategy Iteration CFR (FSICFR) (Neller & Hnath '11)
Joint Policy-Space Response Oracles, JPSRO (Marris et al. '21)
Mean-field game algorithms:
- best response / NashConv
- Fictitious Play
- Mirror Descent
NFSP (JAX)
Tabular best response MDP (C++): alternative best response, including proper support for perfect info games and imperfect recall

Bots:

UCI (chess-playing) Bot
Gin Rummy: Simple Bot

API

golang
Mean-field games

Examples

DQNBR: computing an approximate best response using DQN
FSICFR in Liar's Dice
JPSRO usage example
MCCFR on imperfect recall games
Mean-field games: JAX DQN

Support and Process changes:

Building and testing of pip binary distribution wheels via cibuildwheel (nox tests removed)
Python dependencies: make most dependencies optional, depend only on those truly required
pybind11: use smart_holder (and depend on smart_holder branch)
Support g++ again (used for building bdist wheels)
Support Python 3.9

Misc

AlphaZero (C++ Libtorch) support for checkpointing
Connect Four: add ResampleFromInformationState
Gin Rummy: observer and parameterizing the game size
Game-specific functions: chess, backgammon
Poker: add half-pot abstraction, add total money, support subgames
Utilities: bit permutation function

Fixes and Documentation

We added two video tutorials (by Marc & Ed) linked from the main site. We also added a link to the main page about building and using OpenSpiel as a C++ library.

AlphaZero Libtorch: construct loss from policy logits
AlphaZero (TF-based): document status externally (unsupported)
Alpha-rank visualization: minor fix to deprecated matplotlib function
Argslib: fix multiple command-line arguments
Build: fix to enable optimizations by default
Docker build: fix to commands and documentation improvements
CFR tabular average policy computation more efficient
Refactor and cleanup of Python MCCFR
OshiZumo: fix side swap
PolicyBot: use keys instead of legal actions
Poker: add card set and other tests
Python games: many improvements and better overall support
PyTorch Deep CFR: fix MLP initialization issues and policy_net sizes parameter
Scripts: add caching to install and build scripts
Tabular Sarsa and Q-Learning (C++)
Tests: refactoring, add new tests that allow disabling of legal masks check and game-specific state-checker hook

Thanks

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

v0.3.1

3 years ago

This addresses the problem that the new Python games were not compatible with the old version of OpenSpiel used by our pip package since it is too far behind (i.e. a fix to https://github.com/deepmind/open_spiel/issues/503).

This version is identical to 0.3.0. It exists only to match the version number required when replacing the package hosted on PyPI.

New Functionality (from 0.2.0)

Games

Add Dark Hex
Add Kuhn (new Python game)
Tic-Tac-Toe (Python game, updated to new API)
Liar's Dice: new bidding variant, and configurable number of faces
Trade Comm: add info state string
Backgammon: expose action conversion functions to Python

Algorithms

Deep CFR (PyTorch)
EVA (PyTorch)
Policy gradients (PyTorch)
Tabular Sarsa (C++)
Tabular Q-Learning (C++)
Sequence-form linear programming (C++)
Variance Reduction baselines (VR-MCCFR) in Python

Metrics

Distance to correlated and coarse-correlated equilibrium (for extensive-form games): CEDist and CCEDist (C++)

Examples

Poker "fold, call, pot, all-in" (FCPA) abstracted no-limit example (Python)
Generating multiple equilibria using CFR with random initial regrets and MCCFR

Process

Move from Travis to GitHub Actions for continuous integration (CI)
Update versions: TF, JAX, Julia

Misc.

Allow registration of observers
Distinction between perfect recall and imperfect recall example
Information state trees
JupyterLab environment (two Dockerfiles)
Allow Python games from C++, now fully compatible with main C++ API
Expose repr (Python)

Fixes and Documentation updates

Fix network construction in Exploitability Descent example
Fix GameParameters template compiler causing issues on older compilers
Fix Observation consistency in Trade Comm
Fix joining processes in Python AlphaZero
Fix HistoryString as identifier in HistoryTree
Improved documentation for the new observation API
Many other smaller fixes

Thanks

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

Files

The open_spiel-source-0.3.1.tar.gz bundles the necessary and some optional dependencies along with the core code (pybind11, Hanabi, ACPC, etc.) and should be able to be built directly without any additional downloads.

v0.3.0

3 years ago

This release aims to address the problem that the new Python games were not compatible with the old version of OpenSpiel used by our pip package since it is too far behind (i.e. a fix to https://github.com/deepmind/open_spiel/issues/503).

New Functionality

Games

Add Dark Hex
Add Kuhn (new Python game)
Tic-Tac-Toe (Python game, updated to new API)
Liar's Dice: new bidding variant, and configurable number of faces
Trade Comm: add info state string
Backgammon: expose action conversion functions to Python

Algorithms

Deep CFR (PyTorch)
EVA (PyTorch)
Policy gradients (PyTorch)
Tabular Sarsa (C++)
Tabular Q-Learning (C++)
Sequence-form linear programming (C++)
Variance Reduction baselines (VR-MCCFR) in Python

Metrics

Distance to correlated and coarse-correlated equilibrium (for extensive-form games): CEDist and CCEDist (C++)

Examples

Poker "fold, call, pot, all-in" (FCPA) abstracted no-limit example (Python)
Generating multiple equilibria using CFR with random initial regrets and MCCFR

Process

Move from Travis to GitHub Actions for continuous integration (CI)
Update versions: TF, JAX, Julia

Misc.

Allow registration of observers
Distinction between perfect recall and imperfect recall example
Information state trees
JupyterLab environment (two Dockerfiles)
Allow Python games from C++, now fully compatible with main C++ API
Expose repr (Python)

Fixes and Documentation updates

Fix network construction in Exploitability Descent example
Fix GameParameters template compiler causing issues on older compilers
Fix Observation consistency in Trade Comm
Fix joining processes in Python AlphaZero
Fix HistoryString as identifier in HistoryTree
Improved documentation for the new observation API
Many other smaller fixes

Thanks

Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.

Thanks to all of our contributors:

Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors

Files

The open_spiel-source-0.3.0.tar.gz bundles the necessary and some optional dependencies along with the core code (pybind11, Hanabi, ACPC, etc.) and should be able to be built directly without any additional downloads.