AlphaZero.jl Release Notes

A generic, simple and fast implementation of DeepMind's AlphaZero algorithm.

AlphaZero v0.5.4

Diff since v0.5.3

Closed issues:

  • Deprecate Util.mapreduce in favor of something more standard (#54)
  • Use Base.Logging and ProgressLogging (#55)
  • How to use multiple GPUs on a single node (#69)
  • Cloud computing (#80)
  • Iterative vs continuous learning (#81)
  • Does Alpha Zero require a static representation of a scenario (#83)
  • Does it make sense to attempt to apply AlphaZero to "Agricola" (#100)
  • Multiplayer capability (#101)
  • Issue while running this in local (#107)
  • The strength of the Mancala bot (#110)
  • 6 dependencies errored (#112)
  • Issue with dummy_run() (#114)
  • StackOverflowError (during training) (#116)
  • To continue a training (#118)
  • How important is GI.vectorize_state function? (#119)
  • When exploring a position, what these abbreviations mean? (#121)
  • Cloud service for AlphaZero.jl (#122)
  • Number of network parameters (#126)
  • Scripts.explore (#136)
  • How to disable benchmarks? (#137)
  • A log of played games during training (#138)
  • Which hyperparameters? (#139)
  • Cannot run sample (#143)
  • GPU vs CPU (#144)
  • MCTS.RolloutOracle(gspec) (#145)
  • num_filters=128 (#150)
  • NVIDIA GeForce GTX 1650 isn't good? (#151)
  • What's the best OS for AlphaZero.jl ? (#153)
  • Does it work with Tesla? (#154)

Merged pull requests:

  • CompatHelper: bump compat for Distributions to 0.25, (keep existing compat) (#98) (@github-actions[bot])
  • CompatHelper: bump compat for Flux to 0.13, (keep existing compat) (#108) (@github-actions[bot])
  • Fix typo in readme (#109) (@LilithHafner)
  • Update report.jl (#111) (@gwario)
  • CompatHelper: bump compat for Setfield to 1, (keep existing compat) (#128) (@github-actions[bot])
  • Fix parameter access (#130) (@gwario)
  • CompatHelper: bump compat for LoggingExtras to 1, (keep existing compat) (#152) (@github-actions[bot])

AlphaZero v0.5.3

Diff since v0.5.2

Closed issues:

  • CUDA Error (#57)
  • τ=0.5 errors (#67)
  • Using continuous rewards (i.e., non ternary games) (#77)
  • Support for singleplayer games (#79)
  • Debugging in VS Code (#84)
  • Desired Type hierarchy for adding GNN's (#85)
  • MCTS.explore! must be called before MCTS.policy (#86)

Merged pull requests:

  • Extend documentation in CommonRLInterface (#70) (@johannes-fischer)
  • Fix error with discounting in RolloutOracle (#73) (@johannes-fischer)
  • Replace mkdir by mkpath (#74) (@johannes-fischer)
  • Use joinpath to make code more robust on Windows machines (#75) (@johannes-fischer)
  • Update experiment.md (#82) (@yutaizhou)
  • Fix memory analysis (#89) (@johannes-fischer)
  • call batch on vectors (not generators) (#91) (@CarloLucibello)
  • add CompatHelper (#92) (@CarloLucibello)
  • CompatHelper: bump compat for Setfield to 0.8, (keep existing compat) (#94) (@github-actions[bot])
  • CompatHelper: bump compat for ExprTools to 0.1, (keep existing compat) (#95) (@github-actions[bot])
  • CompatHelper: bump compat for Documenter to 0.27, (keep existing compat) (#96) (@github-actions[bot])
  • CompatHelper: bump compat for Distributions to 0.25, (keep existing compat) (#97) (@github-actions[bot])

AlphaZero v0.5.2

Diff since v0.5.1

Closed issues:

  • Support for OpenSpiel games? (#15)
  • How to use AlphaZero.jl for Openspiel games? (#46)
  • Current status of Multi-threading MCTS Benchmarking? (#56)
  • Performance Docs (#58)
  • isprobvec(p) error? (#59)
  • Do these readout look correct? (#60)
  • Benchmark Questions? (#61)
  • Does Scripts.play("connect-four") cheat? (#62)
  • isprobvec(p) whenever using Benchmark.NetworkOnly(τ=0.5) (#63)
  • How exactly does Alphazero's MCTS work? (#64)
  • Any idea what's causing this? (#65)

Merged pull requests:

  • OpenSpiel.jl support (#68) (@michelangelo21)

AlphaZero v0.5.1

Diff since v0.5.0

Closed issues:

  • API discussion (#4)
  • self play takes more and more time (#41)
  • Supervised learning (#48)
  • MCTS Optimization for sparse actions (#49)
  • Training on the cloud / multiple instances / clusters (#50)
  • Any Tips for per-player tracking? (#51)
  • Sanity Checks (#52)
  • Speed issues? (#53)

Merged pull requests:

  • Mancala - fixed set_state!() (#44) (@michelangelo21)
  • Invert temperature in formula (documentation) (#45) (@johannes-fischer)

AlphaZero v0.5.0

Diff since v0.4.0

  • Improved the inference server so that it is now possible to keep MCTS workers running while a batch of requests is being processed by the GPU. Concretely, this translates into SimParams now having two separate num_workers and batch_size parameters.
  • The inference server is now spawned on a separate thread to ensure minimal latency.

Together, the two aforementioned improvements result in a 30% global speedup on the connect-four benchmark.
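
To make this concrete, here is an illustrative sketch in plain Julia of the batching pattern described above, using only tasks and Channels rather than AlphaZero.jl's actual internals. The NUM_WORKERS and BATCH_SIZE constants stand in for the new num_workers and batch_size fields of SimParams; every other name is hypothetical.

```julia
# Illustrative only: more simulation workers than the inference batch size
# means some workers can keep expanding their search trees while the server
# evaluates a full batch for the others.
const NUM_WORKERS = 8   # stands in for SimParams.num_workers
const BATCH_SIZE  = 4   # stands in for SimParams.batch_size

# Each request carries a payload and a channel on which to send the answer.
requests = Channel{Tuple{Int,Channel{Float64}}}(NUM_WORKERS)

# Inference server on its own task: gather BATCH_SIZE requests, evaluate them
# together (a placeholder for a single batched GPU call), then reply to each.
server = @async while true
    batch = [take!(requests) for _ in 1:BATCH_SIZE]
    results = [Float64(x) for (x, _) in batch]  # placeholder evaluation
    for ((_, reply), r) in zip(batch, results)
        put!(reply, r)
    end
end

# Simulation workers: each blocks only on its own reply, so the remaining
# workers keep running while a batch is being processed.
workers = map(1:NUM_WORKERS) do w
    @async begin
        reply = Channel{Float64}(1)
        for step in 1:3
            put!(requests, (w * step, reply))
            take!(reply)  # in AlphaZero.jl this value would feed back into MCTS
        end
    end
end

foreach(wait, workers)
```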

AlphaZero v0.4.0

This release brings many new features to AlphaZero.jl, including:

  • Added support for CommonRLInterface.jl.
  • Added a grid-world MDP example illustrating this new interface.
  • Added support for distributed training: it is now as easy to train an agent on a cluster of machines as on a single computer.
  • Replaced the async MCTS implementation with a more straightforward synchronous implementation. Network inference requests are now batched across game simulations.
  • Added the Experiment and Scripts modules to simplify common tasks (see the usage sketch after this list).

See CHANGELOG.md for details.
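
For a sense of how the new Scripts module is used, here is a brief sketch. The Scripts.explore, Scripts.play, and dummy_run names appear in the issue titles above, and "connect-four" refers to the bundled example experiment; treat the exact calls as indicative rather than definitive.

```julia
using AlphaZero

# Quick smoke test of the bundled connect-four experiment with reduced settings.
Scripts.dummy_run("connect-four")

# Launch a full training session for the connect-four agent.
Scripts.train("connect-four")

# Explore the trained agent's search interactively, or play against it.
Scripts.explore("connect-four")
Scripts.play("connect-four")
```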

Closed issues:

  • Connect Four training must be restarted about every 24 hours due to an OOM error (#1)
  • The Flux backend is currently broken (#2)
  • Importation of training parameters from JSON is broken (#3)
  • UndefVarError: lib not defined when training a connect four agent (#5)
  • Possibility to skip initial benchmark (#6)
  • Assertion error during apply_symmetry (#7)
  • Checkpoint evaluation randomly fails (#8)
  • MDP Version (#9)
  • Suggestion: replace Oracle with just a function (#10)
  • @unimplemented (#11)
  • Some issues with installing the package (#12)
  • Register package with General registry (#13)
  • Missing repository's website (#16)
  • fail to explore (#17)
  • CuDNN error (#18)
  • using AlphaZero (#19)
  • UndefVarError: lib not defined (#20)
  • LoadError: CUBLASError (#21)
  • Error building Knet (#22)
  • LoadError: InitError: CUDA.jl does not yet support CUDA with nvdisasm 11.1.74; (#23)
  • CuDNN error 8 on Ubuntu 18.04, Julia 1.5.2 (#24)
  • Stateful Game-structs throw errors (#25)
  • LSTM support (#28)
  • CUDA vs CUDAnative? (#29)
  • Embed trained network in javascript web app for browser-based inference? (#30)
  • Connect Four iteration training time is taking a long time (#31)
  • Question about symmetries (#32)
  • Question about function test_symmetry (#33)
  • Migrate neural net agents across AlphaZero.jl instances? (#34)
  • Can a game know its players' types? (#35)
  • Exploit several CPU (#36)
  • Exploit multiple GPUs (#37)
  • Enumerating actions without state (#38)
  • fatal: Remote branch v0.4.0 not found in upstream origin (#39)

Merged pull requests:

  • Mancala (#42) (@michelangelo21)