LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

v0.0.5

1 month ago

Environment

  1. MemoryEnv (#197)
  2. MountainCar (#181)

Algorithm

  1. Gumbel AlphaZero in ctree (#212)

Enhancement

  1. add eval_offline option (#188)
  2. save the updated searched policy and value to the buffer during reanalyze (#190)
  3. add muzero visualization (#181)
  4. add efficientzero tictactoe configs (#204)
  5. add 2 MCTS-related ICLR 2024 papers
  6. add load pretrained model option in test_game_segment (#194)
  7. polish _forward_learn() and some data process operations (#191)

Fix

  1. fix sync_gradients and log in DDP settings (#200)
  2. fix channel_last bug
  3. fix total_episode_count bug in collector
  4. fix memory_lightzero_env return bug
  5. fix obs_max_scale bug in memory_env

Style

  1. add ZeroPal and discord link (#209)
  2. add unittest for game_buffer_muzero (#186)
  3. add customization documentation section in readme

Full Changelog: https://github.com/opendilab/LightZero/compare/v0.0.4...v0.0.5

Contributors: @karroyan @HarryXuancy @nighood @puyuan1996

v0.0.4

2 months ago

Enhancement

  1. add agent configurations & polish replay video saving method (#184)
  2. polish comments in worker files
  3. polish comments in tree search files (#185)
  4. rename mcts_mode to battle_mode_in_simulation_env, add sampled alphazero config for tictactoe (#179)
  5. polish redundant data squeeze operations (#177)
  6. polish the continuous action process in the sampled efficientzero (sez) model
  7. polish bipedalwalker env

Fix

  1. fix completed value inf bug when zero exists in action_mask in gumbel muzero (#178)
  2. fix render settings when using gymnasium (#173)
  3. fix lstm_hidden_size in sampled_efficientzero_model.py
  4. fix action_mask in bipedalwalker_cont_disc_env, fix device bug in sampled efficientzero (#168)

Full Changelog: https://github.com/opendilab/LightZero/compare/v0.0.3...v0.0.4

Contributors: @karroyan @HarryXuancy @puyuan1996 @zjowowen

v0.0.3

5 months ago

Env

  1. MiniGrid env (#110)
  2. Bsuite env (#110)
  3. GoBigger env (#39)

Algorithm

  1. Sampled AlphaZero (#141)
  2. MuZero+RND (#110)
  3. Multi-Agent MuZero/EfficientZero (#39)

Enhancement

  1. add ctree version of mcts in alphazero (#142)
  2. upgrade the gym dependency to gymnasium (#150)
  3. add agent class to support LightZero's HuggingFace Model Zoo (#163)
  4. add recent MCTS-related papers in readme (#159)
  5. add muzero config for connect4 (#107)
  6. add CONTRIBUTING.md (#119)
  7. add .gitpod.yml and .gitpod.Dockerfile (#123)
  8. add contributors subsection in README (#132)
  9. add CODE_OF_CONDUCT.md (#127)
  10. polish comments and render_eval configs for various common envs (#154) (#161)
  11. polish action_type and env_type, fix test.yml, fix unittest (#160)
  12. update env and algo tutorial doc (#106)
  13. polish gomoku env (#141)
  14. add random_policy support for continuous env (#118)
  15. polish simulation method of ptree_az (#120)
  16. polish comments of game_segment_to_array

Fix

  1. fix render method for various common envs (#154) (#161)
  2. fix gumbel muzero collector bug, fix gumbel typo (#144)
  3. fix assert bug in game_segment.py (#138)
  4. fix visit_count_distributions name in muzero_evaluator
  5. fix mcts and alphabeta bot unittest (#120)
  6. fix typos in ptree_mz.py (#113)
  7. fix root_sampled_actions_tmp shape bug in the sampled efficientzero (sez) ptree
  8. fix policy utils unittest
  9. fix typo in readme and add a 'back to top' button in readme (#104) (#109) (#111)

Style

  1. add NeurIPS 2023 paper link

News

  1. NeurIPS 2023 Spotlight: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

Full Changelog: https://github.com/opendilab/LightZero/compare/v0.0.2...v0.0.3

Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug @mohitd404 @PentesterPriyanshu @0Armaan025 @prajjwalyd @suravshresth @sohamtembhurne @eltociear

v0.0.2

7 months ago

Env

  1. MuJoCo env (#50)
  2. 2048 env (#64)
  3. Connect4 env (#63)

Algorithm

  1. Gumbel MuZero (#22)
  2. Stochastic MuZero (#64)

Enhancement

  1. polish mcts and ptree_az (#57) (#61)
  2. polish readme (#36) (#47) (#51) (#77) (#95) (#96)
  3. update paper notes (#89) (#91)
  4. polish model and configs (#26) (#27) (#50)
  5. add Dockerfile and its usage instructions (#95)
  6. add doc about how to customize envs and algos (#78)
  7. add pytorch ddp support (#68)
  8. add eps greedy and random collect option in train_muzero_entry (#54)
  9. add atari visualization option (#40)
  10. add log_buffer_memory_usage utils (#30)

Fix

  1. fix priority bug in muzero collector (#74)

Style

  1. update github action (#71) (#72) (#73) (#81) (#83) (#84) (#90)

Full Changelog: https://github.com/opendilab/LightZero/compare/v0.0.1...v0.0.2

Contributors: @PaParaZz1 @karroyan @nighood @jayyoung0802 @timothijoe @TuTuHuss @HarryXuancy @puyuan1996 @HansBug

v0.0.1

1 year ago