ReinforcementLearning.jl Versions

A reinforcement learning package for Julia

ReinforcementLearningEnvironments-v0.9.1

5 days ago

Diff since ReinforcementLearningEnvironments-v0.9.0

Merged pull requests:

  • Add missing Flux compat (#1059) (@jeremiahpslewis)
  • Fix docs / website build (#1064) (@jeremiahpslewis)
  • Correct Pendulum x-y coordinates (#1065) (@HenriDeh)
  • Make QBasedPolicy general for AbstractLearner s (#1069) (@dharux) (see the sketch after this list)
  • Fix hooks for multiplayer case (#1071) (@jeremiahpslewis)
  • Fix doc build errors (#1072) (@jeremiahpslewis)
  • bump rlcore version (#1073) (@jeremiahpslewis)
  • Make FluxApproximator work with QBasedPolicy (#1075) (@jeremiahpslewis)
  • Fix RLEnvs version (#1076) (@jeremiahpslewis)
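
Two of the changes above reshape how QBasedPolicy composes with learners (#1069) and approximators (#1075). For orientation, here is a minimal tabular sketch of that composition. It loosely follows the package's introductory example; the constructor keywords (learner, explorer, n_state, n_action, the :SARS method symbol) and the StopAfterNEpisodes name are assumptions that should be checked against the current docstrings.

    using ReinforcementLearning

    env = RandomWalk1D()
    ns, na = length(state_space(env)), length(action_space(env))

    # A Q-based policy: a learner that estimates action values plus an explorer
    # that turns those estimates into actions (keyword names assumed, see above).
    policy = QBasedPolicy(
        learner = TDLearner(
            TabularQApproximator(n_state = ns, n_action = na),
            :SARS,                       # one-step SARS temporal-difference updates
        ),
        explorer = EpsilonGreedyExplorer(0.1),
    )

    hook = TotalRewardPerEpisode()
    run(policy, env, StopAfterNEpisodes(10), hook)
    hook.rewards                         # total reward recorded for each episode

Running the bare policy only rolls it out; training additionally requires wrapping it in an Agent with a trajectory, which is beyond the scope of this sketch.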

Closed issues:

  • Next Release Plan (v0.11) (#614)
  • Package Stabilization Plan (#792)
  • test/runtests.jl empty (+ arch discussion) (#843)
  • policy(env) returns no legal action -inf initialized Q-table (#852)
  • Refactor CI into separate Workflows per package (and separate codecov projects per package) (#869)
  • Add deprecation warnings to non-refactored policies (#892)
  • Vectorized environments (#908)
  • Loading a Gym Environment (#912)
  • PPO with MaskedPPOTrajectory (#917)
  • Devmode is not working (#918)
  • TD3 Policy unable to handle environments with multidimensional action spaces (#951)
  • Spin off core packages (#960)
  • experiments failed (#982)
  • Breaking the tutorial by getting TotalRewardPerEpisode out of sync with the stopping condition in a run call (#1000)
  • Transfer Algorithms to RLFarm (#1028)
  • Update Buildkite script for gpu testing so it's sub package compatible (#1030)
  • Website: A practical introduction to RL: Does not introduce, source code is broken (#1036)
  • ElasticArraySARTSTraces does not record the trajectories of MountainCarEnv() correctly (#1067)
  • Algorithm implementations (#1070)
  • No method matching iterate ArrayProductDomain (#1074)

ReinforcementLearningCore-v0.15.3

5 days ago

Diff since ReinforcementLearningCore-v0.15.2

Merged pull requests:

  • Make FluxApproximator work with QBasedPolicy (#1075) (@jeremiahpslewis)
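
Since this is the only change in the release, a brief sketch of what #1075 enables may help: a Flux model wrapped in a FluxApproximator acting as the learner of a QBasedPolicy. This is an illustration only; the keyword names (model, optimiser) and the direct use of the approximator as a learner are assumptions inferred from the PR titles, so check the FluxApproximator docstring before relying on them.

    using ReinforcementLearning, Flux

    env = CartPoleEnv()
    ns, na = length(state_space(env)), length(action_space(env))

    # Assumed constructor: a Flux model plus an optimiser (see the caveat above).
    approximator = FluxApproximator(
        model = Chain(Dense(ns, 64, relu), Dense(64, na)),
        optimiser = Adam(),
    )

    # Per #1069/#1075, the approximator slots into QBasedPolicy as its learner.
    policy = QBasedPolicy(
        learner = approximator,
        explorer = EpsilonGreedyExplorer(0.1),
    )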

ReinforcementLearningBase-v0.13.1

5 days ago

Diff since ReinforcementLearningBase-v0.13.0

Merged pull requests:

  • Add missing Flux compat (#1059) (@jeremiahpslewis)
  • Fix docs / website build (#1064) (@jeremiahpslewis)
  • Correct Pendulum x-y coordinates (#1065) (@HenriDeh)
  • Make QBasedPolicy general for AbstractLearner s (#1069) (@dharux)
  • Fix hooks for multiplayer case (#1071) (@jeremiahpslewis)
  • Fix doc build errors (#1072) (@jeremiahpslewis)
  • bump rlcore version (#1073) (@jeremiahpslewis)
  • Make FluxApproximator work with QBasedPolicy (#1075) (@jeremiahpslewis)

Closed issues:

  • Next Release Plan (v0.11) (#614)
  • Package Stabilization Plan (#792)
  • test/runtests.jl empty (+ arch discussion) (#843)
  • policy(env) returns no legal action -inf initialized Q-table (#852)
  • Refactor CI into separate Workflows per package (and separate codecov projects per package) (#869)
  • Add deprecation warnings to non-refactored policies (#892)
  • Vectorized environments (#908)
  • Loading a Gym Environment (#912)
  • PPO with MaskedPPOTrajectory (#917)
  • Devmode is not working (#918)
  • TD3 Policy unable to handle environments with multidimensional action spaces (#951)
  • Spin off core packages (#960)
  • experiments failed (#982)
  • Breaking the tutorial by getting TotalRewardPerEpisode out of sync with the stopping condition in a run call (#1000)
  • Transfer Algorithms to RLFarm (#1028)
  • Update Buildkite script for gpu testing so it's sub package compatible (#1030)
  • Website: A practical introduction to RL: Does not introduce, source code is broken (#1036)
  • ElasticArraySARTSTraces does not record the trajectories of MountainCarEnv() correctly (#1067)
  • Algorithm implementations (#1070)

ReinforcementLearningCore-v0.15.2

1 month ago

Diff since ReinforcementLearningCore-v0.15.1

Merged pull requests:

  • Make QBasedPolicy general for AbstractLearner s (#1069) (@dharux)
  • bump rlcore version (#1073) (@jeremiahpslewis)

ReinforcementLearningCore-v0.15.1

1 month ago

Diff since ReinforcementLearningCore-v0.15.0

Merged pull requests:

  • Add missing Flux compat (#1059) (@jeremiahpslewis)
  • Fix docs / website build (#1064) (@jeremiahpslewis)
  • Correct Pendulum x-y coordinates (#1065) (@HenriDeh)
  • Fix hooks for multiplayer case (#1071) (@jeremiahpslewis)
  • Fix doc build errors (#1072) (@jeremiahpslewis)

Closed issues:

  • Next Release Plan (v0.11) (#614)
  • Package Stabilization Plan (#792)
  • test/runtests.jl empty (+ arch discussion) (#843)
  • policy(env) returns no legal action -inf initialized Q-table (#852)
  • Refactor CI into separate Workflows per package (and separate codecov projects per package) (#869)
  • Add deprecation warnings to non-refactored policies (#892)
  • Vectorized environments (#908)
  • Loading a Gym Environment (#912)
  • PPO with MaskedPPOTrajectory (#917)
  • Devmode is not working (#918)
  • TD3 Policy unable to handle environments with multidimensional action spaces (#951)
  • Spin off core packages (#960)
  • experiments failed (#982)
  • Breaking the tutorial by getting TotalRewardPerEpisode out of sync with the stopping condition in a run call (#1000)
  • Transfer Algorithms to RLFarm (#1028)
  • Update Buildkite script for gpu testing so it's sub package compatible (#1030)
  • Website: A practical introduction to RL: Does not introduce, source code is broken (#1036)
  • ElasticArraySARTSTraces does not record the trajectories of MountainCarEnv() correctly (#1067)
  • Algorithm implementations (#1070)

ReinforcementLearning v0.11.0

1 month ago

Diff since v0.10.2

Merged pull requests:

  • Reactivate some tests for RLExperiments (#790) (@jeremiahpslewis)
  • Drop RL.jl as dependency from Experiments (#795) (@jeremiahpslewis)
  • Fix compat for RLBase (#796) (@jeremiahpslewis)
  • Fix RLCore version, prep for bump (#797) (@jeremiahpslewis)
  • Add reexport compat (#798) (@jeremiahpslewis)
  • Bump compat helper (#799) (@jeremiahpslewis)
  • Fix IntervalSets compat for RLEnvironments (#800) (@jeremiahpslewis)
  • Bump RLZoo.jl version for release (#815) (@jeremiahpslewis)
  • Fix RLExperiments compat (#816) (@jeremiahpslewis)
  • Expand RLZoo compat (#817) (@jeremiahpslewis)
  • Bump RLExperiments, require 0.11 (#818) (@jeremiahpslewis)
  • Pin ReinforcementLearningZoo.jl to 0.6 in RLExperiments (#819) (@jeremiahpslewis)
  • Drop RL.jl from CompatHelper (until refactor complete) (#824) (@jeremiahpslewis)
  • Bump Github Actions cache version (#825) (@jeremiahpslewis)
  • Basic allocation fixes for RandomWalk / RandomPolicy (#827) (@jeremiahpslewis)
  • Bump CI.yml GitHub action versions (#828) (@jeremiahpslewis)
  • Add tests, improve performance of RewardsPerEpisode (#829) (@jeremiahpslewis)
  • Refactor and add tests to TotalBatchRewardPerEpisode (#830) (@jeremiahpslewis)
  • Tests, refactor for TimePerStep (#831) (@jeremiahpslewis)
  • DoEveryNStep tests, performance tweaks (#832) (@jeremiahpslewis)
  • Add DoOnExit test (#833) (@jeremiahpslewis)
  • Expand PR Template (#835) (@jeremiahpslewis)
  • Fix branch name (master -> main) (#837) (@jeremiahpslewis)
  • Add test_noop! to remaining hooks (#840) (@jeremiahpslewis)
  • Make TimePerStep test robust (#841) (@jeremiahpslewis)
  • Reactivate docs (#842) (@jeremiahpslewis)
  • Add activate_devmode!() explanation to tips.md (#845) (@jeremiahpslewis)
  • Bump compat of RL.jl to 0.11.0-dev (#846) (@jeremiahpslewis)
  • add kwargs to agent (#847) (@HenriDeh)
  • Gaussian network refactor and tests (#849) (@HenriDeh)
  • Agent Refactor (#850) (@jeremiahpslewis)
  • Bump RLCore (#851) (@jeremiahpslewis)
  • Include codecov in CI (#854) (@HenriDeh)
  • Fix a typo in MPO (#855) (@HenriDeh)
  • DoEvery should not trigger on t = 1 (#856) (@HenriDeh)
  • update CI Julia version (#857) (@jeremiahpslewis)
  • Tweak CI to check on dep changes (#858) (@HenriDeh)
  • CompatHelper: bump compat for FillArrays to 1 for package ReinforcementLearningCore, (keep existing compat) (#859) (@github-actions[bot])
  • MultiAgent Proposal (#861) (@jeremiahpslewis)
  • CompatHelper: add new compat entry for ReinforcementLearningCore at version 0.9 for package ReinforcementLearningEnvironments, (keep existing compat) (#865) (@github-actions[bot])
  • Multiplayer Fixes (Clean up errors) (#867) (@jeremiahpslewis)
  • Added a section to the home page about getting help for Reinforcement… (#868) (@LooseTerrifyingSpaceMonkey)
  • Bump StatsBase compat (#873) (@jeremiahpslewis)
  • ComposedHooks, MultiHook fixes (#874) (@jeremiahpslewis)
  • Fix RLEnvs compat (#875) (@jeremiahpslewis)
  • Add back ComposedStop (#876) (@jeremiahpslewis)
  • Bump RLBase to v0.11.1 (#877) (@jeremiahpslewis)
  • Further refinements (#879) (@jeremiahpslewis)
  • Use multiple dispatch / methods plan! and act! (#880) (@jeremiahpslewis) (see the sketch after this list)
  • RLCore.update! -> Base.push! API change (#884) (@jeremiahpslewis)
  • Add compat for CommonRLInterface (#886) (@jeremiahpslewis)
  • Fix hook issues (#887) (@jeremiahpslewis)
  • CompatHelper: bump compat for ReinforcementLearningZoo to 0.6 for package ReinforcementLearningExperiments, (keep existing compat) (#888) (@github-actions[bot])
  • Stacknamespace (#889) (@HenriDeh)
  • allow more recent versions (#890) (@HenriDeh)
  • Fix stack (#891) (@HenriDeh)
  • CompatHelper: add new compat entry for DelimitedFiles at version 1 for package ReinforcementLearningEnvironments, (keep existing compat) (#894) (@github-actions[bot])
  • Update implement new alg docs (#896) (@jeremiahpslewis)
  • NFQ (#897) (@CasBex)
  • fixed problem with sequential multi agent envs (#898) (@Mytolo)
  • Sketch out optimise! refactor (#899) (@jeremiahpslewis)
  • Bug fix optimise! (#902) (@jeremiahpslewis)
  • Breaking changes to optimise! interface: Bump RLCore to v0.11 and RLZoo to v0.8 (#903) (@jeremiahpslewis)
  • Swap out rng code (#905) (@jeremiahpslewis)
  • CompatHelper: bump compat for NNlib to 0.9 for package ReinforcementLearningZoo, (keep existing compat) (#906) (@github-actions[bot])
  • Fix dispatch and update documentation (#907) (@HenriDeh)
  • QBasedPolicy optimise! forwards to learner. (#909) (@HenriDeh)
  • Bump RLZoo version for NNlib (#911) (@jeremiahpslewis)
  • Add performance testing run loop (#914) (@jeremiahpslewis)
  • Fix Timer bug (#915) (@jeremiahpslewis)
  • couple of improvements to MPO (#919) (@HenriDeh)
  • Rework the run loop (#921) (@HenriDeh)
  • adjusted pettingzoo to PettingZooEnv simultaneous environment more co… (#925) (@Mytolo)
  • fixed devmode / project files (#932) (@Mytolo)
  • fixed DQNLearner Gpu isse (#933) (@Mytolo)
  • fixing prob. /w symbol/ string correspondence (#934) (@Mytolo)
  • Bump flux compat (#935) (@jeremiahpslewis)
  • Reduce find_all_max allocations and increase speed based on chatgpt s… (#938) (@jeremiahpslewis)
  • Add Buildkite / GPU tests (#942) (@jeremiahpslewis)
  • Add RLZoo and RLExperiments to Buildkite (#943) (@jeremiahpslewis)
  • Drop deprecated @provide interface (#944) (@jeremiahpslewis)
  • CI Improvements (#946) (@jeremiahpslewis)
  • Github Actions Fixes (#947) (@jeremiahpslewis)
  • gpu updates RLExperiments, RLZoo (#949) (@jeremiahpslewis)
  • Bump RLCore version (#950) (@jeremiahpslewis)
  • Refactor TRPO and VPG with EpisodesSampler (#952) (@HenriDeh)
  • Fix TotalRewardPerEpisode bug (#953) (@jeremiahpslewis)
  • update docs to loop refactor (#955) (@HenriDeh)
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningEnvironments, (keep existing compat) (#956) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#957) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#958) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningCore, (keep existing compat) (#962) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningZoo, (keep existing compat) (#963) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningExperiments, (keep existing compat) (#964) (@github-actions[bot])
  • TargetNetwork (#966) (@HenriDeh)
  • CompatHelper: bump compat for GPUArrays to 9 for package ReinforcementLearningCore, (keep existing compat) (#969) (@github-actions[bot])
  • Prioritised DQN GPU (#974) (@CasBex)
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#975) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningZoo to 0.8 for package ReinforcementLearningExperiments, (keep existing compat) (#976) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#977) (@github-actions[bot])
  • Nfq refactor (#980) (@CasBex)
  • Fix and refactor SAC (#985) (@HenriDeh)
  • CompatHelper: bump compat for DomainSets to 0.7 for package ReinforcementLearningBase, (keep existing compat) (#986) (@github-actions[bot])
  • remove rlenv dep for tests (#989) (@HenriDeh)
  • CompatHelper: add new compat entry for CUDA at version 5 for package ReinforcementLearningExperiments, (keep existing compat) (#991) (@github-actions[bot])
  • CompatHelper: add new compat entry for IntervalSets at version 0.7 for package ReinforcementLearningExperiments, (keep existing compat) (#994) (@github-actions[bot])
  • Conservative Q-Learning (#995) (@HenriDeh)
  • CompatHelper: add new compat entry for Parsers at version 2 for package ReinforcementLearningCore, (keep existing compat) (#997) (@github-actions[bot])
  • CompatHelper: add new compat entry for MLUtils at version 0.4 for package ReinforcementLearningZoo, (keep existing compat) (#998) (@github-actions[bot])
  • CompatHelper: add new compat entry for Statistics at version 1 for package ReinforcementLearningCore, (keep existing compat) (#999) (@github-actions[bot])
  • Update CQL_SAC.jl (#1003) (@HenriDeh)
  • Bump tj-actions/changed-files from 35 to 41 in /.github/workflows (#1006) (@dependabot[bot])
  • Make it compatible with Adapt 4 and Metal 1 (#1008) (@joelreymont)
  • Bump RLCore, RLEnv (#1012) (@jeremiahpslewis)
  • Fix PPO per #1007 (#1013) (@jeremiahpslewis)
  • Fix RLCore version (#1018) (@jeremiahpslewis)
  • Add Devcontainer, handle DomainSets 0.7 (#1019) (@jeremiahpslewis)
  • Initial GPUArray transition (#1020) (@jeremiahpslewis)
  • Update TagBot.yml for subprojects (#1021) (@jeremiahpslewis)
  • Fix offline agent test (#1025) (@joelreymont)
  • Fix spell check CI errors (#1027) (@joelreymont)
  • GPU Code Migration Part 2.1 (#1029) (@jeremiahpslewis)
  • Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
  • Fix RLZoo version (#1032) (@jeremiahpslewis)
  • Drop devmode, prepare RL.jl v0.11 for release (#1035) (@jeremiahpslewis)
  • Update docs script for new 'limited' RL.jl release (#1038) (@jeremiahpslewis)
  • Tabular Approximator fixes (pre v0.11 changes) (#1040) (@jeremiahpslewis)
  • Swap RLZoo for RLFarm in CI, drop RLExperiments (#1041) (@jeremiahpslewis)
  • Buildkite tweaks for monorepo (#1042) (@jeremiahpslewis)
  • Drop archived projects (#1043) (@jeremiahpslewis)
  • Simplify Experiment code after dropping RLExperiment (#1044) (@jeremiahpslewis)
  • Fix code coverage scope so it ignores test dir (#1045) (@jeremiahpslewis)
  • Fix reset and stop conditions (#1046) (@jeremiahpslewis)
  • Drop Functors and use Flux.@layer (#1048) (@jeremiahpslewis)
  • Fix naming consistency and add missing hook tests (#1049) (@jeremiahpslewis)
  • Add SARS tdlearning back to lib (#1050) (@jeremiahpslewis)
  • Update FluxModelApproximator references to FluxApproximator (#1051) (@jeremiahpslewis)
  • Epsilon Speedy Explorer (#1052) (@jeremiahpslewis)
  • Add TotalRewardPerEpisodeLastN hook (#1053) (@jeremiahpslewis)
  • Fix abstract_learner for multiplayer games (#1054) (@jeremiahpslewis)
  • Update versions (#1055) (@jeremiahpslewis)
  • Update Docs for v0.11 release (#1056) (@jeremiahpslewis)
  • Update Katex version, fix vulnerability (#1058) (@jeremiahpslewis)
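
Several items above reshape the core interaction API: action selection now goes through RLBase.plan! and environment stepping through RLBase.act! (#880), while the former RLCore.update! calls became Base.push! (#884). A hand-rolled episode under the new verbs might read as follows; this is an illustrative sketch rather than an excerpt from the release, and it assumes CartPoleEnv from ReinforcementLearningEnvironments.

    using ReinforcementLearning

    env = CartPoleEnv()
    policy = RandomPolicy(action_space(env))

    RLBase.reset!(env)
    while !is_terminated(env)
        action = RLBase.plan!(policy, env)   # the policy plans an action...
        RLBase.act!(env, action)             # ...and the environment acts on it
    end
    reward(env)                              # reward observed at the final step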

Closed issues:

  • A3C (#133)
  • Implement TRPO/ACER (#134)
  • ViZDoom is broken (#130)
  • bullet3 environment (#128)
  • Box2D environment (#127)
  • Add MAgent (#125)
  • Regret Policy Gradients (#131)
  • Add MCTS related algorithms (#132)
  • bsuite (#124)
  • Implement Fully Parameterized Quantile Function for Distributional Reinforcement Learning. (#135)
  • Experimental support of Torch.jl (#136)
  • Add Game 2048 (#122)
  • add CUDA accelerated Env (#121)
  • Unify common network architectures and patterns (#139)
  • Asynchronous Methods for Deep Reinforcement Learning (#142)
  • R2D2 (#143)
  • Recurrent Models (#144)
  • Cross language support (#103)
  • Add an example running in K8S (#100)
  • Flux as service (#154)
  • Change clip_by_global_norm! into a Optimizer (#193)
  • Derivative-Free Reinforcement Learning (#206)
  • Support Tables.jl and PrettyTables.jl for Trajectories (#232)
  • Reinforcement Learning and Combinatorial Optimization (#250)
  • Model based reinforcement learning (#262)
  • Support CircularVectorSARTTrajectory RLZoo (#316)
  • Rename some functions to help beginners navigate source code (#326)
  • Support multiple discrete action space (#347)
  • Combine transformers and RL (#392)
  • How to display/render AtariEnv? (#546)
  • Refactor of DQN Algorithms (#557)
  • JuliaRL_BasicDQN_CartPole example fails (#568)
  • Gain in VPGPolicy does not account for terminal states? (#578)
  • Question: Can ReinforcementLearning.jl handle Partially Observed Markov Processes (POMDPs)? (#608)
  • Explain current implementation of PPO in detail (#620)
  • Make documentation on trace normalization (#633)
  • TDLearner time step parameter (#648)
  • estimate v.s. basis in policies (#677)
  • Q-learning update timing (#702)
  • various eligibility trace-equipped TD methods (#709)
  • Improve the logging mechanism during training (#725)
  • questions while looking at implementation of VPG (#729)
  • SAC example experiment does not work (#736)
  • Custom environment action and state space explanation (#738)
  • how to load a saved model and test it? (#755)
  • Bounds Error at UPDATE_FREQ Step (#758)
  • StopAfterEpisode returns 1 more episode using StepsPerEpisode() hook (#759)
  • Move basic definition of environment wrapper into RLBase (#760)
  • Precompilation error - DomainSets not in dependencies (#761)
  • How to set RLBase.state_space() if the number of state space is uncertain (#762)
  • how to use MultiAgentManager on different algorithms? (#764)
  • Example run of Offline RL that totally depends on dataset without online environment (#765)
  • Deep RL example for LSTM (#772)
  • MonteCarloLearner incorrect PreActStage behavior (#779)
  • Prioritised Experience Replay (#780)
  • Outdated dependencies (#781)
  • Running experiments throws a "UndefVarError: params not defined" message (#784)
  • Failing MPOCovariance experiment (#791)
  • Logo image was not found (#836)
  • Reactivate docs on CI/CD (#838)
  • update docs: Tips for developers (#844)
  • Package dependencies not compatible (#860)
  • need help from an expert (#862)
  • Installing ReinforcementLearning.jl downgrades Flux.jl (#864)
  • Fix Experiment type setup (#881)
  • LoadError: UndefVarError: params not defined (#882)
  • Rename update! to push! (#883)
  • Contribute Neural Fitted Q-iteration algorithm (#895)
  • PPo policy experiments failing (#910)
  • Executing RLBase.plan! after end of experiment (#913)
  • EpisodeSampler in Trajectories (#927)
  • Hook RewardsPerEpisode broken (#945)
  • Can implement this ARZ algorithm ? (#965)
  • AssertionError: action in env.action_space (#967)
  • Fixing SAC Policy (#970)
  • Prioritized DQN experiment nonfunctional (#971)
  • Prioritised DQN failing on GPU (#973)
  • An error (#983)
  • params() is no longer supported in Flux (#996)
  • GPU Compile error on PPO with MaskedPPOTrajectory (#1007)
  • RL Core tests fail sporadically (#1010)
  • RL Env tests fail with latest OpenSpiel patches (#1011)
  • Tutorial OpenSpiel KuhnOpenNSFP fails (#1024)
  • CI: Should spell check be dropped or fixed? (#1026)
  • Simple ReinforcementLearning example crashes (#1034)
  • Website: How do implement a new algorithm is outdated (#1037)
  • Review TabularApproximator (#1039)

ReinforcementLearningEnvironments-v0.9.0

1 month ago

Diff since ReinforcementLearningEnvironments-v0.8.8

Merged pull requests:

  • Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
  • Fix RLZoo version (#1032) (@jeremiahpslewis)
  • Drop devmode, prepare RL.jl v0.11 for release (#1035) (@jeremiahpslewis)
  • Update docs script for new 'limited' RL.jl release (#1038) (@jeremiahpslewis)
  • Tabular Approximator fixes (pre v0.11 changes) (#1040) (@jeremiahpslewis)
  • Swap RLZoo for RLFarm in CI, drop RLExperiments (#1041) (@jeremiahpslewis)
  • Buildkite tweaks for monorepo (#1042) (@jeremiahpslewis)
  • Drop archived projects (#1043) (@jeremiahpslewis)
  • Simplify Experiment code after dropping RLExperiment (#1044) (@jeremiahpslewis)
  • Fix code coverage scope so it ignores test dir (#1045) (@jeremiahpslewis)
  • Fix reset and stop conditions (#1046) (@jeremiahpslewis)
  • Drop Functors and use Flux.@layer (#1048) (@jeremiahpslewis)
  • Fix naming consistency and add missing hook tests (#1049) (@jeremiahpslewis)
  • Add SARS tdlearning back to lib (#1050) (@jeremiahpslewis)
  • Update FluxModelApproximator references to FluxApproximator (#1051) (@jeremiahpslewis)
  • Epsilon Speedy Explorer (#1052) (@jeremiahpslewis)
  • Add TotalRewardPerEpisodeLastN hook (#1053) (@jeremiahpslewis)
  • Fix abstract_learner for multiplayer games (#1054) (@jeremiahpslewis)
  • Update versions (#1055) (@jeremiahpslewis)
  • Update Docs for v0.11 release (#1056) (@jeremiahpslewis)
  • Update Katex version, fix vulnerability (#1058) (@jeremiahpslewis)

Closed issues:

  • Simple ReinforcementLearning example crashes (#1034)
  • Website: How do implement a new algorithm is outdated (#1037)
  • Review TabularApproximator (#1039)

ReinforcementLearningCore-v0.15.0

1 month ago

Diff since ReinforcementLearningCore-v0.14.0

Merged pull requests:

  • Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
  • Fix RLZoo version (#1032) (@jeremiahpslewis)
  • Drop devmode, prepare RL.jl v0.11 for release (#1035) (@jeremiahpslewis)
  • Update docs script for new 'limited' RL.jl release (#1038) (@jeremiahpslewis)
  • Tabular Approximator fixes (pre v0.11 changes) (#1040) (@jeremiahpslewis)
  • Swap RLZoo for RLFarm in CI, drop RLExperiments (#1041) (@jeremiahpslewis)
  • Buildkite tweaks for monorepo (#1042) (@jeremiahpslewis)
  • Drop archived projects (#1043) (@jeremiahpslewis)
  • Simplify Experiment code after dropping RLExperiment (#1044) (@jeremiahpslewis)
  • Fix code coverage scope so it ignores test dir (#1045) (@jeremiahpslewis)
  • Fix reset and stop conditions (#1046) (@jeremiahpslewis)
  • Drop Functors and use Flux.@layer (#1048) (@jeremiahpslewis)
  • Fix naming consistency and add missing hook tests (#1049) (@jeremiahpslewis)
  • Add SARS tdlearning back to lib (#1050) (@jeremiahpslewis)
  • Update FluxModelApproximator references to FluxApproximator (#1051) (@jeremiahpslewis)
  • Epsilon Speedy Explorer (#1052) (@jeremiahpslewis)
  • Add TotalRewardPerEpisodeLastN hook (#1053) (@jeremiahpslewis)
  • Fix abstract_learner for multiplayer games (#1054) (@jeremiahpslewis)
  • Update versions (#1055) (@jeremiahpslewis)
  • Update Docs for v0.11 release (#1056) (@jeremiahpslewis)
  • Update Katex version, fix vulnerability (#1058) (@jeremiahpslewis)

Closed issues:

  • Simple ReinforcementLearning example crashes (#1034)
  • Website: How do implement a new algorithm is outdated (#1037)
  • Review TabularApproximator (#1039)

ReinforcementLearningBase-v0.13.0

1 month ago

Diff since ReinforcementLearningBase-v0.12.2

Merged pull requests:

  • Fix offline agent test (#1025) (@joelreymont)
  • Fix spell check CI errors (#1027) (@joelreymont)
  • GPU Code Migration Part 2.1 (#1029) (@jeremiahpslewis)
  • Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
  • Fix RLZoo version (#1032) (@jeremiahpslewis)
  • Drop devmode, prepare RL.jl v0.11 for release (#1035) (@jeremiahpslewis)
  • Update docs script for new 'limited' RL.jl release (#1038) (@jeremiahpslewis)
  • Tabular Approximator fixes (pre v0.11 changes) (#1040) (@jeremiahpslewis)
  • Swap RLZoo for RLFarm in CI, drop RLExperiments (#1041) (@jeremiahpslewis)
  • Buildkite tweaks for monorepo (#1042) (@jeremiahpslewis)
  • Drop archived projects (#1043) (@jeremiahpslewis)
  • Simplify Experiment code after dropping RLExperiment (#1044) (@jeremiahpslewis)
  • Fix code coverage scope so it ignores test dir (#1045) (@jeremiahpslewis)
  • Fix reset and stop conditions (#1046) (@jeremiahpslewis)
  • Drop Functors and use Flux.@layer (#1048) (@jeremiahpslewis)
  • Fix naming consistency and add missing hook tests (#1049) (@jeremiahpslewis)
  • Add SARS tdlearning back to lib (#1050) (@jeremiahpslewis)
  • Update FluxModelApproximator references to FluxApproximator (#1051) (@jeremiahpslewis)
  • Epsilon Speedy Explorer (#1052) (@jeremiahpslewis)
  • Add TotalRewardPerEpisodeLastN hook (#1053) (@jeremiahpslewis)
  • Fix abstract_learner for multiplayer games (#1054) (@jeremiahpslewis)
  • Update versions (#1055) (@jeremiahpslewis)
  • Update Docs for v0.11 release (#1056) (@jeremiahpslewis)
  • Update Katex version, fix vulnerability (#1058) (@jeremiahpslewis)

Closed issues:

  • RL Core tests fail sporadically (#1010)
  • Tutorial OpenSpiel KuhnOpenNSFP fails (#1024)
  • CI: Should spell check be dropped or fixed? (#1026)
  • Simple ReinforcementLearning example crashes (#1034)
  • Website: How do implement a new algorithm is outdated (#1037)
  • Review TabularApproximator (#1039)

ReinforcementLearningExperiments-v0.4.0

2 months ago

Merged pull requests:

  • Fix deprecations (#10) (@femtocleaner[bot])
  • implement epsilon-greedy policy with parametric type (#12) (@jbrea)
  • improve docs (#13) (@jbrea)
  • refactor policies (#15) (@jbrea)
  • Add ReinforcementLearningBase as dependent (#16) (@jbrea)
  • fix examples (#18) (@jbrea)
  • refactor existing components (#26) (@findmyway)
  • Prioritized dqn (#29) (@findmyway)
  • add double dqn (#30) (@findmyway)
  • add rainbow (#31) (@findmyway)
  • use new api in ReinforcementLearningEnvironments.jl (#33) (@findmyway)
  • bugfix and api simplification (#34) (@findmyway)
  • Switch Tracker.jl to Zygote.jl (#37) (@findmyway)
  • Support both Knet and Flux(with Zygote) (#38) (@findmyway)
  • add docs (#39) (@findmyway)
  • export AbstractActionSelector and add more comments (#42) (@findmyway)
  • Refactor buffer (#45) (@findmyway)
  • fix example in doc && update examples (#46) (@findmyway)
  • fix a performance bug in rainbow (#47) (@findmyway)
  • update dependencies (#48) (@findmyway)
  • update dependencies and docs (#49) (@findmyway)
  • update benchmark for circular_array_buffer (#50) (@findmyway)
  • Install TagBot as a GitHub Action (#53) (@JuliaTagBot)
  • clean up code (#54) (@findmyway)
  • add compat (#55) (@findmyway)
  • CompatHelper: add new compat entry for "Reexport" at version "0.2" (#56) (@github-actions[bot])
  • add documentation stage in travis (#57) (@findmyway)
  • Add doc in travis (#58) (@findmyway)
  • Fix link in docs/src/index.md (#60) (@amanbh)
  • Update doc (#61) (@findmyway)
  • Update README.md & website link (#70) (@findmyway)
  • Update dependency (#78) (@findmyway)
  • MassInstallAction: Install the CompatHelper workflow on this repository (#99) (@findmyway)
  • CompatHelper: bump compat for "Reexport" to "1.0" (#172) (@github-actions[bot])
  • update dependency (#177) (@findmyway)
  • Add Dockerfile (#187) (@findmyway)
  • Update readme (#188) (@findmyway)
  • docs: add findmyway as a contributor (#189) (@allcontributors[bot])
  • docs: add drozzy as a contributor (#195) (@allcontributors[bot])
  • docs: add rcnlee as a contributor (#199) (@allcontributors[bot])
  • docs: add norci as a contributor (#200) (@allcontributors[bot])
  • docs: add xiruizhao as a contributor (#203) (@allcontributors[bot])
  • docs: add metab0t as a contributor (#204) (@allcontributors[bot])
  • docs: add albheim as a contributor (#207) (@allcontributors[bot])
  • docs: add michelangelo21 as a contributor (#214) (@allcontributors[bot])
  • docs: add pilgrimygy as a contributor (#216) (@allcontributors[bot])
  • docs: add Mobius1D as a contributor (#218) (@allcontributors[bot])
  • docs: add ilancoulon as a contributor (#224) (@allcontributors[bot])
  • docs: add pilgrimygy as a contributor (#230) (@allcontributors[bot])
  • docs: add JinraeKim as a contributor (#243) (@allcontributors[bot])
  • Prepare v0.9 (#252) (@findmyway)
  • docs: add luigiannelli as a contributor (#254) (@allcontributors[bot])
  • docs: add JBoerma as a contributor (#255) (@allcontributors[bot])
  • CompatHelper: bump compat for "ReinforcementLearningEnvironments" to "0.5" (#260) (@github-actions[bot])
  • Fix inconsitencies in wrappers (#263) (@albheim)
  • setup CI for each subpackages (#264) (@findmyway)
  • Fix atari experiments (#265) (@Mobius1D)
  • Add timeperstep hook to qrdqn to fix test error (#266) (@albheim)
  • Update Flux version (#267) (@findmyway)
  • Setup docs generation pipeline (#269) (@findmyway)
  • Misc doc related fixes (#270) (@findmyway)
  • Update README.md (#271) (@findmyway)
  • docs: add JinraeKim as a contributor (#272) (@allcontributors[bot])
  • Improve docs GitHub action (#273) (@findmyway)
  • Fix docs pipeline (#275) (@findmyway)
  • update readme (#276) (@findmyway)
  • CompatHelper: add new compat entry for "UnicodePlots" at version "1.3" for package ReinforcementLearningCore (#277) (@github-actions[bot])
  • CompatHelper: bump compat for "Distributions" to "0.25" for package ReinforcementLearningCore (#278) (@github-actions[bot])
  • CompatHelper: bump compat for "Distributions" to "0.25" for package ReinforcementLearningZoo (#279) (@github-actions[bot])
  • docs: add plu70n as a contributor (#282) (@allcontributors[bot])
  • Fix bug in CI (#283) (@findmyway)
  • Use Weave.jl to generate RLExperiments (#284) (@findmyway)
  • QRDQN experiment reproducibility fix (#294) (@ashwani-rathee)
  • Add Manifest.toml (#295) (@findmyway)
  • docs: add ashwani-rathee as a contributor (#296) (@allcontributors[bot])
  • Add basic doc structure (#300) (@findmyway)
  • Update guide (#302) (@findmyway)
  • Update experiments (#303) (@findmyway)
  • fix figs (#304) (@findmyway)
  • Fix some simple experiments (#308) (@findmyway)
  • add plotting for cartpole and mountaincar with Plots.jl (#309) (@jamblejoe)
  • Remove GR in RLEnvs (#310) (@findmyway)
  • docs: add jamblejoe as a contributor (#311) (@allcontributors[bot])
  • Add compat of [email protected] in ReinforcementLearningExperiments (#312) (@findmyway)
  • Add example of SimplexSpace (#313) (@findmyway)
  • Improve tutorial (#314) (@findmyway)
  • Fix Atari related experiments (#315) (@findmyway)
  • CompatHelper: add new compat entry for "ImageTransformations" at version "0.8" for package ReinforcementLearningExperiments (#317) (@github-actions[bot])
  • CompatHelper: add new compat entry for "ArcadeLearningEnvironment" at version "0.2" for package ReinforcementLearningExperiments (#318) (@github-actions[bot])
  • CompatHelper: add new compat entry for "CUDA" at version "3" for package ReinforcementLearningExperiments (#319) (@github-actions[bot])
  • update tips (#321) (@findmyway)
  • CompatHelper: bump compat for "GPUArrays" to "7" for package ReinforcementLearningCore (#322) (@github-actions[bot])
  • docs: add albheim as a contributor for doc (#323) (@allcontributors[bot])
  • Fix broken test (#325) (@Mobius1D)
  • Add a warning in docstring of state (#327) (@findmyway)
  • Update doc string of PrioritizedDQNLearner (#329) (@findmyway)
  • Expand DDPG to multi action spaces (#330) (@Mobius1D)
  • CompatHelper: bump compat for "StructArrays" to "0.6" for package ReinforcementLearningZoo (#331) (@github-actions[bot])
  • fix 332 (#333) (@findmyway)
  • correct spelling in FAQ (#334) (@ultradian)
  • docs: add ultradian as a contributor for doc (#335) (@allcontributors[bot])
  • fix typo (#338) (@findmyway)
  • docs: add eltociear as a contributor for doc (#339) (@allcontributors[bot])
  • CompatHelper: bump compat for "FillArrays" to "0.12" for package ReinforcementLearningCore (#340) (@github-actions[bot])
  • Add copyto function (#345) (@pilgrimygy)
  • add Base.:(==) and Base.hash for AbstractEnv and test nash_conv on KuhnPokerEnv (#348) (@peterchen96)
  • Fix legal_actions_mask indexing error in CircularSLART (#350) (@findmyway)
  • bump version of RLCore (#351) (@findmyway)
  • bump version of RLBae (#352) (@findmyway)
  • add LICENSE in RLBase (#353) (@findmyway)
  • bump version of RLZoo (#355) (@findmyway)
  • docs: add 00krishna as a contributor for bug (#356) (@allcontributors[bot])
  • Add the tuning entropy component (#365) (@pilgrimygy)
  • Make general components (#370) (@pilgrimygy)
  • add weighted_softmax_explorer in the explorers.jl (#382) (@peterchen96)
  • Supplement functions in ReservoirTrajectory and BehaviorCloningPolicy (#390) (@peterchen96)
  • Update Flux version (#391) (@findmyway)
  • AddSequentialEnv (#394) (@findmyway)
  • Throw error in MultiAgentManager if it is applied to a SIMULTANEOUS env (#395) (@findmyway)
  • docs: add pkienscherf as a contributor for bug (#396) (@allcontributors[bot])
  • Implementation of NFSP and NFSP_KuhnPoker experiment (#402) (@peterchen96)
  • Updated RLDatasets.jl (#403) (@Mobius1D)
  • Gym d4rl extension (#405) (@Mobius1D)
  • updates as per GridWorlds v0.5.0 (#406) (@Sid-Bhatia-0)
  • Reduce allocations and update docstring for GaussianNetwork (#414) (@albheim)
  • Fix a bug (#415) (@pilgrimygy)
  • Expand to d4rl-pybullet (#416) (@Mobius1D)
  • docs: add pilgrimygy as a contributor for bug (#417) (@allcontributors[bot])
  • Fix 418 (#420) (@findmyway)
  • docs: add Krastanov as a contributor for doc (#422) (@allcontributors[bot])
  • Make SAC policy use internal rng (#423) (@albheim)
  • Add wrapped_env[!] to access env inside wrappers (#426) (@albheim)
  • add stock trading env (#428) (@findmyway)
  • Add Atari datasets released by Google Research (#429) (@Mobius1D)
  • add kwargs to plot(env::) (#430) (@jamblejoe)
  • Unify parameter names (#437) (@pilgrimygy)
  • docs: add albheim as a contributor for maintenance (#438) (@allcontributors[bot])
  • correct nfsp implementation (#439) (@peterchen96)
  • update the nfsp experiment's parameters (#440) (@peterchen96)
  • Tiny text typo (#441) (@Nthman)
  • docs: add LaarsOman as a contributor for doc (#442) (@allcontributors[bot])
  • Add pre-train step; VAE component; CRR and PLAS algorithms (#443) (@pilgrimygy)
  • add MADDPG algorithm (#444) (@peterchen96)
  • add_Graph_Shortest_Path (#445) (@findmyway)
  • try to fix bugs of ActionTransformedEnv (#447) (@peterchen96)
  • Update report (#448) (@pilgrimygy)
  • Summer ospp project 210370190 mid-term report (#449) (@peterchen96)
  • add maxdepth kwarg to remove print_tree() deprecation warning (#450) (@burmecia)
  • docs: add burmecia as a contributor for code (#451) (@allcontributors[bot])
  • RL unplugged implementation with tests (#452) (@Mobius1D)
  • Update report (#453) (@Mobius1D)
  • disable notebook generation (#454) (@johnnychen94)
  • Revert "Update report" (#456) (@findmyway)
  • Update report (#457) (@pilgrimygy)
  • fix installation docs (#458) (@Mobius1D)
  • docs: add peterchen96 as a contributor for code, doc (#459) (@allcontributors[bot])
  • Create LICENSE (#461) (@Mobius1D)
  • Add docs (#462) (@Mobius1D)
  • Fix make.jl (#463) (@Mobius1D)
  • Delete LICENSE (#465) (@Mobius1D)
  • fix CI (#466) (@findmyway)
  • Fix RLDatasets.jl documentation (#467) (@Mobius1D)
  • update report (#468) (@peterchen96)
  • Fix ci (#469) (@Mobius1D)
  • Update maddpg and the report (#470) (@peterchen96)
  • Report (#474) (@pilgrimygy)
  • Control whether run displays description of experiment (#477) (@ShuhuaGao)
  • docs: add ShuhuaGao as a contributor for code, question (#478) (@allcontributors[bot])
  • Chancestyle doc update (#479) (@ShuhuaGao)
  • FisherBRC algorithm and update docs (#480) (@pilgrimygy)
  • Add the experiment of MADDPG. (#481) (@peterchen96)
  • Add bsuite datasets (#482) (@Mobius1D)
  • update report and add reward bonus (#483) (@pilgrimygy)
  • Update manifest (#484) (@Mobius1D)
  • add GPU support for GaussianNetwork, fix #455 (#486) (@burmecia)
  • Update experiments of maddpg (#487) (@peterchen96)
  • Update prob function of QBasedPolicy. (#488) (@peterchen96)
  • Update report. (#489) (@peterchen96)
  • Fix gsutil for windows and fix docs (#491) (@Mobius1D)
  • add vmpo algorithm (#492) (@burmecia)
  • update vae (#494) (@pilgrimygy)
  • Add dm datasets (#495) (@Mobius1D)
  • Play OpenSpiel envs with NFSP and try to add ED algorithm. (#496) (@peterchen96)
  • fix bug (#497) (@pilgrimygy)
  • Update BEAR algorithm (#498) (@pilgrimygy)
  • More efficient float32 randn (#499) (@albheim)
  • Add support for deep ope in RLDatasets.jl (#500) (@Mobius1D)
  • update BCQ (#501) (@pilgrimygy)
  • update discrete BCQ (#502) (@pilgrimygy)
  • update offline RL experiment (#507) (@pilgrimygy)
  • Update ED algorithm and the report. (#508) (@peterchen96)
  • cpu and gpu test (#509) (@pilgrimygy)
  • Fix dispatch for is_discrete_space (#510) (@johannes-fischer)
  • docs: add johannes-fischer as a contributor for code (#511) (@allcontributors[bot])
  • update report (#512) (@pilgrimygy)
  • update report (#513) (@peterchen96)
  • Fix random net init in sac example (#514) (@albheim)
  • WIP to implement FQE (#515) (@Mobius1D)
  • OSPP Report for RLDatasets.jl (#516) (@Mobius1D)
  • Update report (#518) (@Mobius1D)
  • Update reward wrappers to be more consistent (#519) (@albheim)
  • fixed findmax unconsistency (#521) (@3rdCore)
  • docs: add 3rdCore as a contributor for bug, code (#522) (@allcontributors[bot])
  • close #493 (#523) (@findmyway)
  • Update compat & version (#524) (@findmyway)
  • Fix rlexps (#525) (@findmyway)
  • Bump rlenvs (#526) (@findmyway)
  • close #527 (#528) (@bhatiaabhinav)
  • docs: add bhatiaabhinav as a contributor for bug, code (#529) (@allcontributors[bot])
  • Refine the doc and make minor changes of TabularApproximator (#532) (@ShuhuaGao)
  • Fix bug in MaskedPPOTrajectory (#533) (@findmyway)
  • bugfix with ZeroTo (#534) (@findmyway)
  • Revert unexpected change in PPO (#535) (@findmyway)
  • Fix 530 (#536) (@findmyway)
  • Improves plotting for classical control experiments (#537) (@harwiltz)
  • Fix rldatasets (#538) (@findmyway)
  • docs: add harwiltz as a contributor for code, doc (#539) (@allcontributors[bot])
  • Bump version (#540) (@findmyway)
  • fix RLIntro#63 (#541) (@findmyway)
  • fix RLIntro#64 (#542) (@findmyway)
  • Added a continuous option for CartPoleEnv (#543) (@dylan-asmar)
  • docs: add dylan-asmar as a contributor for code (#544) (@allcontributors[bot])
  • Bump version (#545) (@findmyway)
  • Fix bug in cart pole float32 (#547) (@findmyway)
  • Update links to RLIntro (#548) (@findmyway)
  • Make experiments GPU compatible (#549) (@findmyway)
  • Add compat (#550) (@findmyway)
  • Bugfix with cart pole env (#552) (@findmyway)
  • make bc gpu compatable (#553) (@findmyway)
  • docs: add andreyzhitnikov as a contributor for bug (#554) (@allcontributors[bot])
  • Small typo (#555) (@kir0ul)
  • docs: add kir0ul as a contributor for doc (#556) (@allcontributors[bot])
  • Fix/rand dummy action (#559) (@mo8it)
  • Fix warning about kwargs.data (#560) (@mo8it)
  • docs: add Mo8it as a contributor for code (#561) (@allcontributors[bot])
  • Fix dummy action for continuous action spaces (#562) (@mo8it)
  • Fix/rand interval (#563) (@mo8it)
  • Remove unneeded method (#564) (@mo8it)
  • Fix typo in ospp_final_term_report_210370741/index.md (#565) (@eltociear)
  • Fix 566 (#567) (@findmyway)
  • Fix documentation for environments (#570) (@blegat)
  • docs: add blegat as a contributor for doc (#571) (@allcontributors[bot])
  • fix #569 (#573) (@findmyway)
  • CompatHelper: bump compat for "ArrayInterface" to "4" for package ReinforcementLearningCore (#574) (@github-actions[bot])
  • bump version of RLCore and RLZoo (#576) (@findmyway)
  • Update EpsilonGreedyExplorer example (#577) (@kir0ul)
  • CompatHelper: bump compat for "FillArrays" to "0.13" for package ReinforcementLearningCore (#583) (@github-actions[bot])
  • Default qnetwork initializer (#586) (@HenriDeh)
  • docs: add HenriDeh as a contributor for code, doc (#587) (@allcontributors[bot])
  • using act_limit parameter in target_actor (#588) (@NPLawrence)
  • docs: add NPLawrence as a contributor for code (#589) (@allcontributors[bot])
  • CompatHelper: bump compat for "ArrayInterface" to "5" for package ReinforcementLearningCore (#590) (@github-actions[bot])
  • Fix documentation typo (#591) (@kir0ul)
  • Fixing and generalizing GaussianNetwork (#592) (@HenriDeh)
  • Fix typos in docs (#593) (@bileamScheuvens)
  • docs: add bileamScheuvens as a contributor for doc (#594) (@allcontributors[bot])
  • Add CovGaussianNetwork to work with covariance (#597) (@HenriDeh)
  • Fixing Gaussian Network gradient (#598) (@HenriDeh)
  • Update Supporting (#599) (@findmyway)
  • docs: add harwiltz as a contributor for bug (#601) (@allcontributors[bot])
  • Rewrite initialization of StackFrames (#602) (@findmyway)
  • fix test logdetLorU with Float64 (#603) (@HenriDeh)
  • WIP: Add MPO in zoo (#604) (@HenriDeh)
  • fix #605 (#606) (@findmyway)
  • docs: add jarbus as a contributor for bug (#607) (@allcontributors[bot])
  • Add a reward normalizer (#609) (@HenriDeh)
  • Episode reset condition (#621) (@HenriDeh)
  • cspell add Optimise (#622) (@HenriDeh)
  • Add a categorical Network (#625) (@HenriDeh)
  • write doc (#627) (@HenriDeh)
  • fix #624 (#628) (@findmyway)
  • docs: add tyleringebrand as a contributor for bug (#629) (@allcontributors[bot])
  • Update How_to_implement_a_new_algorithm.md (#630) (@HenriDeh)
  • add a new notebook (#631) (@findmyway)
  • Use Trajectories.jl instead (#632) (@findmyway)
  • created fallback implementation for legal_action_space_mask (#644) (@baedan)
  • update node version (#645) (@findmyway)
  • docs: add baedan as a contributor for code (#646) (@allcontributors[bot])
  • Tag the latest code as v0.10.1 (#647) (@findmyway)
  • added basic doc for TDLearner (#649) (@baedan)
  • Add JuliaRL_DQN_CartPole (#650) (@findmyway)
  • enable OpenSpiel (#691) (@findmyway)
  • Small improvements for TicTacToeEnv (#692) (@jonathan-laurent)
  • Update the "how to implement a new algorithm" (#695) (@HenriDeh)
  • Fix typo in algorithm implementation docs (#697) (@mplemay)
  • add PrioritizedDQN (#698) (@findmyway)
  • add QRDQN (#699) (@findmyway)
  • add REMDQN (#708) (@findmyway)
  • add IQN (#710) (@findmyway)
  • checkin Mainifest.toml (#711) (@findmyway)
  • CompatHelper: bump compat for "ReinforcementLearningCore" to "0.8" (#712) (@github-actions[bot])
  • CompatHelper: bump compat for "ReinforcementLearningEnvironments" to "0.6" (#713) (@github-actions[bot])
  • CompatHelper: bump compat for "ReinforcementLearningZoo" to "0.5" (#714) (@github-actions[bot])
  • CompatHelper: bump compat for "AbstractTrees" to "0.4" for package ReinforcementLearningBase (#715) (@github-actions[bot])
  • CompatHelper: bump compat for "Functors" to "0.3" for package ReinforcementLearningCore (#717) (@github-actions[bot])
  • CompatHelper: bump compat for "UnicodePlots" to "3" for package ReinforcementLearningCore (#718) (@github-actions[bot])
  • CompatHelper: bump compat for "ReinforcementLearningCore" to "0.8" for package ReinforcementLearningZoo (#720) (@github-actions[bot])
  • CompatHelper: bump compat for "Functors" to "0.3" for package ReinforcementLearningZoo (#721) (@github-actions[bot])
  • CompatHelper: add new compat entry for "StableRNGs" at version "1" for package ReinforcementLearningExperiments (#722) (@github-actions[bot])
  • CompatHelper: bump compat for "ReinforcementLearning" to "0.10" for package ReinforcementLearningExperiments (#723) (@github-actions[bot])
  • add rainbow (#724) (@findmyway)
  • Adapted SAC to support MultiThreadedEnv (#726) (@BigFood2307)
  • Add the number of episodes (#727) (@ll7)
  • docs: add ll7 as a contributor for doc (#728) (@allcontributors[bot])
  • Add struct view (#732) (@findmyway)
  • add VPG (#733) (@findmyway)
  • CompatHelper: add new compat entry for "Distributions" at version "0.25" for package ReinforcementLearningZoo (#734) (@github-actions[bot])
  • CompatHelper: add new compat entry for "Distributions" at version "0.25" for package ReinforcementLearningExperiments (#735) (@github-actions[bot])
  • fixed hyperlink in readme (#742) (@mplemay)
  • docs: add mplemay as a contributor for doc (#743) (@allcontributors[bot])
  • Create FUNDING.yml (#746) (@findmyway)
  • TRPO (#747) (@baedan)
  • CompatHelper: bump compat for "CommonRLSpaces" to "0.2" for package ReinforcementLearningBase (#748) (@github-actions[bot])
  • Fix parameter names for AsyncTrajectoryStyle (#749) (@ludvigk)
  • Update DoEveryNEpisode hook to new api (#750) (@ludvigk)
  • docs: add ludvigk as a contributor for code (#751) (@allcontributors[bot])
  • Update TwinNetwork (#752) (@ludvigk)
  • Typo in hooks docs (#754) (@kir0ul)
  • CommonRLSpace -> DomainSets (#756) (@findmyway)
  • Fix typo (#767) (@jeremiahpslewis)
  • Fix typo (#768) (@jeremiahpslewis)
  • Fix TD Learner so that it handles MultiAgent/Simultaneous with NoOp (#769) (@jeremiahpslewis)
  • Bump RLBase compat to 0.11 (#771) (@HenriDeh)
  • Remove manifest from the repo (#773) (@HenriDeh)
  • import params and gradient (#774) (@HenriDeh)
  • fix compat (#775) (@HenriDeh)
  • Trying to reimplement experiments (#776) (@HenriDeh)
  • Add a developer mode (#777) (@HenriDeh)
  • added pettingzoo and one single agent example (#782) (@Mytolo)
  • Update mpo.jl (#783) (@HenriDeh)
  • Reduce unnecessary array allocations (#785) (@jeremiahpslewis)
  • Temporarily disable failing experiment so project tests pass (#787) (@jeremiahpslewis)
  • Fix spellcheck errors (#788) (@jeremiahpslewis)
  • Bug fixes and dependency bump (#789) (@jeremiahpslewis)
  • Reactivate some tests for RLExperiments (#790) (@jeremiahpslewis)
  • Pin ReinforcementLearning.jl to pre-refactor versions (#793) (@jeremiahpslewis)
  • Update TagBot.yml (#794) (@jeremiahpslewis)
  • Drop RL.jl as dependency from Experiments (#795) (@jeremiahpslewis)
  • Fix compat for RLBase (#796) (@jeremiahpslewis)
  • Fix RLCore version, prep for bump (#797) (@jeremiahpslewis)
  • Add reexport compat (#798) (@jeremiahpslewis)
  • Bump compat helper (#799) (@jeremiahpslewis)
  • Fix IntervalSets compat for RLEnvironments (#800) (@jeremiahpslewis)
  • Bump RLZoo.jl version for release (#815) (@jeremiahpslewis)
  • Fix RLExperiments compat (#816) (@jeremiahpslewis)
  • Expand RLZoo compat (#817) (@jeremiahpslewis)
  • Bump RLExperiments, require 0.11 (#818) (@jeremiahpslewis)
  • Pin ReinforcementLearningZoo.jl to 0.6 in RLExperiments (#819) (@jeremiahpslewis)
  • Drop RL.jl from CompatHelper (until refactor complete) (#824) (@jeremiahpslewis)
  • Bump Github Actions cache version (#825) (@jeremiahpslewis)
  • Basic allocation fixes for RandomWalk / RandomPolicy (#827) (@jeremiahpslewis)
  • Bump CI.yml GitHub action versions (#828) (@jeremiahpslewis)
  • Add tests, improve performance of RewardsPerEpisode (#829) (@jeremiahpslewis)
  • Refactor and add tests to TotalBatchRewardPerEpisode (#830) (@jeremiahpslewis)
  • Tests, refactor for TimePerStep (#831) (@jeremiahpslewis)
  • DoEveryNStep tests, performance tweaks (#832) (@jeremiahpslewis)
  • Add DoOnExit test (#833) (@jeremiahpslewis)
  • Expand PR Template (#835) (@jeremiahpslewis)
  • Fix branch name (master -> main) (#837) (@jeremiahpslewis)
  • Add test_noop! to remaining hooks (#840) (@jeremiahpslewis)
  • Make TimePerStep test robust (#841) (@jeremiahpslewis)
  • Reactivate docs (#842) (@jeremiahpslewis)
  • Add activate_devmode!() explanation to tips.md (#845) (@jeremiahpslewis)
  • Bump compat of RL.jl to 0.11.0-dev (#846) (@jeremiahpslewis)
  • add kwargs to agent (#847) (@HenriDeh)
  • Gaussian network refactor and tests (#849) (@HenriDeh)
  • Agent Refactor (#850) (@jeremiahpslewis)
  • Bump RLCore (#851) (@jeremiahpslewis)
  • Include codecov in CI (#854) (@HenriDeh)
  • Fix a typo in MPO (#855) (@HenriDeh)
  • DoEvery should not trigger on t = 1 (#856) (@HenriDeh)
  • update CI Julia version (#857) (@jeremiahpslewis)
  • Tweak CI to check on dep changes (#858) (@HenriDeh)
  • CompatHelper: bump compat for FillArrays to 1 for package ReinforcementLearningCore, (keep existing compat) (#859) (@github-actions[bot])
  • MultiAgent Proposal (#861) (@jeremiahpslewis)
  • CompatHelper: add new compat entry for ReinforcementLearningCore at version 0.9 for package ReinforcementLearningEnvironments, (keep existing compat) (#865) (@github-actions[bot])
  • Multiplayer Fixes (Clean up errors) (#867) (@jeremiahpslewis)
  • Added a section to the home page about getting help for Reinforcement… (#868) (@LooseTerrifyingSpaceMonkey)
  • Bump StatsBase compat (#873) (@jeremiahpslewis)
  • ComposedHooks, MultiHook fixes (#874) (@jeremiahpslewis)
  • Fix RLEnvs compat (#875) (@jeremiahpslewis)
  • Add back ComposedStop (#876) (@jeremiahpslewis)
  • Bump RLBase to v0.11.1 (#877) (@jeremiahpslewis)
  • Further refinements (#879) (@jeremiahpslewis)
  • Use multiple dispatch / methods plan! and act! (#880) (@jeremiahpslewis)
  • RLCore.update! -> Base.push! API change (#884) (@jeremiahpslewis)
  • Add compat for CommonRLInterface (#886) (@jeremiahpslewis)
  • Fix hook issues (#887) (@jeremiahpslewis)
  • CompatHelper: bump compat for ReinforcementLearningZoo to 0.6 for package ReinforcementLearningExperiments, (keep existing compat) (#888) (@github-actions[bot])
  • Stacknamespace (#889) (@HenriDeh)
  • allow more recent versions (#890) (@HenriDeh)
  • Fix stack (#891) (@HenriDeh)
  • CompatHelper: add new compat entry for DelimitedFiles at version 1 for package ReinforcementLearningEnvironments, (keep existing compat) (#894) (@github-actions[bot])
  • Update implement new alg docs (#896) (@jeremiahpslewis)
  • NFQ (#897) (@CasBex)
  • fixed problem with sequential multi agent envs (#898) (@Mytolo)
  • Sketch out optimise! refactor (#899) (@jeremiahpslewis)
  • Bug fix optimise! (#902) (@jeremiahpslewis)
  • Breaking changes to optimise! interface: Bump RLCore to v0.11 and RLZoo to v0.8 (#903) (@jeremiahpslewis)
  • Swap out rng code (#905) (@jeremiahpslewis)
  • CompatHelper: bump compat for NNlib to 0.9 for package ReinforcementLearningZoo, (keep existing compat) (#906) (@github-actions[bot])
  • Fix dispatch and update documentation (#907) (@HenriDeh)
  • QBasedPolicy optimise! forwards to learner. (#909) (@HenriDeh)
  • Bump RLZoo version for NNlib (#911) (@jeremiahpslewis)
  • Add performance testing run loop (#914) (@jeremiahpslewis)
  • Fix Timer bug (#915) (@jeremiahpslewis)
  • couple of improvements to MPO (#919) (@HenriDeh)
  • Rework the run loop (#921) (@HenriDeh)
  • adjusted pettingzoo to PettingZooEnv simultaneous environment more co… (#925) (@Mytolo)
  • fixed devmode / project files (#932) (@Mytolo)
  • fixed DQNLearner Gpu isse (#933) (@Mytolo)
  • fixing prob. /w symbol/ string correspondence (#934) (@Mytolo)
  • Bump flux compat (#935) (@jeremiahpslewis)
  • Reduce find_all_max allocations and increase speed based on chatgpt s… (#938) (@jeremiahpslewis)
  • Add Buildkite / GPU tests (#942) (@jeremiahpslewis)
  • Add RLZoo and RLExperiments to Buildkite (#943) (@jeremiahpslewis)
  • Drop deprecated @provide interface (#944) (@jeremiahpslewis)
  • CI Improvements (#946) (@jeremiahpslewis)
  • Github Actions Fixes (#947) (@jeremiahpslewis)
  • gpu updates RLExperiments, RLZoo (#949) (@jeremiahpslewis)
  • Bump RLCore version (#950) (@jeremiahpslewis)
  • Refactor TRPO and VPG with EpisodesSampler (#952) (@HenriDeh)
  • Fix TotalRewardPerEpisode bug (#953) (@jeremiahpslewis)
  • update docs to loop refactor (#955) (@HenriDeh)
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningEnvironments, (keep existing compat) (#956) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#957) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#958) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningCore, (keep existing compat) (#962) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningZoo, (keep existing compat) (#963) (@github-actions[bot])
  • CompatHelper: add new compat entry for cuDNN at version 1 for package ReinforcementLearningExperiments, (keep existing compat) (#964) (@github-actions[bot])
  • TargetNetwork (#966) (@HenriDeh)
  • CompatHelper: bump compat for GPUArrays to 9 for package ReinforcementLearningCore, (keep existing compat) (#969) (@github-actions[bot])
  • Prioritised DQN GPU (#974) (@CasBex)
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningZoo, (keep existing compat) (#975) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningZoo to 0.8 for package ReinforcementLearningExperiments, (keep existing compat) (#976) (@github-actions[bot])
  • CompatHelper: bump compat for ReinforcementLearningCore to 0.13 for package ReinforcementLearningExperiments, (keep existing compat) (#977) (@github-actions[bot])
  • Nfq refactor (#980) (@CasBex)
  • Fix and refactor SAC (#985) (@HenriDeh)
  • CompatHelper: bump compat for DomainSets to 0.7 for package ReinforcementLearningBase, (keep existing compat) (#986) (@github-actions[bot])
  • remove rlenv dep for tests (#989) (@HenriDeh)
  • CompatHelper: add new compat entry for CUDA at version 5 for package ReinforcementLearningExperiments, (keep existing compat) (#991) (@github-actions[bot])
  • CompatHelper: add new compat entry for IntervalSets at version 0.7 for package ReinforcementLearningExperiments, (keep existing compat) (#994) (@github-actions[bot])
  • Conservative Q-Learning (#995) (@HenriDeh)
  • CompatHelper: add new compat entry for Parsers at version 2 for package ReinforcementLearningCore, (keep existing compat) (#997) (@github-actions[bot])
  • CompatHelper: add new compat entry for MLUtils at version 0.4 for package ReinforcementLearningZoo, (keep existing compat) (#998) (@github-actions[bot])
  • CompatHelper: add new compat entry for Statistics at version 1 for package ReinforcementLearningCore, (keep existing compat) (#999) (@github-actions[bot])
  • Update CQL_SAC.jl (#1003) (@HenriDeh)
  • Bump tj-actions/changed-files from 35 to 41 in /.github/workflows (#1006) (@dependabot[bot])
  • Make it compatible with Adapt 4 and Metal 1 (#1008) (@joelreymont)
  • Bump RLCore, RLEnv (#1012) (@jeremiahpslewis)
  • Fix PPO per #1007 (#1013) (@jeremiahpslewis)
  • Fix RLCore version (#1018) (@jeremiahpslewis)
  • Add Devcontainer, handle DomainSets 0.7 (#1019) (@jeremiahpslewis)
  • Initial GPUArray transition (#1020) (@jeremiahpslewis)
  • Update TagBot.yml for subprojects (#1021) (@jeremiahpslewis)
  • Fix offline agent test (#1025) (@joelreymont)
  • Fix spell check CI errors (#1027) (@joelreymont)
  • GPU Code Migration Part 2.1 (#1029) (@jeremiahpslewis)
  • Bump RLZoo to v0.8 (#1031) (@jeremiahpslewis)
  • Fix RLZoo version (#1032) (@jeremiahpslewis)

Closed issues:

  • A3C (#133)
  • Box2D environment (#2)
  • Visualize episodes (#3)
  • Implement TRPO/ACER (#134)
  • ERROR: UndefVarError: AtariPreprocessor not defined (#6)
  • bullet3 environment (#7)
  • vizdoom environment (#8)
  • Error tagging new release (#11)
  • loadenvironment error (#19)
  • Support alternative deep learning libraries (#20)
  • Random Thoughts on v0.3.0 (#24)
  • ViZDoom is broken (#130)
  • Document basic environments (#129)
  • bullet3 environment (#128)
  • Box2D environment (#127)
  • Prioritized DQN (#27)
  • Improve interfaces for model exploration and hyperparameter optimization (#28)
  • A2C (#32)
  • Add built-in support for TensorBoard (#35)
  • Add checkpoints (#36)
  • Improve code coverage (#40)
  • AbstractActionSelector not exported (#41)
  • Params empty - no tracking (#43)
  • Add reproducible examples for Atari environments (#44)
  • Add procgen (#126)
  • StopAfterEpisode with progress meter (#51)
  • Add MAgent (#125)
  • Regret Policy Gradients (#131)
  • Support SEED RL (SCALABLE AND EFFICIENT DEEP-RL) (#62)
  • Support julia 1.4 (#63)
  • How to define a new environment? (#64)
  • Roadmap of v0.9 (#65)
  • Add MCTS related algorithms (#132)
  • Question about AbstractEnv API (#68)
  • bsuite (#124)
  • Classic environments in separate package? (#123)
  • Implement Fully Parameterized Quantile Function for Distributional Reinforcement Learning. (#135)
  • Failed to precompile ReinforcementLearning (#71)
  • Experimental support of Torch.jl (#136)
  • depends on HDF5? (#72)
  • warning and error (#73)
  • Compatibility issue in ReinforcementLearning & Flux (#74)
  • ERROR: KeyError: key "ArnoldiMethod" not found (#79)
  • Add dueling DQN (#137)
  • In DDPG: Add support for vector actions (#138)
  • Rename AbstractAgent to AbstractPolicy (#111)
  • How should ReinforcementLearning.jl be cited? (#80)
  • Add a stop condition to terminate the experiment after reaching a reward threshold (#112)
  • Add Game 2048 (#122)
  • add CUDA accelerated Env (#121)
  • Unify common network architectures and patterns (#139)
  • Alternative handling of max steps in environment (#140)
  • I get NNlib error when trying to load a model (#82)
  • "convert" warning (#83)
  • Seg fault on macbook pro (#84)
  • ACME RL lib by deepmind (#85)
  • Add Highway env (#120)
  • Definition of a policy (#86)
  • Add remote trajectories (#87)
  • Add experiments based on offline RL data (#141)
  • Asynchronous Methods for Deep Reinforcement Learning (#142)
  • Base.convert method for DiscreteSpace (#104)
  • Action Space Meaning (#88)
  • Base.in method for EmptySpace (#105)
  • Renaming get_terminal to isterminated (#106)
  • Requesting more informative field names for SharedTrajectory (#113)
  • R2D2 (#143)
  • Suggestion: More informative name for FullActionSet & MinimalActionSet (#107)
  • Returning an AbstractSpace object using get_actions (#108)
  • Recurrent Models (#144)
  • Split experiments into separate files (#145)
  • Add project.toml for tests (#146)
  • Migrate to Pluto (#148)
  • Update dependency to the latest version of ReinforcementLearning.jl (#149)
  • Docs build error (#91)
  • Split out Trajectory & CircularArrayBuffer as independent packages (#114)
  • Requesting explanation for better performance at ... (#115)
  • Add an extra mode when evaluating agent (#116)
  • Why are wrapper environments a part of RLBase instead of RLCore (say)? (#109)
  • Purpose and scope of sub-packages (#93)
  • The names of keyword arguments in Trajectory are somewhat misleading (#117)
  • Add experiments with GymEnv (#147)
  • Cross language support (#103)
  • Check compatibility between agent and environments (#118)
  • Behaviour for hooks for RewardOverridenEnv (#119)
  • StopAfterEpisode with custom DQNL errors beyond a particular Episode Count (#96)
  • ERROR: UndefVarError: NNlib not defined while loading agent (#110)
  • Support compression? (#102)
  • State monitoring and fault tolerance (#101)
  • Use JLSO for (de)serialization? (#97)
  • Add an example running in K8S (#100)
  • Setup github actions (#98)
  • Fails to load trajectory (#150)
  • Replace Travis with github actions (#151)
  • Test error in ReinforcementLearningEnvironments.jl (#152)
  • Move preallocations in MultiThreadEnv from global to local (#153)
  • Flux as service (#154)
  • remove @views (#155)
  • error in save & load ElasticCompactSARTSATrajectory (#156)
  • add early stopping in src/core/stop_conditions.jl (#157)
  • add time stamp in load & save function, in file src/components/agents/agent.jl (#158)
  • Policies on GPU cannot be saved or loaded (#159)
  • AbstractStage docstring doesn't render correctly in docs. (#160)
  • List of contributors (#161)
  • code formatting (#165)
  • Purpose of CommonRLInterface (#166)
  • Moving example environments from RLBase to RLEnvs? (#167)
  • Keeping prefix get_ in method names like get_reward (#168)
  • Currently getting an ambiguous method error in ReinforcementLearningCore v0.5.1 (#171)
  • Return experiment instead of hook only (#173)
  • TD3 Implementation (#174)
  • Train policy with GymEnv (#175)
  • Travis CI Credits (#178)
  • Training mode and testing mode (#179)
  • Unrecognized symbols (#180)
  • AbstractEnv (#181)
  • SARTTrajectory for SAC (#182)
  • define environment of FULL_ACTION_SET (#184)
  • CircularArraySLARTTrajectory instance is not of type CircularArraySLARTTrajectory (#185)
  • TagBot trigger issue (#186)
  • Is hook the same thing as "callback"? (#190)
  • Use @threads instead of @sync + @spawn in MultiThreadEnv? (#191)
  • Blog custom env link typo (#192)
  • Change clip_by_global_norm! into a Optimizer (#193)
  • PPO related algorithms are broken (#194)
  • Add card game environments (#196)
  • Separate envs from algos in Zoo? (#197)
  • Why "examples"? (#198)
  • WandB integration? (#201)
  • Add default implementations for AbstractEnvWrapper (#202)
  • Add configuration in DQNLearner to enable double-dqn by default (#205)
  • Derivative-Free Reinforcement Learning (#206)
  • ERROR: type RandomPolicy has no field policy (#208)
  • Why split repos? (#209)
  • "Getting Started" too long imo (#210)
  • PreActStage clarification (#212)
  • What's a "trace"? (#213)
  • Continuous time supported? (#215)
  • Docs looks ugly in dark mode (#217)
  • Julia 1.6.0 dependency problem with ReinforcementLearningBase/RLBase (#221)
  • Add Discrete Batch-Constrained Deep Q-learning (#226)
  • Docstring of DoEveryNStep (#225)
  • Documentation of environment: actions seem not to work (#222)
  • Documentation of "How to use Tensorboard?": with_logger not defined (#223)
  • Getting figure object; how to get an animation using GR.plot in CartPoleEnv (#246)
  • Update dependency to [email protected] and resolve type piracy of findmax (#227)
  • IQN is broken with [email protected] (#228)
  • The components of Rainbow (#229)
  • Source links in documentation directs to julia repo (#231)
  • Support Tables.jl and PrettyTables.jl for Trajectories (#232)
  • code in get_started seems to be broken (#233)
  • PPO strange behaviour from having actions as one-element arrays instead of scalars (#234)
  • SAC and GaussianNetwork (#236)
  • Precompilation prohibitively long (#237)
  • Document how to save/load parameters (#238)
  • Workflow of saving trajectory data (#239)
  • An explanation of "how to train policy (agent)" such as Basic_DQN would be valuable (#240)
  • How to guarantee the environment's reproducibility? (#241)
  • [Call for Contributors] Summer 2021 of Open Source Promotion Plan (#242)
  • Cannot use RLBase.action_space etc. when writing my own environment (#244)
  • ReinforcementLearningZoo.jl experiments (#245)
  • Next Release Plan (v0.9) (#247)
  • How about making this package compatible with DifferentialEquations.jl? (#249)
  • Reinforcement Learning and Combinatorial Optimization (#250)
  • PPO and multi-dimensional action spaces (#251)
  • Add ReinforcementLearningDatasets (#253)
  • Incompatibility with CSVFiles.jl (#256)
  • [RLEnvs] easy access of the length of an action vector (dimension of action space) (#257)
  • Cannot add LinearAlgebra (#258)
  • What's the checkpoints? (#261)
  • Model based reinforcement learning (#262)
  • Add a dedicated multi-dimensional space type (#268)
  • PPO is broken when using CUDA (#280)
  • Lack of reproducibility of QRDQN CartPole Experiment. (#281)
  • Reinforcement Learning.jl in a RTS (#291)
  • StopAfterNoImprovement hook test fails occasionally (#297)
  • Get error when using ReinforcementLearning (#298)
  • Problems with PGFPlotsX during the install (#301)
  • Plotting CartPole environment in Jupyter (#306)
  • Support CircularVectorSARTTrajectory RLZoo (#316)
  • Local development environment setup tips causing error (#320)
  • Rename some functions to help beginners navigate source code (#326)
  • Question about PER (#328)
  • Docs error in code output (#332)
  • Setup a CI for typo (#336)
  • double code & dysfunctional master branch when downloading package (#341)
  • Support multiple discrete action space (#347)
  • Precompilation error; using Plots makes a conflict (#349)
  • Problem with running initial tutorial. Using TabularPolicy() generates an UndefinedKeyword error for n_action (#354)
  • Question: Clarification on the RL plots generated by the run() function (#357)
  • prob question for QBasedPolicy (#360)
  • Can evaluate function be used as a component of RLcore? (#369)
  • problem about precompiling the forked package (#377)
  • Question: Can we use packages like DifferentialEquations.jl to evolve or model the environment in ReinforcementLearning.jl (#378)
  • Combine transformers and RL (#392)
  • MultiAgentManager does not select correct action space for RockPaperScissorsEnv (#393)
  • Add ReinforcementLearningDatasets.jl (#397)
  • error: dimension mismatch "cannot broadcast array to have fewer dimensions" (#400)
  • SAC policy problems? (#410)
  • Add pre-training hook (#411)
  • Dead links in documentation (#418)
  • Links of show nbview badges in RLExperiments are incorrect (#421)
  • Problem accessing public google cloud storage bucket for RLDatasets.jl (#424)
  • Function to access base env through multiple wrapper layers (#425)
  • The problem of using GaussianNetwork in gpu (#455)
  • Next Release Plan (v0.10) (#460)
  • Error in experiment "JuliaRL_DDPG_Pendulum" (#471)
  • In Windows, ReinforcementLearningDataset.jl encounters a bug (#485)
  • Conditional Testing (#493)
  • Inconsistency of the EpsilonGreedyExplorer selection function (#520)
  • PyCall.getindex in module ReinforcementLearningEnvironments conflict warning (#527)
  • device method definition overwritten (#530)
  • How to display/render AtariEnv? (#546)
  • StackFrames bug? (#551)
  • Refactor of DQN Algorithms (#557)
  • Small performance improvement (#558)
  • Infinite-recursion bug in function is_discrete_space when an object of type ClosedInterval is passed (#566)
  • JuliaRL_BasicDQN_CartPole example fails (#568)
  • action_space not defined in tutorial (#569)
  • CI fails with [email protected] (#572)
  • Warning while precompiling RLCore due to kwargs (#575)
  • Gain in VPGPolicy does not account for terminal states? (#578)
  • Strange Bug with examples CartPoleEnv and RLBase.test_runnable!(RandomWalk1D) (#579)
  • Missing docs for TDLearner (#580)
  • Difficulty Creating a Custom Environment (#581)
  • Missing docs for how to implement a new algorithm (#582)
  • Donation (#595)
  • MultiThreadEnv with custom (continuous) action spaces fails (#596)
  • PPOCartPole fails, source of error included (#605)
  • Question: Can ReinforcementLearning.jl handle Partially Observable Markov Decision Processes (POMDPs)? (#608)
  • Add an environment wrapper to IsaacGym (#619)
  • Explain current implementation of PPO in detail (#620)
  • How to run this source code in vscode? (#623)
  • Bug: Issue with TD3 for multi-dimensional action spaces (#624)
  • Make documentation on trace normalization (#633)
  • ActionTransformedEnv doesn't transform legal_action_space_mask (#642)
  • Bug: Previous example from RLZoo now has a bug (#643)
  • TDLearner time step parameter (#648)
  • Examples of multidimensional continuous actions (#676)
  • estimate v.s. basis in policies (#677)
  • Base.copy not implemented for the TicTacToe environment (#678)
  • Broken link to src (#693)
  • Support Brax (#696)
  • Q-learning update timing (#702)
  • PPO on environments with multiple action dimensions? (#703)
  • Can't checkout RLCore for development (#704)
  • various eligibility trace-equipped TD methods (#709)
  • Improve the logging mechanism during training (#725)
  • questions while looking at implementation of VPG (#729)
  • Setup sponsor related info (#730)
  • new _run() (#731)
  • SAC example experiment does not work (#736)
  • Custom environment action and state space explanation (#738)
  • PPOPolicy training: ERROR: DomainError with NaN: Normal: the condition σ >= zero(σ) is not satisfied. (#739)
  • Code Readability (#740)
  • MultiThreadEnv not available in ReinforcementLearningZoo (#741)
  • ReinforcementLearningExperiment dependencies fail to precompile (#744)
  • tanh normalization destabilizes learning with GaussianNetwork (#745)
  • how to load a saved model and test it? (#755)
  • Custom environment passes RLBase.test_runnable!(env) but hangs indefinitely and crashes when run (#757)
  • Bounds Error at UPDATE_FREQ Step (#758)
  • StopAfterEpisode returns 1 more episode using StepsPerEpisode() hook (#759)
  • Move basic definition of environment wrapper into RLBase (#760)
  • Precompilation error - DomainSets not in dependencies (#761)
  • How to set RLBase.state_space() if the number of state space is uncertain (#762)
  • Collect both number of steps and rewards in a single hook (#763)
  • how to use MultiAgentManager on different algorithms? (#764)
  • Example run of Offline RL that totally depends on dataset without online environment (#765)
  • Every single environment / experiment crashes with following error: (#766)
  • Neural Network Approximator based policies not working (#770)
  • Deep RL example for LSTM (#772)
  • "params not defined," "JuliaRL_BasicDQN_CartPole" (#778)
  • MonteCarloLearner incorrect PreActStage behavior (#779)
  • Prioritised Experience Replay (#780)
  • Outdated dependencies (#781)
  • Running experiments throws a "UndefVarError: params not defined" message (#784)
  • Failing MPOCovariance experiment (#791)
  • Logo image was not found (#836)
  • Reactivate docs on CI/CD (#838)
  • update docs: Tips for developers (#844)
  • Package dependencies not compatible (#860)
  • need help from an expert (#862)
  • Installing ReinforcementLearning.jl downgrades Flux.jl (#864)
  • Fix Experiment type setup (#881)
  • LoadError: UndefVarError: params not defined (#882)
  • Rename update! to push! (#883)
  • Contribute Neural Fitted Q-iteration algorithm (#895)
  • PPO policy experiments failing (#910)
  • Executing RLBase.plan! after end of experiment (#913)
  • EpisodeSampler in Trajectories (#927)
  • Hook RewardsPerEpisode broken (#945)
  • Can implement this ARZ algorithm? (#965)
  • AssertionError: action in env.action_space (#967)
  • Fixing SAC Policy (#970)
  • Prioritized DQN experiment nonfunctional (#971)
  • Prioritised DQN failing on GPU (#973)
  • An error (#983)
  • params() is no longer supported in Flux (#996)
  • GPU Compile error on PPO with MaskedPPOTrajectory (#1007)
  • RL Core tests fail sporadically (#1010)
  • RL Env tests fail with latest OpenSpiel patches (#1011)
  • Tutorial OpenSpiel KuhnOpenNSFP fails (#1024)
  • CI: Should spell check be dropped or fixed? (#1026)