The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
NOTE: It is strongly recommended that you use packages from the same release together for the best experience.
Package | Version |
---|---|
com.unity.ml-agents (C#) | v2.3.0-exp.3 |
com.unity.ml-agents.extensions (C#) | v0.6.1-preview |
ml-agents (Python) | v0.30.0 |
ml-agents-envs (Python) | v0.30.0 |
gym-unity (Python) | v0.30.0 |
Communicator (C#/Python) | v1.5.0 |
- Moved the `UnityToGymWrapper` and PettingZoo API to the `ml-agents-envs` package. All these environments will be versioned under the `ml-agents-envs` package in the future. (#)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v2.2.1-exp.1 |
com.unity.ml-agents.extensions (C#) | v0.6.1-preview |
ml-agents (Python) | v0.28.0 |
ml-agents-envs (Python) | v0.28.0 |
gym-unity (Python) | v0.28.0 |
Communicator (C#/Python) | v1.5.0 |
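The notes below cover separate hyperparameter schedules and the new `deterministic` setting. A minimal, illustrative trainer-config fragment is sketched here; the behavior name and hyperparameter values are placeholders, not defaults:

```yaml
behaviors:
  MyBehavior:                        # hypothetical behavior name
    trainer_type: ppo
    hyperparameters:
      learning_rate: 3.0e-4
      learning_rate_schedule: linear # each schedule can now be set separately
      beta: 5.0e-3
      beta_schedule: constant
      epsilon: 0.2
      epsilon_schedule: linear
    network_settings:
      deterministic: true            # always select the most probable action
```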
- `beta`, `epsilon`, and `learning rate` can now be set on separate schedules (affects only PPO and POCA). (#5538)
- Added a `--deterministic` CLI flag to deterministically select the most probable actions in the policy. The same behavior can be achieved by adding `deterministic: true` under `network_settings` in the run options configuration. (#5597)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v2.1.0-exp.1 |
com.unity.ml-agents.extensions (C#) | v0.5.0-preview |
ml-agents (Python) | v0.27.0 |
ml-agents-envs (Python) | v0.27.0 |
gym-unity (Python) | v0.27.0 |
Communicator (C#/Python) | v1.5.0 |
- Fixed a `NullReferenceException` when adding Behavior Parameters with no `Agent`. (#5382)
- Fixed an issue with `VectorSensorComponent`. (#5376)
- `RigidBodySensorComponent` now displays a warning if it's used in a way that won't generate useful observations. (#5387)
- Fixed an issue where `GridSensor` does not work in 2D environments. (#5396)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v2.0.0 |
com.unity.ml-agents.extensions (C#) | v0.4.0-preview |
ml-agents (Python) | v0.26.0 |
ml-agents-envs (Python) | v0.26.0 |
gym-unity (Python) | v0.26.0 |
Communicator (C#/Python) | v1.5.0 |
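One change below moves most trainer console messages from `info` to `debug`. Per the notes, verbose output can be restored either with the `--debug` CLI flag or from the run configuration; an illustrative fragment (the behavior name is hypothetical):

```yaml
debug: true          # top-level key: print debug-level trainer messages again
behaviors:
  MyBehavior:        # hypothetical behavior name
    trainer_type: ppo
```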
- Methods and fields previously marked `Obsolete` have been removed. If you were using these methods, you need to replace them with their supported counterpart. (#5024)
- The interface for disabling discrete actions in `IDiscreteActionMask` has changed. `WriteMask(int branch, IEnumerable<int> actionIndices)` was replaced with `SetActionEnabled(int branch, int actionIndex, bool isEnabled)`. (#5060)
- `ISensor.GetObservationShape()` has been removed, and `GetObservationSpec()` has been added. The `ITypedSensor` and `IDimensionPropertiesSensor` interfaces have been removed. (#5127)
- `ISensor.GetCompressionType()` has been removed, and `GetCompressionSpec()` has been added. The `ISparseChannelSensor` interface has been removed. (#5164)
- `SensorComponent.GetObservationShape()` was no longer being called, so it has been removed. (#5172)
- `SensorComponent.CreateSensor()` has been replaced with `SensorComponent.CreateSensors()`, which returns an `ISensor[]`. (#5181)
- The default `InferenceDevice` is now `InferenceDevice.Default`, which is equivalent to `InferenceDevice.Burst`. If you depend on the previous behavior, you can explicitly set the Agent's `InferenceDevice` to `InferenceDevice.CPU`. (#5175)
- The `.onnx` models' input names have changed. All input placeholders now use the prefix `obs_`, removing the distinction between visual and vector observations. In addition, the inputs and outputs of LSTMs have changed. Models created with this version are not usable with previous versions of the package. (#5080, #5236)
- The `.onnx` models' discrete action output now contains the discrete action values and not the logits. Models created with this version are not usable with previous versions of the package. (#5080)
- The Match-3 integration was moved from `com.unity.ml-agents.extensions` to `com.unity.ml-agents`. (#5259)
- `Match3Sensor` has been refactored to produce cell and special type observations separately, and `Match3SensorComponent` now produces two `Match3Sensor`s (unless there are no special types). Previously trained models have different observation sizes and need to be retrained. (#5181)
- The `AbstractBoard` class for integration with Match-3 games has been changed to make it easier to support boards with different sizes using the same model. For a summary of the interface changes, please see the Migration Guide. (#5189)
- `GridSensor` has been refactored and moved to the main package, with changes to both sensor interfaces and behaviors. Existing GridSensors created by the extension package do not work in newer versions. Previously trained models need to be retrained. Please see the Migration Guide for more details. (#5256)
- Updated Barracuda to `1.4.0-preview`. (#5236)
- Made `com.unity.modules.unityanalytics` an optional dependency. (#5109)
- Made `com.unity.modules.physics` and `com.unity.modules.physics2d` optional dependencies. (#5112)
- Added `Goal Signal` as a type of observation. Trainers can now use HyperNetworks to process `Goal Signal`. Trainers with HyperNetworks are more effective at solving multiple tasks. (#5142, #5159, #5149)
- … the `Goal Signal` feature. (#5193)
- `DecisionRequester.ShouldRequestDecision()` and `ShouldRequestAction()` methods have been added. These are used to determine whether `Agent.RequestDecision()` and `Agent.RequestAction()` are called (respectively). (#5223)
- `RayPerceptionSensor` now caches its raycast results; they can be accessed via `RayPerceptionSensor.RayPerceptionOutput`. (#5222)
- `ActionBuffers` are now reset to zero before being passed to `Agent.Heuristic()` and `IHeuristicProvider.Heuristic()`. (#5227)
- `Agent` now calls `IDisposable.Dispose()` on all `ISensor`s that implement the `IDisposable` interface. (#5233)
- `CameraSensor`, `RenderTextureSensor`, and `Match3Sensor` now reuse their `Texture2D`s, reducing the amount of memory that needs to be allocated during runtime. (#5233)
- `ObservationWriter.WriteTexture()` no longer calls `Texture2D.GetPixels32()` for `RGB24` textures. This results in much less memory being allocated during inference with `CameraSensor` and `RenderTextureSensor`. (#5233)
- Some console output has been moved from `info` to `debug` and is no longer printed by default. If you want all messages to be printed, you can run `mlagents-learn` with the `--debug` option or add the line `debug: true` at the top of the yaml config file. (#5211)
- Fixed a bug that occurred when `Num Steps To Record > 0` and `Record` was turned off. (#5274)
- Fixed a bug where `--results-dir` has no effect. (#5269)
- Fixed a bug where `.pt` checkpoints were not deleted during training. (#5271)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v1.9.1 |
com.unity.ml-agents.extensions (C#) | v0.3.1-preview |
ml-agents (Python) | v0.25.1 |
ml-agents-envs (Python) | v0.25.1 |
gym-unity (Python) | v0.25.1 |
Communicator (C#/Python) | v1.5.0 |
- The `--resume` flag now supports resuming experiments with additional reward providers or loading partial models if the network architecture has changed. See here for more details. (#5213)
- Fixed a bug that occurred when `sequence_length` < `time_horizon`. (#5206)
- Fixed a bug that occurred when `save_replay_buffer` was enabled. (#5205)
- Fixed `validate_action` to expect the right dimensions when `set_action_single_agent` is called. (#5208)
- In `GymToUnityWrapper`, raise an appropriate warning if `step()` is called after an environment is done. (#5204)
- Fixed a bug where the `gym` wrappers would override user-set log levels. (#5201)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v1.9.0 |
com.unity.ml-agents.extensions (C#) | v0.3.0-preview |
ml-agents (Python) | v0.25.0 |
ml-agents-envs (Python) | v0.25.0 |
gym-unity (Python) | v0.25.0 |
Communicator (C#/Python) | v1.5.0 |
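The notes below introduce multi-agent group training. Per the notes, enabling the new trainer is a one-line trainer selection in the run configuration once a `SimpleMultiAgentGroup` is instantiated; a sketch (the behavior name and learning rate are placeholders):

```yaml
behaviors:
  MyGroupBehavior:       # hypothetical behavior name
    trainer_type: poca   # use the multi-agent POCA trainer
    hyperparameters:
      learning_rate: 3.0e-4
```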
- `BufferSensor` and `BufferSensorComponent` have been added (documentation). They allow the Agent to observe a variable number of entities. For an example, see the Sorter environment. (#4909)
- The `SimpleMultiAgentGroup` class and `IMultiAgentGroup` interface have been added (documentation). These allow Agents to be given rewards and to end episodes in groups. For examples, see the Cooperative Push Block, Dungeon Escape, and Soccer environments. (#4923)
- Use `poca` as the trainer in the configuration YAML after instantiating a `SimpleMultiAgentGroup` to use this feature. (#5005)
- … `com.unity.ml-agents` samples. (#5077)
- The `encoding_size` setting for RewardSignals has been deprecated. Please use `network_settings` instead. (#4982)
- … `ObservationSpec.name`. (#5036)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v1.8.1 |
com.unity.ml-agents.extensions (C#) | v0.2.0-preview |
ml-agents (Python) | v0.24.1 |
ml-agents-envs (Python) | v0.24.1 |
gym-unity (Python) | v0.24.1 |
Communicator (C#/Python) | v1.4.0 |
- The `cattrs` version dependency was updated to allow `>=1.1.0` on Python 3.8 or higher. (#4821)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v1.8.0 |
com.unity.ml-agents.extensions (C#) | v0.1.0-preview |
ml-agents (Python) | v0.24.0 |
ml-agents-envs (Python) | v0.24.0 |
gym-unity (Python) | v0.24.0 |
Communicator (C#/Python) | v1.4.0 |
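The notes below describe a plugin system for custom `StatsWriter` implementations. As a rough, self-contained sketch of the idea only: the `StatsWriter` base class here is a simplified stand-in (the real base class lives in the `mlagents` trainer code and takes richer stat summaries), and plugin registration is omitted.

```python
from typing import Dict


class StatsWriter:
    """Simplified stand-in for the trainer's StatsWriter base class."""

    def write_stats(self, category: str, values: Dict[str, float], step: int) -> None:
        raise NotImplementedError


class ConsoleStatsWriter(StatsWriter):
    """Hypothetical custom writer: prints each stat instead of logging to TensorBoard."""

    def write_stats(self, category: str, values: Dict[str, float], step: int) -> None:
        # One line per recorded stat, tagged with its category and training step.
        for name, value in values.items():
            print(f"[{category}] step={step} {name}={value:.3f}")
```

A real implementation would subclass the actual base class and be registered through a setuptools entry point so that `mlagents-learn` discovers and calls it during training.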
- A plugin system for `mlagents-learn` has been added. You can now define custom `StatsWriter` implementations and register them to be called during training. More types of plugins will be added in the future. (#4788)
- The `ActionSpec` constructor is now public. Previously, it was not possible to create an `ActionSpec` with both continuous and discrete actions from code. (#4896)
- `StatAggregationMethod.Sum` can now be passed to `StatsRecorder.Add()`. This will result in the values being summed (instead of averaged) when written to TensorBoard. Thanks to @brccabral for the contribution! (#4816)
- The upper limit for the time scale (set by the `--time-scale` parameter in `mlagents-learn`) was removed when training with a player. The Editor still requires it to be clamped to 100. (#4867)
- Added `VectorSensor.AddObservation(IList<float>)`. `VectorSensor.AddObservation(IEnumerable<float>)` is deprecated. The `IList` version is recommended, as it does not generate any additional memory allocations. (#4887)
- Added `ObservationWriter.AddList()` and deprecated `ObservationWriter.AddRange()`. `AddList()` is recommended, as it does not generate any additional memory allocations. (#4887)
- Added `ActuatorComponent.CreateActuators`, and deprecated `ActuatorComponent.CreateActuator`. The default implementation will wrap `ActuatorComponent.CreateActuator` in an array and return that. (#4899)
- `InferenceDevice.Burst` was added, indicating that the Agent's model will be run using Barracuda's Burst backend. This is the default for new Agents, but existing ones that use `InferenceDevice.CPU` should update to `InferenceDevice.Burst`. (#4925)
- Added a `--torch-device` commandline option to `mlagents-learn`, which sets the default `torch.device` used for training. (#4888)
- The `--cpu` commandline option had no effect and was removed. Use `--torch-device=cpu` to force CPU training. (#4888)
- The `mlagents_envs` API has changed: `BehaviorSpec` now has an `observation_specs` property containing a list of `ObservationSpec`. For more information on `ObservationSpec`, see here. (#4763, #4825)
- … `GrpcExtensions.cs`. (#4812)
- Reduced memory allocations in `ActuatorManager.UpdateActionArray()`. (#4877)
- Reduced memory allocations in `SensorShapeValidator.ValidateSensors()`. (#4879)
- Reduced memory allocations in `SideChannelManager.GetSideChannelMessage()`. (#4886)
- Fixed a bug that caused an error when `RunOptions` was deserialized via `pickle`. (#4842)
- Fixed a bug that caused `UnityEnvironment` to wait the full `timeout` period and report a misleading error message if the executable crashed without closing the connection. It now periodically checks the process status while waiting for a connection, and raises a better error message if it crashes. (#4880)
- The `-logfile` option in the `--env-args` option to `mlagents-learn` is no longer overwritten. (#4880)
- Fixed a bug where the `load_weights` function was being called unnecessarily often in the Ghost Trainer, leading to training slowdowns. (#4934)

---
Package | Version |
---|---|
com.unity.ml-agents (C#) | v1.7.2 |
ml-agents (Python) | v0.23.0 |
ml-agents-envs (Python) | v0.23.0 |
gym-unity (Python) | v0.23.0 |
Communicator (C#/Python) | v1.3.0 |