XLand-MiniGrid Releases

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️

v0.8.0

2 weeks ago

Time limits rework

We have long felt that the original design of the time limit was too restrictive: it could not be changed after the environment was created, since it was defined by a method that depended only on the environment parameters. For example:

def time_limit(self, params: EnvParamsT) -> int:
    return 3 * params.height * params.width

What if someone wanted to choose a custom time limit? The only way was to subclass the environment and override the time limit method (or use a wrapper, which is essentially equivalent), as in the sketch below.
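
For illustration, a minimal sketch of that old workaround, assuming a concrete environment class such as XLandMiniGrid (the class name is an assumption made for this example):

# hypothetical subclass that hard-codes a custom time limit
class CustomTimeLimitEnv(XLandMiniGrid):
    def time_limit(self, params: EnvParamsT) -> int:
        return 100  # instead of the derived 3 * params.height * params.width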

In this release we made this easier: the time limit is now determined by the env parameter max_steps (similar to MiniGrid):


class EnvParams(struct.PyTreeNode):
    height: int = struct.field(pytree_node=False, default=9)
    width: int = struct.field(pytree_node=False, default=9)
    view_size: int = struct.field(pytree_node=False, default=7)
    max_steps: Optional[int] = struct.field(pytree_node=False, default=None)    # NEW!
    render_mode: str = struct.field(pytree_node=False, default="rgb_array")

Default time limit handling (all other environments were changed in a similar manner):

    def default_params(self, **kwargs) -> XLandEnvParams:
        params = XLandEnvParams(view_size=5)
        params = params.replace(**kwargs)

        if params.max_steps is None:
            params = params.replace(max_steps=3 * (params.height * params.width))
        return params

Now max_steps can be changed after initialization, although it is not a pytree node and cannot be vmapped over.
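
For example, a minimal sketch of overriding the default limit (the environment ID and the value 500 are chosen just for illustration):

import xminigrid

env, env_params = xminigrid.make("XLand-MiniGrid-R9-25x25")
# override the derived 3 * height * width default with a custom limit
env_params = env_params.replace(max_steps=500)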

Full Changelog: https://github.com/corl-team/xland-minigrid/compare/v0.7.0...v0.8.0

v0.7.0

2 months ago

What's Changed

We have added support for rendering observations as RGB images. This expands the space of possible experiments and architectures. For example, we can now properly test generalization to new objects, which was impossible with the discrete encoding, as it is difficult to add embeddings for new objects after pre-training. The downside is that rendering significantly reduces throughput.

This is a major update and therefore experimental for now. Example usage:

import jax
import xminigrid
from xminigrid.wrappers import GymAutoResetWrapper
from xminigrid.experimental.img_obs import RGBImgObservationWrapper

key = jax.random.PRNGKey(0)
reset_key, ruleset_key = jax.random.split(key)

benchmark = xminigrid.load_benchmark(name="trivial-1m")
ruleset = benchmark.sample_ruleset(ruleset_key)

env, env_params = xminigrid.make("XLand-MiniGrid-R9-25x25")
env_params = env_params.replace(ruleset=ruleset)

# auto-reset wrapper
env = GymAutoResetWrapper(env)
# for faster rendering, pre-rendered tiles will be saved at XLAND_MINIGRID_CACHE path
# use XLAND_MINIGRID_RELOAD_CACHE=True to force cache reload
env = RGBImgObservationWrapper(env)

timestep = jax.jit(env.reset)(env_params, reset_key)
timestep = jax.jit(env.step)(env_params, timestep, action=0)
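
After wrapping, observations are image arrays rather than discrete grid encodings. A quick sanity check (a sketch; the observation field access is assumed here):

# the observation should now be an RGB image, e.g. with shape (H, W, 3)
print(timestep.observation.shape)
print(timestep.observation.dtype)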

To make rendering possible under jit, we had to make a few changes to the IDs of objects and colors. This broke compatibility with the old benchmarks, so we re-generated them completely. We also noticed some time ago that the medium-1m benchmark was not harder than small-1m, so we took the chance to make it a bit more complex as well. The updated configs can still be found in scripts/generate_benchmarks.sh. Thus, be careful, as results from the previous release can change significantly!

Full Changelog: https://github.com/corl-team/xland-minigrid/compare/v0.6.0...v0.7.0

v0.6.0

3 months ago

What's Changed

This is our first stable release, accompanied by the full public paper preprint on arXiv (there is a lot of new content!). Compared to the workshop version, the library was almost completely rewritten, previously missing benchmarks, examples, and baselines were added, and the interface of the environments was redesigned. In the latest update we added full type hint coverage to improve the development and user experience. Since v0.3.0 we have also added benchmark variations with three million unique tasks. Check them out!
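
For reference, a minimal sketch of discovering and loading these benchmarks (registered_benchmarks and num_rulesets are assumed helper names and may differ from the actual API):

import xminigrid

# list all registered benchmarks and load one of them
print(xminigrid.registered_benchmarks())
benchmark = xminigrid.load_benchmark(name="trivial-1m")
print(benchmark.num_rulesets())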

In the near future we are not going to change the interfaces or benchmarks in any significant way, and we will treat all new features carefully and conservatively in order to maintain reproducibility. We plan to release v1.0.0 around the end of March 2024.

Full Changelog: https://github.com/corl-team/xland-minigrid/compare/v0.3.0...v0.6.0

v0.3.0

5 months ago

We have released the first set of benchmarks with 1M unique tasks in each. Configs used for generation are available in scripts/generate_benchmarks.sh.

Furthermore, the following was added:

  1. two new colors (pink, brown)
  2. two new objects (hexagon, star)
  3. eight new goals and rules (variations of Near).