Merge branch 'main' into 98-spec-props
sash-a authored Mar 20, 2024
2 parents ed66bf3 + 0247608 commit c3d6f60
Showing 30 changed files with 1,275 additions and 44 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,11 +16,11 @@ jobs:
python-version: "3.x"
- name: Install dependencies
run: |
pip install --upgrade pip setuptools twine
pip install --upgrade pip hatch twine
- name: Build and publish
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
run: |
python setup.py sdist
hatch build
twine upload dist/*
39 changes: 19 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,39 +26,44 @@
<img src="docs/env_anim/cleaner.gif" alt="Cleaner" width="16%">
<img src="docs/env_anim/connector.gif" alt="Connector" width="16%">
<img src="docs/env_anim/cvrp.gif" alt="CVRP" width="16%">
<img src="docs/env_anim/flat_pack.gif" alt="FlatPack" width="16%">
<img src="docs/env_anim/game_2048.gif" alt="Game2048" width="16%">
<img src="docs/env_anim/graph_coloring.gif" alt="GraphColoring" width="16%">
</div>
<div class="row" align="center">
<img src="docs/env_anim/graph_coloring.gif" alt="GraphColoring" width="16%">
<img src="docs/env_anim/job_shop.gif" alt="JobShop" width="16%">
<img src="docs/env_anim/knapsack.gif" alt="Knapsack" width="16%">
<img src="docs/env_anim/maze.gif" alt="Maze" width="16%">
<img src="docs/env_anim/minesweeper.gif" alt="Minesweeper" width="16%">
<img src="docs/env_anim/mmst.gif" alt="MMST" width="16%">
<img src="docs/env_anim/multi_cvrp.gif" alt="MultiCVRP" width="16%">
</div>
<div class="row" align="center">
<img src="docs/env_anim/multi_cvrp.gif" alt="MultiCVRP" width="16%">
<img src="docs/env_anim/pac_man.gif" alt="PacMan" width="16%">
<img src="docs/env_anim/robot_warehouse.gif" alt="RobotWarehouse" width="16%">
<img src="docs/env_anim/rubiks_cube.gif" alt="RubiksCube" width="16%">
<img src="docs/env_anim/sliding_tile_puzzle.gif" alt="SlidingTilePuzzle" width="16%">
<img src="docs/env_anim/snake.gif" alt="Snake" width="16%">
<img src="docs/env_anim/sudoku.gif" alt="Sudoku" width="16%">
<img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/tsp.gif" alt="TSP" width="16%">
</div>
<div class="row" align="center">
<img src="docs/env_anim/pac_man.gif" alt="PacMan" width="16%">
<img src="docs/env_anim/sokoban.gif" alt="Sokoban" width="16%">
<img src="docs/env_anim/sudoku.gif" alt="Sudoku" width="16%">
<img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/tsp.gif" alt="TSP" width="16%">
</div>
</div>

## Jumanji @ ICLR 2024

Jumanji has been accepted at [ICLR 2024](https://iclr.cc/). Check out our [research paper](https://arxiv.org/abs/2306.09884).

## Welcome to the Jungle! 🌴

Jumanji is a diverse suite of scalable reinforcement learning environments written in JAX.
Jumanji is a diverse suite of scalable reinforcement learning environments written in JAX. It now features 22 environments!

Jumanji is helping pioneer a new wave of hardware-accelerated research and development in the
field of RL. Jumanji's high-speed environments enable faster iteration and large-scale
experimentation while simultaneously reducing complexity. Originating in the Research Team at
experimentation while simultaneously reducing complexity. Originating in the research team at
[InstaDeep](https://www.instadeep.com/), Jumanji is now developed jointly with the open-source
community. To join us in these efforts, reach out, raise issues and read our
[contribution guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) or just
Expand Down Expand Up @@ -98,6 +103,7 @@ problems.
| 🎨 GraphColoring | Logic | `GraphColoring-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/graph_coloring/) | [doc](https://instadeepai.github.io/jumanji/environments/graph_coloring/) |
| 💣 Minesweeper | Logic | `Minesweeper-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/minesweeper/) | [doc](https://instadeepai.github.io/jumanji/environments/minesweeper/) |
| 🎲 RubiksCube | Logic | `RubiksCube-v0`<br/>`RubiksCube-partly-scrambled-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/rubiks_cube/) | [doc](https://instadeepai.github.io/jumanji/environments/rubiks_cube/) |
| 🔀 SlidingTilePuzzle | Logic | `SlidingTilePuzzle-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sliding_tile_puzzle/) | [doc](https://instadeepai.github.io/jumanji/environments/sliding_tile_puzzle/) |
| ✏️ Sudoku | Logic | `Sudoku-v0` <br/>`Sudoku-very-easy-v0`| [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sudoku/) | [doc](https://instadeepai.github.io/jumanji/environments/sudoku/) |
| 📦 BinPack (3D BinPacking Problem) | Packing | `BinPack-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/bin_pack/) | [doc](https://instadeepai.github.io/jumanji/environments/bin_pack/) |
| 🧩 FlatPack (2D Grid Filling Problem) | Packing | `FlatPack-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/flat_pack/) | [doc](https://instadeepai.github.io/jumanji/environments/flat_pack/) |
Expand All @@ -113,15 +119,15 @@ problems.
| 🐍 Snake | Routing | `Snake-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/snake/) | [doc](https://instadeepai.github.io/jumanji/environments/snake/) |
| 📬 TSP (Travelling Salesman Problem) | Routing | `TSP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/tsp/) | [doc](https://instadeepai.github.io/jumanji/environments/tsp/) |
| Multi Minimum Spanning Tree Problem | Routing | `MMST-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst) | [doc](https://instadeepai.github.io/jumanji/environments/mmst/) |
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pacman/) | [doc](https://instadeepai.github.io/jumanji/environments/pacman/)
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pac_man/) | [doc](https://instadeepai.github.io/jumanji/environments/pac_man/)
| 👾 Sokoban | Routing | `Sokoban-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/sokoban/) | [doc](https://instadeepai.github.io/jumanji/environments/sokoban/) |

<h2 name="install" id="install">Installation 🎬</h2>

You can install the latest release of Jumanji from PyPI:

```bash
pip install jumanji
pip install -U jumanji
```

Alternatively, you can install the latest development version directly from GitHub:
Expand Down Expand Up @@ -229,17 +235,10 @@ details on how to submit pull requests, our Contributor License Agreement, and c
If you use Jumanji in your work, please cite the library using:

```
@misc{bonnet2023jumanji,
@misc{bonnet2024jumanji,
title={Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX},
author={
Clément Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Vincent Coyette and
Paul Duckworth and Laurence I. Midgley and Tristan Kalloniatis and Sasha Abramowitz and
Cemlyn N. Waters and Andries P. Smit and Nathan Grinsztajn and Ulrich A. Mbou Sob and
Omayma Mahjoub and Elshadai Tegegn and Mohamed A. Mimouni and Raphael Boige and
Ruan de Kock and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and
Alexandre Laterre
},
year={2023},
author={Clément Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Sasha Abramowitz and Paul Duckworth and Vincent Coyette and Laurence I. Midgley and Elshadai Tegegn and Tristan Kalloniatis and Omayma Mahjoub and Matthew Macfarlane and Andries P. Smit and Nathan Grinsztajn and Raphael Boige and Cemlyn N. Waters and Mohamed A. Mimouni and Ulrich A. Mbou Sob and Ruan de Kock and Siddarth Singh and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and Alexandre Laterre},
year={2024},
eprint={2306.09884},
url={https://arxiv.org/abs/2306.09884},
archivePrefix={arXiv},
Expand Down
8 changes: 8 additions & 0 deletions docs/api/environments/sliding_tile_puzzle.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
::: jumanji.environments.logic.sliding_tile_puzzle.env.SlidingTilePuzzle
selection:
members:
- __init__
- reset
- step
- observation_spec
- action_spec
Binary file added docs/env_anim/sliding_tile_puzzle.gif
Binary file added docs/env_img/sliding_tile_puzzle.png
52 changes: 52 additions & 0 deletions docs/environments/sliding_tile_puzzle.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
# Sliding Tile Puzzle Environment

<p align="center">
<img src="../env_anim/sliding_tile_puzzle.gif" width="500"/>
</p>

This is a JAX JIT-able implementation of the classic [Sliding Tile Puzzle game](https://en.wikipedia.org/wiki/Sliding_puzzle).

The Sliding Tile Puzzle game is a classic puzzle that challenges a player to slide (typically flat) pieces along certain routes (usually on a board) to establish a certain end-configuration. The pieces to be moved may consist of simple shapes, or they may be imprinted with colors, patterns, sections of a larger picture (like a jigsaw puzzle), numbers, or letters.

The puzzle is often 3×3, 4×4 or 5×5 in size and made up of square tiles that slide within a square base with exactly one tile space left empty. A move slides a tile adjacent to the empty space into it, which leaves a new empty space where that tile was. The sliding puzzle is mechanical and requires the use of no other equipment or tools.

## Observation

The observation in the Sliding Tile Puzzle game includes the puzzle grid, the position of the empty tile, the action mask, and the step count.

- `puzzle`: jax array (int32) of shape `(grid_size, grid_size)`, representing the current game state. Each element in the array corresponds to a puzzle tile. The tile represented by 0 is the empty tile.

- Here is an example of a random observation of the game board:

```
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 0 12]
[ 13 14 15 11]]
```
- In this array, the tile represented by 0 is the empty tile that can be moved.

- `empty_tile_position`: jax array (int32) of shape `(2,)` holding the `(row, column)` position of the empty tile in the grid. For example, `(2, 2)` is the third row and the third column, since indices start at zero.

- `action_mask`: jax array (bool) of shape `(4,)`, indicating which actions are valid in the current state of the environment. The actions include moving the empty tile up, right, down, or left. For example, an action mask `[True, False, True, False]` means that the valid actions are to move the empty tile upward or downward.

- `step_count`: jax array (int32) of shape `()`, current number of steps in the episode.
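
The relationship between `empty_tile_position` and `action_mask` can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the environment's JAX code: the helper name `action_mask` is hypothetical, and the move offsets mirror the `UP`, `RIGHT`, `DOWN`, `LEFT` constants defined for this environment.

```python
# Plain-Python sketch; the real environment operates on JAX arrays.
# Offsets as (row, column) deltas, ordered up, right, down, left.
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def action_mask(empty_tile_position, grid_size):
    """Return one bool per action: True if the empty tile stays on the board."""
    row, col = empty_tile_position
    mask = []
    for d_row, d_col in MOVES:
        new_row, new_col = row + d_row, col + d_col
        mask.append(0 <= new_row < grid_size and 0 <= new_col < grid_size)
    return mask


# Empty tile in the interior of a 4x4 grid: every move is valid.
print(action_mask((2, 2), grid_size=4))  # [True, True, True, True]
# Empty tile in the top-left corner: only right and down are valid.
print(action_mask((0, 0), grid_size=4))  # [False, True, True, False]
```

A mask such as `[True, False, True, False]` would arise in this sketch only on a grid with a single column, which is one reason to treat the example above as illustrative rather than tied to a specific board.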

## Action

The action space is a `DiscreteArray` of integer values in `[0, 1, 2, 3]`. Specifically, these four actions correspond to moving the empty tile: up (0), right (1), down (2), or left (3).
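
To make the action encoding concrete, here is a minimal plain-Python sketch of how one action could update the board. The function `apply_action` is hypothetical and is not the environment's `step` method. The text does not say what happens on an invalid action, so returning the state unchanged is an assumption of this sketch.

```python
# Plain-Python sketch; the real environment operates on JAX arrays.
# Offsets as (row, column) deltas, indexed by action: up, right, down, left.
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]
EMPTY_TILE = 0


def apply_action(puzzle, empty_tile_position, action):
    """Move the empty tile by swapping it with the neighbour in the given direction."""
    row, col = empty_tile_position
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    size = len(puzzle)
    if not (0 <= new_row < size and 0 <= new_col < size):
        # Assumption: an off-board move leaves the state unchanged.
        return puzzle, empty_tile_position
    new_puzzle = [list(r) for r in puzzle]  # copy so the input is not mutated
    new_puzzle[row][col] = new_puzzle[new_row][new_col]
    new_puzzle[new_row][new_col] = EMPTY_TILE
    return new_puzzle, (new_row, new_col)


puzzle = [
    [0, 1, 3],
    [4, 2, 5],
    [7, 8, 6],
]
puzzle, empty = apply_action(puzzle, (0, 0), action=1)  # move empty tile right
print(puzzle, empty)  # [[1, 0, 3], [4, 2, 5], [7, 8, 6]] (0, 1)
puzzle, empty = apply_action(puzzle, empty, action=2)  # move empty tile down
print(puzzle, empty)  # [[1, 2, 3], [4, 0, 5], [7, 8, 6]] (1, 1)
```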

## Reward

The reward function can be either of the following:

- **DenseRewardFn**: provides a dense reward based on the change in the number of correctly placed tiles between the current state and the next state. The reward is positive for each newly correctly placed tile and negative for each tile that moves out of its correct position.

- **SparseRewardFn**: provides a sparse reward that is 1 if the puzzle is solved and 0 otherwise.

The goal in all cases is to solve the puzzle in a way that maximizes the reward.
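
The two reward schemes can be compared with a small plain-Python sketch. Everything here is illustrative: the function names are hypothetical, the goal layout (tiles `1..n*n-1` in row-major order with the empty tile last) is an assumption, and counting the empty tile's cell as a "placed tile" is also an assumption that the actual `DenseRewardFn` may not share.

```python
# Plain-Python sketch; not the environment's DenseRewardFn / SparseRewardFn.
def solved_board(grid_size):
    """Assumed goal layout: 1..n*n-1 in row-major order, empty tile (0) last."""
    flat = list(range(1, grid_size * grid_size)) + [0]
    return [flat[i * grid_size:(i + 1) * grid_size] for i in range(grid_size)]


def count_correct(puzzle, goal):
    """Count cells whose value matches the goal (the empty cell is included)."""
    return sum(
        tile == target
        for row, goal_row in zip(puzzle, goal)
        for tile, target in zip(row, goal_row)
    )


def dense_reward(puzzle, next_puzzle, goal):
    """Change in the number of correctly placed cells caused by the move."""
    return count_correct(next_puzzle, goal) - count_correct(puzzle, goal)


def sparse_reward(next_puzzle, goal):
    """1.0 only when the board after the move equals the goal, else 0.0."""
    return 1.0 if next_puzzle == goal else 0.0


goal = solved_board(3)
before = [[1, 2, 3], [4, 5, 6], [7, 0, 8]]
after = [[1, 2, 3], [4, 5, 6], [7, 8, 0]]  # empty tile moved right

print(dense_reward(before, after, goal))  # 2
print(sparse_reward(before, goal))  # 0.0
print(sparse_reward(after, goal))  # 1.0
```

Under these assumptions the dense signal gives feedback on every move, while the sparse signal is zero until the final solving move.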

## Registered Versions 📖

- `SlidingTilePuzzle-v0`, the Sliding Tile Puzzle with a grid size of 5x5.
4 changes: 4 additions & 0 deletions jumanji/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,3 +134,7 @@
register(id="Sokoban-v0", entry_point="jumanji.environments:Sokoban")
# Pacman - minimal version of Atari Pacman game
register(id="PacMan-v0", entry_point="jumanji.environments:PacMan")
# SlidingTilePuzzle - A sliding tile puzzle environment with the default grid size of 5x5.
register(
id="SlidingTilePuzzle-v0", entry_point="jumanji.environments:SlidingTilePuzzle"
)
18 changes: 13 additions & 5 deletions jumanji/environments/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,12 +14,20 @@

import sys

from jumanji.environments.logic import game_2048, minesweeper, rubiks_cube
from jumanji.environments.logic import (
game_2048,
graph_coloring,
minesweeper,
rubiks_cube,
sliding_tile_puzzle,
sudoku,
)
from jumanji.environments.logic.game_2048.env import Game2048
from jumanji.environments.logic.graph_coloring.env import GraphColoring
from jumanji.environments.logic.minesweeper import Minesweeper
from jumanji.environments.logic.rubiks_cube import RubiksCube
from jumanji.environments.logic.sudoku import Sudoku
from jumanji.environments.logic.minesweeper.env import Minesweeper
from jumanji.environments.logic.rubiks_cube.env import RubiksCube
from jumanji.environments.logic.sliding_tile_puzzle.env import SlidingTilePuzzle
from jumanji.environments.logic.sudoku.env import Sudoku
from jumanji.environments.packing import bin_pack, flat_pack, job_shop, knapsack, tetris
from jumanji.environments.packing.bin_pack.env import BinPack
from jumanji.environments.packing.flat_pack.env import FlatPack
Expand All @@ -44,7 +52,7 @@
from jumanji.environments.routing.cvrp.env import CVRP
from jumanji.environments.routing.maze.env import Maze
from jumanji.environments.routing.mmst.env import MMST
from jumanji.environments.routing.multi_cvrp import MultiCVRP
from jumanji.environments.routing.multi_cvrp.env import MultiCVRP
from jumanji.environments.routing.pac_man.env import PacMan
from jumanji.environments.routing.robot_warehouse.env import RobotWarehouse
from jumanji.environments.routing.snake.env import Snake
Expand Down
16 changes: 16 additions & 0 deletions jumanji/environments/logic/sliding_tile_puzzle/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from jumanji.environments.logic.sliding_tile_puzzle.env import SlidingTilePuzzle
from jumanji.environments.logic.sliding_tile_puzzle.types import Observation, State
42 changes: 42 additions & 0 deletions jumanji/environments/logic/sliding_tile_puzzle/conftest.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import jax
import jax.numpy as jnp
import pytest

from jumanji.environments.logic.sliding_tile_puzzle import SlidingTilePuzzle
from jumanji.environments.logic.sliding_tile_puzzle.generator import RandomWalkGenerator
from jumanji.environments.logic.sliding_tile_puzzle.types import State


@pytest.fixture
def sliding_tile_puzzle() -> SlidingTilePuzzle:
"""Instantiates a SlidingTilePuzzle environment with a 3x3 grid."""
generator = RandomWalkGenerator(grid_size=3)
return SlidingTilePuzzle(generator=generator)


@pytest.fixture
def state() -> State:
key = jax.random.PRNGKey(0)
empty_pos = jnp.array([0, 0])
puzzle = jnp.array(
[
[0, 1, 3],
[4, 2, 5],
[7, 8, 6],
]
)
return State(puzzle=puzzle, empty_tile_position=empty_pos, key=key, step_count=0)
24 changes: 24 additions & 0 deletions jumanji/environments/logic/sliding_tile_puzzle/constants.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import jax.numpy as jnp

EMPTY_TILE = 0
INITIAL_STEP_COUNT = 0

UP = [-1, 0]
RIGHT = [0, 1]
DOWN = [1, 0]
LEFT = [0, -1]

MOVES = jnp.array([UP, RIGHT, DOWN, LEFT])