Space Invaders crowd data

Download

https://s3-us-west-1.amazonaws.com/aiworld.crowd-ale/space_invaders_with_crowd_reward.zip

Description

This data was taken from 734 crowdsourced games of Space Invaders in which crowdsourcers provided real-time feedback on the performance of a DQN-based AI. Votes were binary, signaling either a good move or a bad move, and were filtered by taking the median where at least two crowdsourcers reached consensus within one second. This yielded negative rewards where the spaceship dies. That is useful not only because deaths are not present in the original Space Invaders reward signal, but also because this technique (voting on binary rewards) can be scaled to arbitrarily sophisticated AIs, is algorithm-agnostic, and is tractable to crowdsource because such signals are simple for humans to provide. The application and interface used to obtain the reward votes can be seen here.
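
The vote filtering described above could be sketched roughly as follows. This is only one possible reading of the description, not the pipeline that produced the data: the tuple layout (time_sec, worker_id, vote), the fixed one-second buckets, and the function name filter_votes are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def filter_votes(votes, window=1.0, min_voters=2):
    """Hypothetical filter: votes is an iterable of (time_sec, worker_id, vote),
    where vote is +1 (good move) or -1 (bad move)."""
    # Group votes into fixed windows (assumed: one-second buckets).
    buckets = defaultdict(list)
    for time_sec, worker_id, vote in votes:
        buckets[int(time_sec // window)].append((worker_id, vote))

    rewards = {}
    for bucket, entries in sorted(buckets.items()):
        voters = {worker_id for worker_id, _ in entries}
        if len(voters) < min_voters:
            continue  # fewer than two distinct crowdsourcers: no consensus
        m = median(vote for _, vote in entries)
        if m != 0:  # a median of 0 means the votes were split evenly
            rewards[bucket] = 1 if m > 0 else -1
    return rewards


example_votes = [
    (0.2, "a", -1), (0.7, "b", -1),  # two workers agree: kept as -1
    (1.1, "a", 1),                   # single voter: dropped
    (2.3, "a", 1), (2.6, "b", -1),   # split vote: dropped
]
filtered = filter_votes(example_votes)
print(filtered)
```

Only the first bucket survives here, because it is the only one where two distinct voters agree.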

Structure

Each file contains one game of JSON data, compressed with snappy, in the following schema:
[
    {
        "action": "FIRE" | "MOVE_RIGHT_AND_FIRE" | "MOVE_LEFT_AND_FIRE",
        "action_number": 1 | 11 | 12,
        "game_over": true | false,
        "reward": -1 | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 100 | 200,
        "screen_hex": < full screen hex from ALE (non run-length encoded - NTSC colors) >
    },

    ...
]
Rewards are -1 for deaths and the regular Space Invaders reward otherwise.
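
As a quick illustration of the schema, the sketch below summarizes one decoded episode by separating the -1 death rewards from the regular game score. The helper name summarize_episode and the sample frames are made up for this example, and whether a single death spans one frame or several is not stated here, so the count is reported as frames rather than deaths.

```python
def summarize_episode(frames):
    """Summarize one decoded episode (a list of frame dicts in the schema above)."""
    rewards = [frame["reward"] for frame in frames]
    return {
        "frames": len(frames),
        # Frames carrying the crowd-derived death reward.
        "death_frames": sum(1 for r in rewards if r == -1),
        # Regular Space Invaders score, ignoring the -1 death rewards.
        "game_score": sum(r for r in rewards if r > 0),
        "finished": bool(frames) and frames[-1]["game_over"],
    }


# Synthetic frames for illustration only (screen_hex left empty).
sample_frames = [
    {"action": "FIRE", "action_number": 1, "game_over": False,
     "reward": 0, "screen_hex": ""},
    {"action": "MOVE_LEFT_AND_FIRE", "action_number": 12, "game_over": False,
     "reward": 10, "screen_hex": ""},
    {"action": "MOVE_RIGHT_AND_FIRE", "action_number": 11, "game_over": True,
     "reward": -1, "screen_hex": ""},
]
summary = summarize_episode(sample_frames)
print(summary)
```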

Parse with Python example

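To start, install the Python snappy bindings with pip (the PyPI package is named python-snappy and provides the snappy module):
pip install python-snappy
Then decompress and parse an episode:
import json
import snappy

# Open in binary mode: the file holds compressed bytes, not text.
with open('/your-download-path/episode_000.json.snappy', 'rb') as file_ref:
    json_str = snappy.decompress(file_ref.read())
# frames is a list of dicts, one per game frame.
frames = json.loads(json_str)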
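
Once frames is loaded, the screen_hex field can be turned into rows of color indices. The sketch below assumes two hex digits per pixel and the 160-pixel-wide Atari 2600 screen that ALE emits; neither detail is spelled out in this README, so check it against a real frame before relying on it. The name decode_screen is hypothetical.

```python
SCREEN_WIDTH = 160  # assumed ALE screen width in pixels


def decode_screen(screen_hex, width=SCREEN_WIDTH):
    """Convert a hex string into a list of rows of NTSC palette indices (0-255).

    Assumes each pixel is encoded as exactly two hex digits, row by row.
    """
    pixels = bytes.fromhex(screen_hex)
    if len(pixels) % width != 0:
        raise ValueError("pixel count is not a multiple of the screen width")
    return [list(pixels[i:i + width]) for i in range(0, len(pixels), width)]


# Tiny synthetic 4x2 "screen" for illustration.
tiny = decode_screen("00ff0a10" "01020304", width=4)
print(tiny)
```

With a real episode, decode_screen(frames[0]["screen_hex"]) would be expected to return 210 rows of 160 values if the assumptions above hold.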