An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions

Last update: Jun 09, 2022

Overview

Agar.io_Q-Learning_AI

An image of the circle categorisation function in action. Food blobs are outlined in blue, edible cells in green and dangerous cells in red, according to where our program detects them. Detection is a bit less reliable near the screen edges. The agent's action at this moment is shown by the green arrow.

States are calculated using the shortest Euclidean distance to each of the three circle types: food, edible cells and dangerous cells. Each distance is measured and then discretized according to the interval it falls within. The rulers in this image are to scale.
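As a rough illustration of the state calculation described above, here is a minimal sketch in Python. The interval edges in `BIN_EDGES`, the function names, and the choice to measure distances to circle centres are all assumptions made for this example, not values or names taken from the project code.

```python
import bisect
import math

# Hypothetical interval edges in pixels; the project's real intervals may differ.
BIN_EDGES = [50, 100, 200, 400]


def nearest_distance(agent_xy, circles):
    """Shortest Euclidean distance from the agent to any circle centre.

    Returns infinity when no circle of that type is on screen.
    """
    ax, ay = agent_xy
    return min((math.hypot(x - ax, y - ay) for x, y in circles), default=math.inf)


def discretize(distance, edges=BIN_EDGES):
    """Map a distance to the index of the interval it falls within."""
    return bisect.bisect_right(edges, distance)


def encode_state(agent_xy, food, edible, dangerous):
    """Build the state as one distance bin per circle type."""
    return tuple(
        discretize(nearest_distance(agent_xy, group))
        for group in (food, edible, dangerous)
    )


# Example: food at distance 50, an edible cell at distance 300, no dangerous cells.
state = encode_state((0, 0), food=[(30, 40)], edible=[(300, 0)], dangerous=[])
```

With four edges there are five bins per circle type, so this sketch yields at most 5 x 5 x 5 = 125 distinct states, which keeps a Q-table small enough to fill in through play.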

Currently the agent can't press any keyboard buttons; it can only move around using the mouse. Keyboard actions could be added without too much hassle, but that would require a rework of some aspects of the code and a ton of training, which already takes ages. The Q-learning part could also do with a proper implementation of stochastic Q-learning instead of our generic iterative Q-learning, if I knew how to do it. I look forward to learning that at a later point.
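For readers unfamiliar with the technique, the following is a minimal sketch of a standard tabular Q-learning update with epsilon-greedy action selection. It is not the project's actual implementation: the action names, the hyperparameter values (`ALPHA`, `GAMMA`, `EPSILON`) and the function names are hypothetical, and states are simply assumed to be hashable values such as tuples.

```python
import random
from collections import defaultdict

# Hypothetical mouse-movement actions; the project's real action set may differ.
ACTIONS = ["up", "down", "left", "right"]

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)

# Q-table: every unseen state starts with a value of 0.0 for each action.
q_table = defaultdict(lambda: {action: 0.0 for action in ACTIONS})


def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    values = q_table[state]
    return max(values, key=values.get)


def update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    best_next = max(q_table[next_state].values())
    td_target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (td_target - q_table[state][action])


# Example: the agent moved "up" from one state to another and gained some mass.
update(state=(1, 3, 4), action="up", reward=1.0, next_state=(0, 3, 4))
```

Adding keyboard actions would, in a setup like this, mostly mean extending `ACTIONS`, which also enlarges the table that has to be learned and so the training time.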

Feel free to ask any questions about the code or the project. I hope you enjoy!

The humans in the experiment were subject to the same move set as the bots and agents, that is, mouse movement only.
