Flappy Bird hack using Deep Reinforcement Learning (Deep Q-learning).

Overview

Using Deep Q-Network to Learn How To Play Flappy Bird

7 mins version: DQN for flappy bird

Overview

This project follows the description of the Deep Q Learning algorithm described in Playing Atari with Deep Reinforcement Learning [2] and shows that this learning algorithm can be further generalized to the notorious Flappy Bird.

Installation Dependencies:

  • Python 2.7 or 3
  • TensorFlow 0.7
  • pygame
  • OpenCV-Python

How to Run?

git clone https://github.com/yenchenlin1994/DeepLearningFlappyBird.git
cd DeepLearningFlappyBird
python deep_q_network.py

What is Deep Q-Network?

It is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards.

For those who are interested in deep reinforcement learning, I highly recommend to read the following post:

Demystifying Deep Reinforcement Learning

Deep Q-Network Algorithm

The pseudo-code for the Deep Q Learning algorithm, as given in [1], can be found below:

Initialize replay memory D to size N
Initialize action-value function Q with random weights
for episode = 1, M do
    Initialize state s_1
    for t = 1, T do
        With probability ϵ select random action a_t
        otherwise select a_t=max_a  Q(s_t,a; θ_i)
        Execute action a_t in emulator and observe r_t and s_(t+1)
        Store transition (s_t,a_t,r_t,s_(t+1)) in D
        Sample a minibatch of transitions (s_j,a_j,r_j,s_(j+1)) from D
        Set y_j:=
            r_j for terminal s_(j+1)
            r_j+γ*max_(a^' )  Q(s_(j+1),a'; θ_i) for non-terminal s_(j+1)
        Perform a gradient step on (y_j-Q(s_j,a_j; θ_i))^2 with respect to θ
    end for
end for

Experiments

Environment

Since deep Q-network is trained on the raw pixel values observed from the game screen at each time step, [3] finds that remove the background appeared in the original game can make it converge faster. This process can be visualized as the following figure:

Network Architecture

According to [1], I first preprocessed the game screens with following steps:

  1. Convert image to grayscale
  2. Resize image to 80x80
  3. Stack last 4 frames to produce an 80x80x4 input array for network

The architecture of the network is shown in the figure below. The first layer convolves the input image with an 8x8x4x32 kernel at a stride size of 4. The output is then put through a 2x2 max pooling layer. The second layer convolves with a 4x4x32x64 kernel at a stride of 2. We then max pool again. The third layer convolves with a 3x3x64x64 kernel at a stride of 1. We then max pool one more time. The last hidden layer consists of 256 fully connected ReLU nodes.

The final output layer has the same dimensionality as the number of valid actions which can be performed in the game, where the 0th index always corresponds to doing nothing. The values at this output layer represent the Q function given the input state for each valid action. At each time step, the network performs whichever action corresponds to the highest Q value using a ϵ greedy policy.

Training

At first, I initialize all weight matrices randomly using a normal distribution with a standard deviation of 0.01, then set the replay memory with a max size of 500,00 experiences.

I start training by choosing actions uniformly at random for the first 10,000 time steps, without updating the network weights. This allows the system to populate the replay memory before training begins.

Note that unlike [1], which initialize ϵ = 1, I linearly anneal ϵ from 0.1 to 0.0001 over the course of the next 3000,000 frames. The reason why I set it this way is that agent can choose an action every 0.03s (FPS=30) in our game, high ϵ will make it flap too much and thus keeps itself at the top of the game screen and finally bump the pipe in a clumsy way. This condition will make Q function converge relatively slow since it only start to look other conditions when ϵ is low. However, in other games, initialize ϵ to 1 is more reasonable.

During training time, at each time step, the network samples minibatches of size 32 from the replay memory to train on, and performs a gradient step on the loss function described above using the Adam optimization algorithm with a learning rate of 0.000001. After annealing finishes, the network continues to train indefinitely, with ϵ fixed at 0.001.

FAQ

Checkpoint not found

Change first line of saved_networks/checkpoint to

model_checkpoint_path: "saved_networks/bird-dqn-2920000"

How to reproduce?

  1. Comment out these lines

  2. Modify deep_q_network.py's parameter as follow:

OBSERVE = 10000
EXPLORE = 3000000
FINAL_EPSILON = 0.0001
INITIAL_EPSILON = 0.1

References

[1] Mnih Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level Control through Deep Reinforcement Learning. Nature, 529-33, 2015.

[2] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with Deep Reinforcement Learning. NIPS, Deep Learning workshop

[3] Kevin Chen. Deep Reinforcement Learning for Flappy Bird Report | Youtube result

Disclaimer

This work is highly based on the following repos:

  1. [sourabhv/FlapPyBird] (https://github.com/sourabhv/FlapPyBird)
  2. asrivat1/DeepLearningVideoGames
Owner
Yen-Chen Lin
PhD student at MIT CSAIL
Yen-Chen Lin
Pygame for humans (pip install hooman) (25k+ downloads)

hooman ~ pygame for humans pip install hooman join discord: https://discord.gg/Q23ATve The package for clearer, shorter and cleaner PyGame codebases!

Abdur-Rahmaan Janhangeer 31 Nov 08, 2022
Experimental Brawl Stars v37.222 server emulator written in Python.

Brawl Stars v37 Experimental Brawl Stars v37.222 server emulator written in Python. Requirements: Python 3.7 or higher colorama Running the server In

13 Oct 08, 2021
Warden - Warden guessing game 1

Warden first python project and first posted project sorry for errors warden gue

hasher 3 Jan 09, 2022
Netskrafl - an Icelandic crossword game website

Netskrafl - an Icelandic crossword game website English summary This repository contains the implementation of an Icelandic crossword game in the genr

Miðeind ehf 30 May 09, 2022
A set of functions compatible with the TIC-80 platform

Pygame-80 A set of functions from TIC-80 tiny computer platform ported to Pygame 2.0.1. Many of them are designed to work with the NumPy library to im

7 Aug 08, 2022
EL JUEGO DEL GUSANITO

EL JUEGO DEL GUSANITO El juego consiste en una línea que no para de moverse, el usuario lo controla con las flechas de: → derecha ← izquierda ↑ arriba

Valeria Saidid Miranda Ibarra 0 Dec 19, 2021
2D ping pong game

pingpong 2D Ping Pong game How to play: player 1 w To move up s To move Down player 2 up To move up down To move Down To change the game settings, you

menachem 0 Mar 27, 2022
Ghdl-interactive-sim - Interactive GHDL simulation of a VHDL adder using Python, Cocotb, and pygame

GHDL Interactive Simulation This is an interactive test bench for a simple VHDL adder. It uses GHDL to elaborate/run the simulation. It is coded in Py

Chuck Benedict 2 Aug 11, 2022
Pokemon game made in Python with open ended requirements from Codecademy

Pokemon game made in Python with open ended requirements from Codecademy. This is one of my first projects utilizing OOP and classes! -This game is a

Kevin Guerrero 2 Dec 29, 2021
Easily manage wine prefixes in a new way. Run Windows software and games on Linux

Bottles Easily manage wineprefix using environments Documentation · Forums · Telegram group · Funding 📚 Documentation Before opening a new issue, che

Bottles 4.1k Jan 09, 2023
🐥Flappy Birds🐤 Video game. With your help I can go through🚀 the pipes. All UI is made with 🐍Pygame🐍

🐠 Flappy Fish 🐢 I am Flappy Fish 🐟 . With your help I can jump through the pipes and experience an interesting and exciting flight deep into the fi

MohammadReza 2 Jan 14, 2022
Automatically prepare your Minecraft maps for release

map-prepare Automatically prepare Mineraft map for release. Current state: kinda works Make sure you have backups for your world before running this p

11 Oct 11, 2022
Box - a world simulator written in python with pygame

Box is a world simulator written in python with pygame. Features A world generation system A world editor Simulates creatures called boxlanders. You c

1up Community 3 Nov 14, 2022
A quantum version of Ladders and Snakes

QPath-and-Snakes A quantum version of Ladders and Snakes Desarrollo Para continuar el desarrollo sin pensar en instalación de dependencias: Descargue

2 Oct 22, 2021
Use different orders of N-gram model to play Hangman game.

Hangman game The Hangman game is a game whereby one person thinks of a word, which is kept secret from another person, who tries to guess the word one

ZavierYang 4 Oct 11, 2022
The Sinclair ZX Spectrum BASIC compiler!

ZX BASIC Copyleft (K) 2008, Jose Rodriguez-Rosa (a.k.a. Boriel) http://www.boriel.com All files in this project are covered under the GPLv3 LICENSE ex

Jose Rodriguez 143 Dec 13, 2022
Minecraft.nix - Command line Minecraft launcher managed by nix

minecraft.nix Inspired by this thread, this flake contains derivations of both v

12 Sep 06, 2022
Never get booted from a game for inactivity ever again

Anti AFK Bot Never get booted from a game for inactivity ever again! Built With Python Installation Clone the repo git clone https://github.com/lippie

1 Dec 05, 2021
This is a repository created to run a workshop on Game Theory using the programming language Python and more specifically an open-source software called the Axelrod Python library

Game-Theory-and-Python This is a repository created to run a workshop on Game Theory using the programming language Python and more specifically an op

Nikoleta Glynatsi 136 Dec 01, 2022
A game developed while learning python

Alien_Invasion a game developed while learning python you must have python-3 installed in your computer. and pygame module is also required for this.

Jani Shubham 0 Oct 10, 2022