Deep Learning agent of Starcraft2, similar to AlphaStar of DeepMind except size of network.

Overview

Introduction

This repository is for Deep Learning agent of Starcraft2. It is very similar to AlphaStar of DeepMind except size of network. I only test my code with Minigame, Simple64 map of PySC2. However, I am sure this code will work at more large scale game if network size is grown.

I am going to implement the IMPALA method soon for training the full game using RL.

Reference

  1. Download replay file(4.8.2 version file is needed): https://github.com/Blizzard/s2client-proto/tree/master/samples/replay-api
  2. Extracting observation, action from replay file: https://github.com/narhen/pysc2-replay
  3. FullyConv model of Tensorflow 1 version: https://github.com/simonmeister/pysc2-rl-agents
  4. Supervised Learning technique: https://github.com/metataro/sc2_imitation_learning/tree/8dca03e9be92e2d8297a4bc34248939af5c7ec3b

Version

Python

  1. Python3
  2. PySC2 3.0.0: https://github.com/deepmind/pysc2
  3. Tensorflow-gpu 2.3.0
  4. Tensorflow-probability 0.11.0
  5. Hickle 4.0.4
  6. Pygame 1.9.6
  7. Sklearn

Starcraft2

  1. Client 4.8.2: https://github.com/Blizzard/s2client-proto#downloads
  2. Replay 4.8.2

PC capaticy

  1. One NVIDIA Titan V
  2. 32GB RAM

Network architecture

Notice

There may be a minor error such a GPU setting, and network size. However, you can run it without major modification because I check that latest code works for Superviesed, Reinforcment Learning. It is not easy to check every part of code because it is huge.

Supervised Learning

I can only check that model with LSTM works well in Supervised Learning. FullyConv model does not show good performance yet although it fast then LSTM model for training.

Simple64

To implement AlphaStar susuccessfully, Supervised Training is crucial. Instead of using the existing replay data to check simple network of mine, I collect amount of 1000 number of replay files in Simple64 map using only Terran, and Marine rush from two Barrack with Random race opponent.

First, change a Starcraft2 replay file to hkl file format for fast training. It will remove a step of no_op action except when it is occured at first, end of episode and 8 dividble step. You need a around 80GB disk space to convert number of around 1000 replay files to hkl. Current, I only use replay file of Terran vs Terran.

$ python trajectory_generator.py --replay_path [your path]/StarCraftII/Replays/local_Simple64/ --saving_path [your path]/pysc2_dataset/simple64

After making hkl file of replay in your workspace, try to start the Supervised Learning using below command. It will save a trained model under Models folder of your workspace.

$ python run_supervised_learning.py --workspace_path [your path]/AlphaStar_Implementation/ --model_name alphastar --training True --gpu_use True --learning_rate 0.0001 --replay_hkl_file_path [your path]/pysc2_dataset/simple64/ --environment Simple64 --model_name alphastar

You can check training progress using Tensorboard under tensorboard folder of your workspace. It will take very long time to finish training becasue of vast of observation and action space.

Below is code for evaluating trained model

python run_evaluation.py --workspace_path [your path]/AlphaStar_Implementation/ --gpu_use True --visualize True --environment Simple64 --pretrained_model supervised_model

Video of downisde is one of behavior example of trained agent.

Supervised Learning demo Click to Watch!

I only use a replay file of Terran vs Terran case. Therefore, agent only need to recognize 19 unit during game. It can make the size of model do not need to become huge. Total unit number of Starcraft 2 is over 100 in full game case. For that, we need more powerful GPU to run.

Reinforcement Learning

I can only check that FullyConv works well in Reinforcement Learning. Model with LSTM takes too much time for training and does not show better performance than FullyConv yet.

MoveToBeacon

First, let's test the sample code for MoveToBeacon environment which is the simplest environment in PySC2 using model which has similar network structure as AlphaStar. First, run 'git clone https://github.com/kimbring2/AlphaStar_Implementation.git' command in your workspace. Next, start training by using below command.

$ python run_reinforcement_learning.py --workspace_path [your path]/AlphaStar_Implementation/ --training True --gpu_use True --save_model True --num_worker 5 --model_name alphastar

I provide a FullyConv, AlphaStar style model. You can change a model by using the model_name argument. Default is FullyConv model.

After the training is completed, test it using the following command. Training performance is based on two parameter. Try to use a 1.0 as the gradient_clipping and 0.0001 as the learning_rate. Futhermore, trarning progress and result are depends on the seed value. Model is automatically saved if the average reward is over 5.0.

Gradient clipping is essential for training the model of PySC2 because it has multiple stae encoder, action head network. In my experience, gradient norm value is changed based on network size. Therefore, you should check it everytime you change model structure. You can check it by using 'tf.linalg.global_norm' function.

grads = tape.gradient(loss, model.trainable_variables)
grad_norm = tf.linalg.global_norm(grads)
tf.print("grad_norm: ", grad_norm)
grads, _ = tf.clip_by_global_norm(grads, arguments.gradient_clipping)

Afater checking norm value, you should remove an outlier value among them.

After training against various parameter, I can obtain the following graph of average score.

After finishing training, run below command to test pretrained model that was saved under Models folder of workspace.

$ python run_evaluation.py --environment Simple64 --workspace_path [your path]/AlphaStar_Implementation --visualize True --model_name alphastar --pretrained_model reinforcement_model

If the accumulated reward is over 20 per episode, you can see the Marine follow the beacon well.

Detailed information

I am writing explanation for code at Medium as series.

  1. Tutorial about Replay file: https://medium.com/@dohyeongkim/alphastar-implementation-serie-part1-606572ddba99
  2. Tutorial about Network: https://dohyeongkim.medium.com/alphastar-implementation-series-part5-fd275bea68b5
  3. Tutorial about Reinforcement Learning: https://medium.com/nerd-for-tech/alphastar-implementation-series-part6-4044e7efb1ce
  4. Tutorial about Supervised Learning: https://dohyeongkim.medium.com/alphastar-implementation-series-part7-d28468c07739

License

Apache License 2.0

Comments
  • Map 'mini_games\MoveToBeacon.SC2Map' not found.

    Map 'mini_games\MoveToBeacon.SC2Map' not found.

    run this project,There is a problem:‘Map 'mini_games\MoveToBeacon.SC2Map' not found.’ I'm sorry to bother you, but I don't know why. Hope you can answer .Thanks

    opened by ashaokai123 6
  • Failed on running trajectory_generator.py: RuntimeError SC2_x64

    Failed on running trajectory_generator.py: RuntimeError SC2_x64

    when I tried to run python trajectory_generator.py, I got error messages below and got nothing in the saving_path pysc2_dataset/simple64.

    RuntimeError: Trying to run '/home/auto/StarCraftII/Versions/Base71663/SC2_x64', but it isn't executable.

    opened by mlx3223mlx 3
  • Failed on running trajectory_generator.py: Could not find map name

    Failed on running trajectory_generator.py: Could not find map name

    I did download replay files from : https://drive.google.com/drive/folders/1lqb__ubLKLfw4Jiig6KsO-D0e_wrnGWk?usp=sharing, but when I tried to run python trajectory_generator.py --replay_path [your path]/StarCraftII/Replays/local_Simple64/ --saving_path [your path]/pysc2_dataset/simple64, I got error messages below and got nothing in the saving_path pysc2_dataset/simple64.

    OpenGL initialized! Listening on: 127.0.0.1:18148 Startup Phase 3 complete. Ready for commands. ConnectHandler: Request from 127.0.0.1:37386 accepted ReadyHandler: 127.0.0.1:37386 ready Could not find map name for file: /tmp/sc-k92ku45y/StarCraft II/TempReplayInfo.SC2Replay Configuring interface options Configure: raw interface enabled Configure: feature layer interface enabled Configure: score interface enabled Configure: render interface disabled Launching next game. Next launch phase started: 2 Next launch phase started: 3 Next launch phase started: 4 Next launch phase started: 5 Next launch phase started: 6 Next launch phase started: 7 Next launch phase started: 8 Starting replay 'TempStartReplay.SC2Replay' Game has started. Using default stable ids, none found at: /home/dev/SC2.4.8.2/StarCraftII/stableid.json Successfully loaded stable ids: GameData\stableid.json Could not find map name for file: /tmp/sc-k92ku45y/StarCraft II/TempReplayInfo.SC2Replay player1_race fail Could not find map name for file: /tmp/sc-k92ku45y/StarCraft II/TempReplayInfo.SC2Replay Configuring interface options Configure: raw interface enabled Configure: feature layer interface enabled Configure: score interface enabled Configure: render interface disabled Launching next game. Next launch phase started: 2 Next launch phase started: 3 Next launch phase started: 4 Next launch phase started: 5 Next launch phase started: 6 Next launch phase started: 7 Next launch phase started: 8 Starting replay 'TempStartReplay.SC2Replay' Game has started. Could not find map name for file: /tmp/sc-k92ku45y/StarCraft II/TempReplayInfo.SC2Replay player1_race fail Could not find map name for file: /tmp/sc-k92ku45y/StarCraft II/TempReplayInfo.SC2Replay

    OS version is ubuntu16.04, python version is 3.7.7 and other dependencies are installed according to README.

    opened by zhang-yingping 3
  • about the spatial encoder

    about the spatial encoder

    according https://ychai.uk/notes/2019/07/21/RL/DRL/Decipher-AlphaStar-on-StarCraft-II/ , the spatial encoder may be not consistant with the description of the paper presented below:

    Spatial encoder Inputs: map, entity_embeddings Outputs: embedded_spatial - A 1D tensor of the embedded map map_skip - output tensors of intermediate computation, used for skip connections. map: add two features

    cameral: whether a location is inside/outside the virtual camera; scattered entities. Pass entity_embeddings through a size 32 conv1D followed by a ReLU, then scattered into a map layer so that the 32 vector at a specific location corresponds to the units placed there. Concatenated all planes including camera, scattered_entities, vasibility, entity_owners, buildable, etc. Project to 32 channels by 2D conv with kernel size 1, followed by a ReLU. Then downsampled from 128x128 to 16x16 through 3 conv2D and ReLUs with different channel sizes (i.e., 64, 128, and 128).

    embedded_spatial: The ResBlock output is embedded into a 1D tensor of size 256 by a MLP and a ReLU.

    opened by SongleChen2015 1
  • Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

    Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

    Hi there,

    When I ran the reinforcement learning, the program was interrupted with the exit code 137 (interrupted by signal 9: SIGKILL),

    I found that the memory of the RAM was increasing in the Reinforcement Learning training process, and the training was interrupted when the memory was over 100%.

    Step 320 image

    Step 400 image

    Thank you for your help.

    opened by HenryCY 0
  • IndexError: list index out of range

    IndexError: list index out of range

    Traceback (most recent call last): File "C:\Users\JACK\Desktop\AlphaStar_Implementation\run_reinforcement_learning.py", line 77, in tf.config.experimental.set_memory_growth(physical_devices[0], True) IndexError: list index out of range

    opened by JBX2010 0
Releases(v1.0.0)
Owner
Dohyeong Kim
Researchers interested in creating agents that behave like humans using Deep Learning
Dohyeong Kim
Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.

Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.

Nikolas Petrou 1 Jan 13, 2022
Python binding for Khiva library.

Khiva-Python Build Documentation Build Linux and Mac OS Build Windows Code Coverage README This is the Khiva Python binding, it allows the usage of Kh

Shapelets 46 Oct 16, 2022
A Novel Plug-in Module for Fine-grained Visual Classification

Pytorch implementation for A Novel Plug-in Module for Fine-Grained Visual Classification. fine-grained visual classification task.

ChouPoYung 109 Dec 20, 2022
(CVPR2021) DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation CVPR2021(oral) [arxiv] Requirements python3.7 pytorch==

W-zx-Y 85 Dec 07, 2022
Official Implementation of SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

Official Implementation of SimIPU SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations Since

Zhyever 37 Dec 01, 2022
Code repository for "Reducing Underflow in Mixed Precision Training by Gradient Scaling" presented at IJCAI '20

Reducing Underflow in Mixed Precision Training by Gradient Scaling This project implements the gradient scaling method to improve the performance of m

Ruizhe Zhao 5 Apr 14, 2022
2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation Authors: Ge-Peng Ji*, Yu-Cheng Chou*, Deng-Ping Fan, Geng Che

Ge-Peng Ji (Daniel) 85 Dec 30, 2022
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

mlpack 4.2k Jan 09, 2023
Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation.

Distant Supervision for Scene Graph Generation Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation. Introduction The pape

THUNLP 23 Dec 31, 2022
A general framework for deep learning experiments under PyTorch based on pytorch-lightning

torchx Torchx is a general framework for deep learning experiments under PyTorch based on pytorch-lightning. TODO list gan-like training wrapper text

Yingtian Liu 6 Mar 17, 2022
A high-level Python library for Quantum Natural Language Processing

lambeq About lambeq is a toolkit for quantum natural language processing (QNLP). Documentation: https://cqcl.github.io/lambeq/ Getting started Prerequ

Cambridge Quantum 315 Jan 01, 2023
PyTorch implementation of the YOLO (You Only Look Once) v2

PyTorch implementation of the YOLO (You Only Look Once) v2 The YOLOv2 is one of the most popular one-stage object detector. This project adopts PyTorc

申瑞珉 (Ruimin Shen) 433 Nov 24, 2022
Resources for the Ki testnet challenge

Ki Testnet Challenge This repository hosts ki-testnet-challenge. A set of scripts and resources to be used for the Ki Testnet Challenge What is the te

Ki Foundation 23 Aug 08, 2022
UPSNet: A Unified Panoptic Segmentation Network

UPSNet: A Unified Panoptic Segmentation Network Introduction UPSNet is initially described in a CVPR 2019 oral paper. Disclaimer This repository is te

Uber Research 622 Dec 26, 2022
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps[AAAI2021]

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps Here is the code for ssbassline model. We also provide OCR results/features/mode

ZephyrZhuQi 51 Nov 18, 2022
CSE-519---Project - Job Title Analysis (Project for CSE 519 - Data Science Fundamentals)

A Multifaceted Approach to Job Title Analysis CSE 519 - Data Science Fundamentals Project Description Project consists of three parts: Salary Predicti

Jimit Dholakia 1 Jan 04, 2022
Lorien: A Unified Infrastructure for Efficient Deep Learning Workloads Delivery

Lorien: A Unified Infrastructure for Efficient Deep Learning Workloads Delivery Lorien is an infrastructure to massively explore/benchmark the best sc

Amazon Web Services - Labs 45 Dec 12, 2022
A pytorch implementation of Paper "Improved Training of Wasserstein GANs"

WGAN-GP An pytorch implementation of Paper "Improved Training of Wasserstein GANs". Prerequisites Python, NumPy, SciPy, Matplotlib A recent NVIDIA GPU

Marvin Cao 1.4k Dec 14, 2022
FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction

FaceExtraction FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction Occlusions often occur in face images in the wild, tr

16 Dec 14, 2022
Pytorch implementation of Value Iteration Networks (NIPS 2016 best paper)

VIN: Value Iteration Networks A quick thank you A few others have released amazing related work which helped inspire and improve my own implementation

Kent Sommer 297 Dec 26, 2022