TensorFlow implementation of ENet, trained on the Cityscapes dataset.

Last update: Dec 16, 2022

Overview

segmentation

TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e-lab/ENet-training) and the Keras implementation by PavlosMelissinos (https://github.com/PavlosMelissinos/enet-keras), trained on the Cityscapes dataset (https://www.cityscapes-dataset.com/).

YouTube video of results (https://youtu.be/HbPhvct5kvs):
The results in the video could be improved, but because of limited computing resources (a personally funded Azure VM) I did not perform any further hyperparameter tuning.

You might get the error "No gradient defined for operation 'MaxPoolWithArgmax_1' (op type: MaxPoolWithArgmax)". To fix this, I had to add the following code to the file /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_grad.py:

@ops.RegisterGradient("MaxPoolWithArgmax")
def _MaxPoolGradWithArgmax(op, grad, unused_argmax_grad):
  # Route the incoming gradient back through the argmax indices (op.outputs[1]).
  return gen_nn_ops._max_pool_grad_with_argmax(
      op.inputs[0], grad, op.outputs[1], op.get_attr("ksize"),
      op.get_attr("strides"), padding=op.get_attr("padding"))
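
To illustrate what this registered gradient computes conceptually, here is a minimal pure-Python sketch of max pooling with argmax and its backward pass on a single-channel 2-D input. It is not TensorFlow's kernel: the function names max_pool_with_argmax and max_pool_grad_with_argmax are hypothetical, and the flat index used here (row * width + column) ignores the batch and channel dimensions that the real op also encodes.

```python
def max_pool_with_argmax(x, ksize=2, stride=2):
    """Max-pool a 2-D list of lists; return pooled values and flat argmax indices."""
    height, width = len(x), len(x[0])
    pooled, argmax = [], []
    for top in range(0, height - ksize + 1, stride):
        pooled_row, argmax_row = [], []
        for left in range(0, width - ksize + 1, stride):
            best_val, best_idx = None, None
            for r in range(top, top + ksize):
                for c in range(left, left + ksize):
                    if best_val is None or x[r][c] > best_val:
                        best_val, best_idx = x[r][c], r * width + c
            pooled_row.append(best_val)
            argmax_row.append(best_idx)
        pooled.append(pooled_row)
        argmax.append(argmax_row)
    return pooled, argmax


def max_pool_grad_with_argmax(input_shape, grad, argmax):
    """Scatter each upstream gradient to the input position that produced the max."""
    height, width = input_shape
    grad_input = [[0.0] * width for _ in range(height)]
    for grad_row, argmax_row in zip(grad, argmax):
        for g, flat_idx in zip(grad_row, argmax_row):
            r, c = divmod(flat_idx, width)
            grad_input[r][c] += g
    return grad_input


x = [[1, 5, 2, 0],
     [3, 4, 8, 1],
     [7, 0, 1, 2],
     [6, 2, 3, 9]]
pooled, argmax = max_pool_with_argmax(x)
upstream = [[1.0, 2.0], [3.0, 4.0]]
grad_input = max_pool_grad_with_argmax((4, 4), upstream, argmax)
```

Only the four positions that held a window maximum receive a non-zero gradient; every other input position gets zero. ENet's decoder reuses the same argmax indices for unpooling, which is why the op appears in the training graph and needs a gradient.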

Documentation:

preprocess_data.py:

ASSUMES: that all Cityscapes training (validation) image directories have been placed in data_dir/cityscapes/leftImg8bit/train (data_dir/cityscapes/leftImg8bit/val) and that all corresponding ground truth directories have been placed in data_dir/cityscapes/gtFine/train (data_dir/cityscapes/gtFine/val).
DOES: script for performing all necessary preprocessing of images and labels.
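
Before running preprocess_data.py, it can help to verify that the directory layout described above is in place. The following is a small sketch of such a check; the helper name missing_cityscapes_dirs and the "data" path in the usage lines are placeholders and not part of the repo.

```python
import os


def missing_cityscapes_dirs(data_dir):
    """Return the expected Cityscapes directories that do not exist under data_dir."""
    expected = [
        os.path.join(data_dir, "cityscapes", kind, split)
        for kind in ("leftImg8bit", "gtFine")   # images, ground truth
        for split in ("train", "val")
    ]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # "data" is a placeholder; use the data_dir that preprocess_data.py expects.
    for path in missing_cityscapes_dirs("data"):
        print("missing directory:", path)
```

An empty result means all four directories are present; it does not check that the city subdirectories inside them are complete.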

model.py:

ASSUMES: that preprocess_data.py has already been run.
DOES: contains the ENet_model class.

utilities.py:

ASSUMES: -
DOES: contains a number of functions used in different parts of the project.

train.py:

ASSUMES: that preprocess_data.py has already been run.
DOES: script for training the model.

run_on_sequence.py:

ASSUMES: that preprocess_data.py has already been run.
DOES: runs a model checkpoint (set in line 56) on all frames in a Cityscapes demo sequence directory (set in line 30) and creates a video of the result.

Training details:

In the paper, the authors suggest first pretraining the encoder to categorize downsampled regions of the input images. I did, however, train the entire network from scratch.
Batch size: 4.
For all other hyperparameters I used the same values as in the paper.
Training loss:
Validation loss:
The results in the video above were obtained with the model at epoch 23, for which a checkpoint is included in segmentation/training_logs/best_model in the repo.
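
For a rough sense of how much training the epoch-23 checkpoint represents, the number of gradient steps can be estimated from the batch size. This is only a back-of-the-envelope sketch under stated assumptions: it takes the training set to be the 2975 finely annotated Cityscapes training images, assumes the last partial batch of each epoch is kept, and assumes epochs are counted from 1. train.py may differ on any of these.

```python
BATCH_SIZE = 4            # from the training details above
NUM_TRAIN_IMAGES = 2975   # assumption: Cityscapes fine-annotation training split
CHECKPOINT_EPOCH = 23     # epoch of the checkpoint in training_logs/best_model


def steps_per_epoch(num_images, batch_size):
    """Ceiling division: number of batches needed to see every image once."""
    return (num_images + batch_size - 1) // batch_size


steps = steps_per_epoch(NUM_TRAIN_IMAGES, BATCH_SIZE)
total_steps = steps * CHECKPOINT_EPOCH
print("steps per epoch:", steps)
print("steps up to the checkpoint:", total_steps)
```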

Training on Microsoft Azure:

To train the model, I used an NC6 virtual machine on Microsoft Azure. Below I have listed what I needed to do in order to get started, and some things I found useful. For reference, my username was 'fregu856':

Download Cityscapes.
Install docker-ce:
- $ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
- $ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
- $ sudo apt-get update
- $ sudo apt-get install -y docker-ce
Install CUDA drivers (see "Install CUDA drivers for NC VMs" in https://docs.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup):
- $ CUDA_REPO_PKG=cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
- $ wget -O /tmp/${CUDA_REPO_PKG} http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/${CUDA_REPO_PKG}
- $ sudo dpkg -i /tmp/${CUDA_REPO_PKG}
- $ rm -f /tmp/${CUDA_REPO_PKG}
- $ sudo apt-get update
- $ sudo apt-get install cuda-drivers
- Reboot the VM
Install nvidia-docker:
- $ wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
- $ sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
- $ sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
Download the latest TensorFlow docker image with GPU support (tensorflow 1.3):
- $ sudo docker pull tensorflow/tensorflow:latest-gpu
Create start_docker_image.sh containing:

#!/bin/bash

# DEFAULT VALUES
GPUIDS="0"
NAME="fregu856_GPU"


NV_GPU="$GPUIDS" nvidia-docker run -it --rm \
        -p 5584:5584 \
        --name "$NAME""$GPUIDS" \
        -v /home/fregu856:/root/ \
        tensorflow/tensorflow:latest-gpu bash

/root/ will now be mapped to /home/fregu856 (i.e., $ cd -- takes you to the regular home folder).
To start the image:
- $ sudo sh start_docker_image.sh
To commit changes to the image:
- Open a new terminal window.
- $ sudo docker commit fregu856_GPU0 tensorflow/tensorflow:latest-gpu
To stop the image when it’s running:
- $ sudo docker stop fregu856_GPU0
To exit the image without killing running code:
- Ctrl-P + Q
To get back into a running image:
- $ sudo docker attach fregu856_GPU0
To open more than one terminal window at the same time:
- $ sudo docker exec -it fregu856_GPU0 bash
To install the needed software inside the docker image:
- $ apt-get update
- $ apt-get install nano
- $ apt-get install sudo
- $ apt-get install wget
- $ sudo apt-get install libopencv-dev python-opencv
- Commit changes to the image (otherwise, the installed packages will be removed at exit!)
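
The container name used in the commit, stop, attach and exec commands above (fregu856_GPU0) comes from start_docker_image.sh concatenating NAME and GPUIDS. The sketch below rebuilds the same nvidia-docker invocation in Python 3 to make that explicit. The function build_docker_command and its parameters are illustrative only and not part of the repo; the default values are copied from the script.

```python
import shlex


def build_docker_command(gpu_ids="0", name="fregu856_GPU",
                         home_dir="/home/fregu856",
                         image="tensorflow/tensorflow:latest-gpu",
                         port=5584):
    """Return (env, argv) mirroring the nvidia-docker call in start_docker_image.sh."""
    env = {"NV_GPU": gpu_ids}                 # GPUs exposed to the container
    argv = [
        "nvidia-docker", "run", "-it", "--rm",
        "-p", "{0}:{0}".format(port),         # host port : container port
        "--name", name + gpu_ids,             # e.g. fregu856_GPU + 0
        "-v", "{}:/root/".format(home_dir),   # map the home folder to /root/
        image, "bash",
    ]
    return env, argv


env, argv = build_docker_command()
print("NV_GPU={} {}".format(env["NV_GPU"],
                            " ".join(shlex.quote(arg) for arg in argv)))
```

Passing a different gpu_ids value changes both the NV_GPU variable and the container name, so the docker stop, attach and exec commands would need the matching name.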

Owner

Fredrik Gustafsson
