U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

Last update: Jan 06, 2022

Overview

U-Net Implementation

By Christopher Ley

This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

This data set is a Binary Segmentation exercise of ~400 test images of cars from various angles such as those shown here:

Initial implementation for Binary Segmentation

The implementation performs almost as the winners of the competition (Dice: 0.9926 vs 0.99733) after only 5 epoch and we would expect the results to be as good as the winners using this architecture with more training and a little tweaking of the training hyper-parameters.

Here are the scores for training over 5 epochs by running:

(DeepLearning): python3 train.py

Training Results

0%|          | 0/540 [00:00<?, ?it/s]Accuracy: 103298971/467927040 = 22.08%
Dice score: 0.36127230525016785
100%|██████████| 540/540 [05:59<00:00,  1.50it/s, loss=0.0949]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:39_epoch_0.pth.tar
Accuracy: 460498379/467927040 = 98.41%
Dice score: 0.9652246236801147
100%|██████████| 540/540 [05:59<00:00,  1.50it/s, loss=0.0469]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:48_epoch_1.pth.tar
Accuracy: 461809183/467927040 = 98.69%
Dice score: 0.9711439609527588
100%|██████████| 540/540 [05:56<00:00,  1.51it/s, loss=0.0283]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_12:56_epoch_2.pth.tar
Accuracy: 465675737/467927040 = 99.52%
Dice score: 0.9891990423202515
100%|██████████| 540/540 [06:00<00:00,  1.50it/s, loss=0.0194]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_13:04_epoch_3.pth.tar
Accuracy: 465397979/467927040 = 99.46%
Dice score: 0.9878408908843994
100%|██████████| 540/540 [06:00<00:00,  1.50it/s, loss=0.0142]
==> Saving Checkpoint to: ./checkpoints/checkpoint_2022-01-06_13:12_epoch_4.pth.tar
Accuracy: 466399501/467927040 = 99.67%
Dice score: 0.9926225543022156

And an example of the output vs the ground truth of the validation set, I removed whole makes for the validation set, all 16 angles, the network had never seen this particular make from any angle.

Ground Truth

Prediction

Although limited in scope (binary segmentation for only cars), this architecture performs well with multiclass segmentation, I extended this to apply segmentation to the NYUv2 which is a multiclass objective, with little modification to the above code.

I will clean this up and upload the results and modifications soon!

U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

Related tags

Overview

U-Net Implementation

By Christopher Ley

Initial implementation for Binary Segmentation

Training Results

Ground Truth

Prediction

Owner

Christopher Ley

ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

Adversarial Learning for Semi-supervised Semantic Segmentation, BMVC 2018

Codes for AAAI22 paper "Learning to Solve Travelling Salesman Problem with Hardness-Adaptive Curriculum"

Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Combine Tacotron2 and Hifi GAN to generate speech from text

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Llvlir - Low Level Variable Length Intermediate Representation

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

✔️ Visual, reactive testing library for Julia. Time machine included.

MetaBalance: High-Performance Neural Networks for Class-Imbalanced Data

Trained on Simulated Data, Tested in the Real World

网络协议2天集训

CSD: Consistency-based Semi-supervised learning for object Detection

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

PyVideoAI: Action Recognition Framework

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

Topic Modelling for Humans

Episodic-memory - Ego4D Episodic Memory Benchmark

Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.

DziriBERT: a Pre-trained Language Model for the Algerian Dialect