DeepLabv3+: a TensorFlow 2 implementation of the Encoder-Decoder with Atrous Separable Convolution semantic segmentation model


Contents

  1. Performance
  2. Environment
  3. Notes
  4. Download
  5. Training steps
  6. Prediction steps
  7. Evaluation steps (mIoU)
  8. Reference

Performance

Training dataset   Weights file               Test dataset   Input size   mIoU (%)
VOC12+SBD          deeplabv3_mobilenetv2.h5   VOC-Val12      512x512      72.50
VOC12+SBD          deeplabv3_xception.h5      VOC-Val12      512x512      87.10

Environment

tensorflow==2.2.0
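
A quick way to confirm the pinned version before training is to compare the installed package version against the requirement. The sketch below uses only the standard library. The helper name check_requirement is hypothetical and is not part of the repository.

```python
from importlib import metadata


def check_requirement(package, required, installed=None):
    """Return (ok, found) for an exact-version pin such as tensorflow==2.2.0.

    `installed` can be passed explicitly (useful for testing); otherwise the
    version is looked up in the current environment.
    """
    if installed is None:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            # The package is not installed at all.
            return False, None
    return installed == required, installed


if __name__ == "__main__":
    ok, found = check_requirement("tensorflow", "2.2.0")
    if ok:
        print("tensorflow 2.2.0 is installed")
    else:
        print(f"expected tensorflow==2.2.0, found {found}")
```

Running the check once at the top of a training script surfaces a version mismatch before any weights are loaded.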

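Notes

The deeplabv3_mobilenetv2.h5 and deeplabv3_xception.h5 weights used in the code were trained on the extended VOC dataset (VOC12+SBD). Remember to set backbone to match the weights for both training and prediction.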

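Download

The deeplabv3_mobilenetv2.h5 and deeplabv3_xception.h5 weights needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1zVRshWRkb5C3kmDMwEf89A Extraction code: ccq5

The extended VOC dataset is also available on Baidu Netdisk:
Link: https://pan.baidu.com/s/1BrR7AUM1XJvPWjKMIy2uEw Extraction code: vszf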

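Training steps

a. Training on the VOC dataset

1. Place the provided VOC dataset in VOCdevkit (there is no need to run voc_annotation.py).
2. Set the parameters in train.py. The defaults already match the VOC dataset, so only backbone and model_path need to be changed.
3. Run train.py to start training.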
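
Since backbone and model_path have to agree, a small consistency check can catch a mismatch before training starts. The following is a minimal sketch: the weight file names and the model_data folder come from this README, while the PRETRAINED_WEIGHTS mapping and the weights_match_backbone helper are hypothetical and not part of train.py.

```python
import os

# Weight file names as listed in this README; the mapping itself is illustrative.
PRETRAINED_WEIGHTS = {
    "mobilenet": "deeplabv3_mobilenetv2.h5",
    "xception": "deeplabv3_xception.h5",
}


def weights_match_backbone(backbone, model_path):
    """Return True if model_path points at the weight file listed for backbone."""
    if backbone not in PRETRAINED_WEIGHTS:
        raise ValueError(f"unknown backbone: {backbone!r}")
    return os.path.basename(model_path) == PRETRAINED_WEIGHTS[backbone]


# Example settings for the xception backbone.
backbone = "xception"
model_path = "model_data/deeplabv3_xception.h5"
assert weights_match_backbone(backbone, model_path)
```

The check only applies to the two provided weight files. Weights you train yourself will have other names and would need a different rule.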

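b. Training on your own dataset

1. Training uses the VOC format.
2. Before training, put the label files in SegmentationClass under VOCdevkit/VOC2007.
3. Before training, put the image files in JPEGImages under VOCdevkit/VOC2007.
4. Before training, run voc_annotation.py to generate the corresponding txt files.
5. In train.py, choose the backbone and the downsampling factor. The available backbones are mobilenet and xception, and the downsampling factor can be 8 or 16. Note that the pretrained weights must match the chosen backbone.
6. Set num_classes in train.py to the number of classes plus 1.
7. Run train.py to start training.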
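
To illustrate what generating the txt files in step 4 involves, here is a minimal sketch of a train/validation split over the folders from steps 2 and 3. It is not the repository's voc_annotation.py, which remains the tool to use. The function name, the 9:1 ratio, the train.txt and val.txt file names, the .png and .jpg extensions, and the output folder are all assumptions made for this example.

```python
import os
import random


def write_split_files(voc_root, out_dir, train_ratio=0.9, seed=0):
    """List labelled samples and write train.txt / val.txt, one sample ID per line."""
    label_dir = os.path.join(voc_root, "SegmentationClass")
    image_dir = os.path.join(voc_root, "JPEGImages")

    # A sample is usable only if both the label and the image exist.
    ids = []
    for name in sorted(os.listdir(label_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".png":
            continue
        if os.path.exists(os.path.join(image_dir, stem + ".jpg")):
            ids.append(stem)

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_ratio)
    splits = {"train.txt": ids[:n_train], "val.txt": ids[n_train:]}

    os.makedirs(out_dir, exist_ok=True)
    for filename, subset in splits.items():
        with open(os.path.join(out_dir, filename), "w") as f:
            f.write("\n".join(subset) + ("\n" if subset else ""))
    return splits
```

A call such as write_split_files("VOCdevkit/VOC2007", "VOCdevkit/VOC2007/ImageSets/Segmentation") would write both files. The output path here is a guess at a typical VOC layout, so check voc_annotation.py for the location train.py actually reads.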

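Prediction steps

a. Using the pretrained weights

1. After downloading and unzipping the repository, run predict.py directly to predict with the mobilenet backbone. To predict with the xception backbone, download deeplabv3_xception.h5 from Baidu Netdisk, place it in model_data, change backbone and model_path in deeplab.py, then run predict.py and enter the image path when prompted:

img/street.jpg

The prediction is then produced.
2. Settings in predict.py enable FPS testing, prediction over a whole folder, and video detection.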
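
As a rough illustration of what an FPS test measures, the sketch below times repeated calls to a prediction function on a single input. It is not the code in predict.py. The names measure_fps and fake_predict are hypothetical, and fake_predict merely stands in for a real model call.

```python
import time


def measure_fps(predict_fn, image, n_runs=100):
    """Average frames per second of predict_fn over n_runs calls on one image."""
    predict_fn(image)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn(image)
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


def fake_predict(image):
    # Stand-in for the real model call; it just touches every value once.
    return [value * 2 for value in image]


fps = measure_fps(fake_predict, list(range(1000)), n_runs=50)
print(f"{fps:.1f} FPS")
```

The warm-up call matters with a real model, because the first inference usually includes one-off setup cost that would otherwise skew the average.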

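b. Using your own trained weights

1. Train the model by following the training steps.
2. In deeplab.py, edit model_path, num_classes, and backbone in the section below so that they match the trained weights: model_path points to the weights file under the logs folder, num_classes is the number of classes to predict plus 1, and backbone is the backbone feature-extraction network used.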

_defaults = {
    #----------------------------------------#
    #   model_path points to the weights file
    #   under the logs folder
    #----------------------------------------#
    "model_path"        : 'model_data/deeplabv3_mobilenetv2.h5',
    #----------------------------------------#
    #   Number of classes to distinguish, plus 1
    #----------------------------------------#
    "num_classes"       : 21,
    #----------------------------------------#
    #   Backbone network: mobilenet or xception
    #----------------------------------------#
    "backbone"          : "mobilenet",
    #----------------------------------------#
    #   Input image size
    #----------------------------------------#
    "input_shape"       : [512, 512],
    #----------------------------------------#
    #   Downsampling factor, usually 8 or 16;
    #   use the same value as in training
    #----------------------------------------#
    "downsample_factor" : 16,
    #--------------------------------#
    #   blend controls whether the
    #   segmentation result is blended
    #   with the original image
    #--------------------------------#
    "blend"             : True,
}

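3. Run predict.py and enter the image path when prompted:

img/street.jpg

The prediction is then produced.
4. Settings in predict.py enable FPS testing, prediction over a whole folder, and video detection.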
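
A defaults dictionary like _defaults is commonly combined with per-instance overrides. The sketch below shows one way to do that, under the assumption that settings are passed as keyword arguments. The class name SegmenterConfig and the path logs/my_weights.h5 are hypothetical, and the repository's deeplab.py may handle its settings differently.

```python
class SegmenterConfig:
    # Values copied from the _defaults block in step 2.
    _defaults = {
        "model_path": "model_data/deeplabv3_mobilenetv2.h5",
        "num_classes": 21,
        "backbone": "mobilenet",
        "input_shape": [512, 512],
        "downsample_factor": 16,
        "blend": True,
    }

    def __init__(self, **overrides):
        # Reject misspelled keys instead of silently ignoring them.
        unknown = set(overrides) - set(self._defaults)
        if unknown:
            raise KeyError(f"unknown settings: {sorted(unknown)}")
        # Start from the defaults, then apply the per-instance overrides.
        for name, value in {**self._defaults, **overrides}.items():
            setattr(self, name, value)


# Two foreground classes plus background gives num_classes = 3.
cfg = SegmenterConfig(model_path="logs/my_weights.h5", num_classes=3)
print(cfg.model_path, cfg.num_classes, cfg.backbone)
```

Rejecting unknown keys is a design choice in this sketch: a typo such as bakbone would otherwise leave the default backbone in place without any warning.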

Evaluation steps

1. Set num_classes in get_miou.py to the number of classes to predict plus 1.
2. Set name_classes in get_miou.py to the names of the classes to distinguish.
3. Run get_miou.py to obtain the mIoU.
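
For reference, mIoU is the mean over classes of TP / (TP + FP + FN), computed from a confusion matrix of ground-truth against predicted labels. The pure-Python sketch below illustrates that definition on flattened label lists. It is not the code in get_miou.py, and all function names are hypothetical. Predicted labels are assumed to lie in the range 0 to num_classes - 1.

```python
def confusion_matrix(gt, pred, num_classes):
    """hist[i][j] counts pixels whose true class is i and predicted class is j."""
    hist = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if 0 <= g < num_classes:  # skip out-of-range labels such as 255
            hist[g][p] += 1
    return hist


def per_class_iou(hist):
    """IoU per class: TP / (TP + FP + FN); NaN for classes that never occur."""
    ious = []
    for c in range(len(hist)):
        tp = hist[c][c]
        fn = sum(hist[c]) - tp
        fp = sum(row[c] for row in hist) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious):
    """Average the IoU values, ignoring classes that never occur."""
    valid = [v for v in ious if v == v]  # NaN != NaN, so absent classes drop out
    return sum(valid) / len(valid)


# Background plus two classes, so num_classes = 3.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(confusion_matrix(gt, pred, num_classes=3))
print(mean_iou(ious))  # per-class IoU is 1/3, 2/3, 1/2
```

The example also shows why num_classes is the number of foreground classes plus 1: the background class occupies index 0 and takes part in the average like any other class.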

Reference

https://github.com/ggyyzm/pytorch_segmentation
https://github.com/bonlime/keras-deeplab-v3-plus

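Releases (v3.0)
  • v3.0 (Apr 22, 2022)

    Major updates

    • Added support for step and cosine learning-rate decay schedules.
    • Added a choice between the adam and sgd optimizers.
    • Added automatic learning-rate adjustment based on batch_size.
    • Added selectable prediction modes: single image, folder, video, and image cropping.
    • Updated summary.py for viewing the network structure.
    • Added multi-GPU training.
  • v2.0 (Mar 4, 2022)

    Major updates

    • Updated train.py with extensive comments and several more adjustable parameters.
    • Updated predict.py with extensive comments and added FPS testing, video prediction, and batch prediction.
    • Updated deeplab.py with extensive comments and added parameters such as anchor box selection, confidence, and non-maximum suppression.
    • Merged get_dr_txt.py, get_gt_txt.py, and get_map.py so that dataset evaluation is done with a single file.
    • Updated voc_annotation.py with several more adjustable parameters.
    • Updated summary.py for viewing the network structure.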
Owner
Bubbliiiing