DeepLabv3+: A TensorFlow 2 implementation of the Encoder-Decoder with Atrous Separable Convolution semantic segmentation model

Last update: Nov 25, 2022

Performance

Training dataset	Weight file	Test dataset	Input image size	mIOU
VOC12+SBD	deeplabv3_mobilenetv2.h5	VOC-Val12	512x512	72.50
VOC12+SBD	deeplabv3_xception.h5	VOC-Val12	512x512	87.10

Requirements

tensorflow==2.2.0

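Notes

The deeplabv3_mobilenetv2.h5 and deeplabv3_xception.h5 weights used in the code were trained on the extended VOC dataset. Remember to set the matching backbone for both training and prediction.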

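Downloads

The deeplabv3_mobilenetv2.h5 and deeplabv3_xception.h5 weights needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1zVRshWRkb5C3kmDMwEf89A Extraction code: ccq5

The extended VOC dataset is also available on Baidu Netdisk:
Link: https://pan.baidu.com/s/1BrR7AUM1XJvPWjKMIy2uEw Extraction code: vszf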

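Training steps

a. Training on the VOC dataset

1. Place the provided VOC dataset in VOCdevkit (there is no need to run voc_annotation.py).
2. Set the parameters in train.py. The defaults already match the VOC dataset, so only backbone and model_path need to be changed.
3. Run train.py to start training.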
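
Because backbone and model_path have to agree, it can help to derive one from the other. The following is a minimal sketch: the two file names and the model_data folder come from this README, while the helper pick_model_path and the lookup table are hypothetical and not part of the repository.

```python
import os

# Weight files named in this README; the lookup table itself is an
# illustrative assumption, not code from the repository.
WEIGHTS_BY_BACKBONE = {
    "mobilenet": "model_data/deeplabv3_mobilenetv2.h5",
    "xception": "model_data/deeplabv3_xception.h5",
}


def pick_model_path(backbone):
    """Return the weight file that matches the chosen backbone."""
    if backbone not in WEIGHTS_BY_BACKBONE:
        raise ValueError(
            f"unknown backbone {backbone!r}; expected one of {sorted(WEIGHTS_BY_BACKBONE)}"
        )
    return WEIGHTS_BY_BACKBONE[backbone]


backbone = "xception"
model_path = pick_model_path(backbone)
# Report whether the weight file has actually been downloaded.
print(backbone, model_path, os.path.exists(model_path))
```

The two values would then be copied into the backbone and model_path settings in train.py.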

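b. Training on your own dataset

1. Training uses the VOC format.
2. Before training, place the label files in VOCdevkit/VOC2007/SegmentationClass.
3. Before training, place the image files in VOCdevkit/VOC2007/JPEGImages.
4. Run voc_annotation.py to generate the corresponding txt files.
5. In train.py, choose the backbone and the downsample factor. The available backbones are mobilenet and xception, and the downsample factor can be 8 or 16. Note that the pretrained weights must match the backbone.
6. Set num_classes in train.py to the number of classes plus 1.
7. Run train.py to start training.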
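
Before running voc_annotation.py, it may be worth checking that every image has a label file. The sketch below is not the repository's voc_annotation.py. It only illustrates the idea of pairing files by name and splitting them, and it assumes .jpg images, .png label masks, and a hypothetical 10% validation share. The helpers build_split and num_classes_for are invented for illustration.

```python
import os
import random


def build_split(voc_root, val_fraction=0.1, seed=0):
    """Pair images with masks by file stem and split the stems into train/val."""
    jpg_dir = os.path.join(voc_root, "VOC2007", "JPEGImages")
    seg_dir = os.path.join(voc_root, "VOC2007", "SegmentationClass")
    images = {os.path.splitext(f)[0] for f in os.listdir(jpg_dir)
              if f.lower().endswith(".jpg")}
    masks = {os.path.splitext(f)[0] for f in os.listdir(seg_dir)
             if f.lower().endswith(".png")}
    # Stems that appear on only one side have no image/mask partner.
    unpaired = sorted(images ^ masks)
    if unpaired:
        raise ValueError(f"files without a matching image/mask pair: {unpaired[:5]}")
    stems = sorted(images)
    random.Random(seed).shuffle(stems)
    n_val = int(len(stems) * val_fraction)
    return stems[n_val:], stems[:n_val]


def num_classes_for(class_names):
    """Step 6: num_classes is the number of classes plus one."""
    return len(class_names) + 1


if os.path.isdir("VOCdevkit"):
    train_stems, val_stems = build_split("VOCdevkit")
    print(len(train_stems), "training samples,", len(val_stems), "validation samples")
```

For a dataset with the two classes cat and dog, num_classes_for(["cat", "dog"]) gives 3, which is the value to enter in train.py.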

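Prediction steps

a. Using the pretrained weights

1. Download and unzip the repository. To predict with the mobilenet backbone, run predict.py directly. To predict with the xception backbone, download deeplabv3_xception.h5 from Baidu Netdisk, place it in model_data, change backbone and model_path in deeplab.py, then run predict.py and enter

img/street.jpg

to complete the prediction.
2. Settings in predict.py enable FPS testing, whole-folder prediction, and video detection.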

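b. Using your own trained weights

1. Train the model by following the training steps.
2. In deeplab.py, change model_path, num_classes, and backbone in the section below so that they match the trained model. model_path points to the weight file in the logs folder, num_classes is the number of classes to predict plus 1, and backbone is the backbone feature-extraction network used.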

_defaults = {
    #----------------------------------------#
    #   model_path points to the weight file in the logs folder
    #----------------------------------------#
    "model_path"        : 'model_data/deeplabv3_mobilenetv2.h5',
    #----------------------------------------#
    #   Number of classes to distinguish, plus 1
    #----------------------------------------#
    "num_classes"       : 21,
    #----------------------------------------#
    #   Backbone network to use: mobilenet or xception
    #----------------------------------------#
    "backbone"          : "mobilenet",
    #----------------------------------------#
    #   Size of the input image
    #----------------------------------------#
    "input_shape"       : [512, 512],
    #----------------------------------------#
    #   Downsample factor, usually 8 or 16;
    #   use the same value as in training
    #----------------------------------------#
    "downsample_factor" : 16,
    #--------------------------------#
    #   blend controls whether the segmentation
    #   result is blended with the original image
    #--------------------------------#
    "blend"             : True,
}
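
A quick sanity check on these settings can catch a mismatch before the model is built. This is a sketch only: check_settings is a hypothetical helper, not part of deeplab.py, and the returned feature-map size is the nominal value obtained by dividing the input size by the downsample factor.

```python
def check_settings(settings):
    """Validate a settings dict shaped like _defaults.

    Returns the nominal encoder feature-map size (height, width).
    """
    if settings["backbone"] not in ("mobilenet", "xception"):
        raise ValueError("backbone must be 'mobilenet' or 'xception'")
    if settings["downsample_factor"] not in (8, 16):
        raise ValueError("downsample_factor must be 8 or 16")
    if settings["num_classes"] < 2:
        raise ValueError("num_classes must be at least one class plus 1")
    height, width = settings["input_shape"]
    factor = settings["downsample_factor"]
    return height // factor, width // factor


example = {
    "model_path": "model_data/deeplabv3_mobilenetv2.h5",
    "num_classes": 21,
    "backbone": "mobilenet",
    "input_shape": [512, 512],
    "downsample_factor": 16,
    "blend": True,
}
print(check_settings(example))  # (32, 32)
```

With downsample_factor set to 8 the same input gives a 64x64 feature map, which costs more memory but keeps finer spatial detail.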

3. Run predict.py and enter

img/street.jpg

to complete the prediction.
4. Settings in predict.py enable FPS testing, whole-folder prediction, and video detection.
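
The blend setting decides whether the colored segmentation result is mixed with the original image. The sketch below shows what such a mix amounts to, using plain linear blending on nested lists of RGB tuples. The functions and the weight alpha=0.7 are illustrative assumptions, and the repository may implement blending differently.

```python
def blend_pixel(original, mask_color, alpha=0.7):
    """Linear blend per channel: (1 - alpha) * original + alpha * mask_color."""
    return tuple(round((1 - alpha) * o + alpha * m)
                 for o, m in zip(original, mask_color))


def blend_image(original, colored_mask, alpha=0.7):
    """Blend two images given as rows of RGB tuples with identical shapes."""
    return [
        [blend_pixel(o, m, alpha) for o, m in zip(row_o, row_m)]
        for row_o, row_m in zip(original, colored_mask)
    ]


# A 1x2 image: the first pixel is labeled with a red class color, the second with black.
photo = [[(200, 200, 200), (40, 40, 40)]]
mask = [[(128, 0, 0), (0, 0, 0)]]
print(blend_image(photo, mask, alpha=0.5))
```

Setting alpha to 0 returns the original image unchanged, and setting it to 1 returns only the colored mask.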

Evaluation steps

1. Set num_classes in get_miou.py to the number of classes to predict plus 1.
2. Set name_classes in get_miou.py to the names of the classes to distinguish.
3. Run get_miou.py to obtain the mIoU.
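
For readers unfamiliar with the metric, mIoU is the mean over classes of intersection over union, which is usually computed from a confusion matrix. The following sketch shows the standard calculation on flattened label lists. It is not the code in get_miou.py, and the function names and the three example classes are assumptions made for illustration.

```python
def confusion_matrix(gt, pred, num_classes):
    """hist[g][p] counts pixels with ground-truth class g predicted as class p."""
    hist = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if 0 <= g < num_classes:  # skip out-of-range labels such as an ignore value
            hist[g][p] += 1
    return hist


def per_class_iou(hist):
    """IoU per class: true positives / (ground truth + predicted - true positives)."""
    n = len(hist)
    ious = []
    for c in range(n):
        tp = hist[c][c]
        gt_total = sum(hist[c])
        pred_total = sum(hist[r][c] for r in range(n))
        union = gt_total + pred_total - tp
        ious.append(tp / union if union else float("nan"))
    return ious


def mean_iou(ious):
    """Average the per-class IoU, ignoring classes that never occur (NaN)."""
    valid = [v for v in ious if v == v]
    return sum(valid) / len(valid)


name_classes = ["background", "cat", "dog"]  # hypothetical class names
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(confusion_matrix(gt, pred, len(name_classes)))
for name, iou in zip(name_classes, ious):
    print(f"{name}: {iou:.3f}")
print(f"mIoU: {mean_iou(ious):.3f}")
```

Note that the list of names has num_classes entries, so it includes the background class.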

Reference

https://github.com/ggyyzm/pytorch_segmentation
https://github.com/bonlime/keras-deeplab-v3-plus

Comments

How to reproduce the model deeplabv3_xception.h5?

Hi, I'm trying to train deeplabv3 with the xception backbone on the VOC + SBD dataset. You provided the VOC-pretrained model deeplabv3_xception.h5, but if I want to reproduce your training result, I should not use it as the pretrained model, right? So I commented out the line in train.py that loads the pretrained weights. After 100 epochs, however, my model's accuracy is poor compared with yours. Did I miss something? Do I need something like an ImageNet-pretrained or COCO-pretrained model? Thanks!

opened by zhimengf 6

How to export a .pb file

Hi, I have a problem exporting a .pb file from the .h5 files that are currently produced. I do not think the project provides an export method. Could you please give me some advice on that? @bubbliiiing

opened by 77knight 1

Releases(v3.0)

v3.0(Apr 22, 2022)
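Important updates

Added step and cosine learning-rate decay schedules.

Added a choice between the adam and sgd optimizers.

The learning rate now adapts to batch_size.

Added a choice of prediction modes: single-image prediction, folder prediction, video prediction, and image cropping.

Updated summary.py, which is used to inspect the network structure.

Added multi-GPU training.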
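
The release notes name the learning-rate features without giving formulas. As a rough illustration, the sketch below shows common textbook forms of step decay, cosine decay, and linear scaling with batch size. The function names, the decay factor, and the reference batch size of 16 are assumptions, and the repository's train.py may use different formulas.

```python
import math


def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Step decay: multiply the rate by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cos_lr(base_lr, min_lr, epoch, total_epochs):
    """Cosine decay from base_lr at the first epoch to min_lr at the last."""
    progress = epoch / max(1, total_epochs - 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


def scale_lr(base_lr, batch_size, reference_batch_size=16):
    """Linear scaling rule: the rate grows in proportion to the batch size."""
    return base_lr * batch_size / reference_batch_size


base = scale_lr(0.01, batch_size=32)
for epoch in (0, 50, 99):
    print(epoch, step_lr(base, epoch, step_size=40), cos_lr(base, base * 0.01, epoch, 100))
```

Step decay changes the rate abruptly at fixed intervals, while cosine decay lowers it smoothly over the whole run.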

v2.0(Mar 4, 2022)
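Important updates

Updated train.py: added extensive comments and several adjustable parameters.

Updated predict.py: added extensive comments and features such as FPS testing, video prediction, and batch prediction.

Updated deeplab.py: added extensive comments and parameters such as anchor-box selection, confidence, and non-maximum suppression.

Merged get_dr_txt.py, get_gt_txt.py, and get_map.py so that dataset evaluation is done with a single file.

Updated voc_annotation.py: added several adjustable parameters.

Updated summary.py, which is used to inspect the network structure.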

v1.0(Sep 9, 2021)

deeplabv3_mobilenetv2.h5(10.98 MB)
deeplabv3_xception.h5(158.40 MB)

Owner

Bubbliiiing
