这是一个yolox-keras的源码,可以用于训练自己的模型。

Overview

YOLOX:You Only Look Once目标检测模型在Keras当中的实现


目录

  1. 性能情况 Performance
  2. 实现的内容 Achievement
  3. 所需环境 Environment
  4. 小技巧的设置 TricksSet
  5. 文件下载 Download
  6. 训练步骤 How2train
  7. 预测步骤 How2predict
  8. 评估步骤 How2eval
  9. 参考资料 Reference

性能情况

训练数据集 权值文件名称 测试数据集 输入图片大小 mAP 0.5:0.95 mAP 0.5
COCO-Train2017 yolox_s.h5 COCO-Val2017 640x640 39.2 58.7
COCO-Train2017 yolox_m.h5 COCO-Val2017 640x640 46.1 65.2
COCO-Train2017 yolox_l.h5 COCO-Val2017 640x640 49.3 68.1
COCO-Train2017 yolox_x.h5 COCO-Val2017 640x640 50.5 69.2

实现的内容

  • 主干特征提取网络:使用了Focus网络结构。
  • 分类回归层:Decoupled Head,在YoloX中,Yolo Head被分为了分类回归两部分,最后预测的时候才整合在一起。
  • 训练用到的小技巧:Mosaic数据增强、CIOU(原版是IOU和GIOU,CIOU效果类似,都是IOU系列的,甚至更新一些)、学习率余弦退火衰减。
  • Anchor Free:不使用先验框
  • SimOTA:为不同大小的目标动态匹配正样本。

所需环境

tensorflow-gpu==1.13.1
keras==2.1.5

小技巧的设置

在train.py文件下:
1、mosaic参数可用于控制是否实现Mosaic数据增强。
2、Cosine_scheduler可用于控制是否使用学习率余弦退火衰减。
3、label_smoothing可用于控制是否Label Smoothing平滑。

文件下载

训练所需的权值可在百度网盘中下载。
链接: https://pan.baidu.com/s/1o14Vi-CzZEaz9hic_OPZCQ 提取码: 4kc2

VOC数据集下载地址如下:
VOC2007+2012训练集
链接: https://pan.baidu.com/s/16pemiBGd-P9q2j7dZKGDFA 提取码: eiw9

VOC2007测试集
链接: https://pan.baidu.com/s/1BnMiFwlNwIWG9gsd4jHLig 提取码: dsda

训练步骤

a、数据集的准备

1、本文使用VOC格式进行训练,训练前需要自己制作好数据集,如果没有自己的数据集,可以通过Github连接下载VOC12+07的数据集尝试下。
2、训练前将标签文件放在VOCdevkit文件夹下的VOC2007文件夹下的Annotation中。
3、训练前将图片文件放在VOCdevkit文件夹下的VOC2007文件夹下的JPEGImages中。

b、数据集的预处理

1、训练数据集时,在model_data文件夹下建立一个cls_classes.txt,里面写所需要区分的类别。
2、设置根目录下的voc_annotation.py里的一些参数。第一次训练可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,即:

classes_path = 'model_data/cls_classes.txt'

model_data/cls_classes.txt文件内容为:

cat
dog
...

3、设置完成后运行voc_annotation.py,生成训练所需的2007_train.txt以及2007_val.txt。

c、开始网络训练

1、通过voc_annotation.py,我们已经生成了2007_train.txt以及2007_val.txt,此时我们可以开始训练了。
2、设置根目录下的train.py里的一些参数。第一次训练可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,设置方式与b、数据集的预处理类似。训练自己的数据集必须要修改!
3、设置完成后运行train.py开始训练了,在训练多个epoch后,权值会生成在logs文件夹中。
4、训练的参数较多,大家可以在下载库后仔细看注释,其中最重要的部分依然是train.py里的classes_path。

d、训练结果预测

1、训练结果预测需要用到两个文件,分别是yolo.py和predict.py。
2、设置根目录下的yolo.py里的一些参数。第一次预测可以仅修改model_path以及classes_path。训练自己的数据集必须要修改。model_path指向训练好的权值文件,在logs文件夹里。classes_path指向检测类别所对应的txt。
3、设置完成后运行predict.py开始预测了,具体细节查看预测步骤。
4、预测的参数较多,大家可以在下载库后仔细看注释,其中最重要的部分依然是yolo.py里的model_path以及classes_path。

预测步骤

a、使用预训练权重

1、下载完库后解压,在百度网盘下载各个权值,放入model_data,默认使用yolox_s.h5,其它可调整,运行predict.py,输入

img/street.jpg

2、在predict.py里面进行设置可以进行video视频检测、fps测试、批量文件测试与保存。

b、使用自己训练的权重

1、按照训练步骤训练。
2、在yolo.py文件里面,在如下部分修改model_path和classes_path使其对应训练好的文件;model_path对应logs文件夹下面的权值文件,classes_path是model_path对应分的类

_defaults = {
    #--------------------------------------------------------------------------#
    #   使用自己训练好的模型进行预测一定要修改model_path和classes_path!
    #   model_path指向logs文件夹下的权值文件,classes_path指向model_data下的txt
    #   如果出现shape不匹配,同时要注意训练时的model_path和classes_path参数的修改
    #--------------------------------------------------------------------------#
    "model_path"        : 'model_data/yolox_s.h5',
    "classes_path"      : 'model_data/coco_classes.txt',
    #---------------------------------------------------------------------#
    #   输入图片的大小,必须为32的倍数。
    #---------------------------------------------------------------------#
    "input_shape"       : [640, 640],
    #---------------------------------------------------------------------#
    #   所使用的YoloX的版本。s、m、l、x
    #---------------------------------------------------------------------#
    "phi"               : 's',
    #---------------------------------------------------------------------#
    #   只有得分大于置信度的预测框会被保留下来
    #---------------------------------------------------------------------#
    "confidence"        : 0.5,
    #---------------------------------------------------------------------#
    #   非极大抑制所用到的nms_iou大小
    #---------------------------------------------------------------------#
    "nms_iou"           : 0.3,
    "max_boxes"         : 100,
    #---------------------------------------------------------------------#
    #   该变量用于控制是否使用letterbox_image对输入图像进行不失真的resize,
    #   在多次测试后,发现关闭letterbox_image直接resize的效果更好
    #---------------------------------------------------------------------#
    "letterbox_image"   : True,
}

3、运行predict.py,输入

img/street.jpg

4、在predict.py里面进行设置可以进行video视频检测、fps测试、批量文件测试与保存。

评估步骤

1、本文使用VOC格式进行评估。
2、划分测试集,如果在训练前已经运行过voc_annotation.py文件,代码会自动将数据集划分成训练集、验证集和测试集。
3、如果想要修改测试集的比例,可以修改voc_annotation.py文件下的trainval_percent。trainval_percent用于指定(训练集+验证集)与测试集的比例,默认情况下 (训练集+验证集):测试集 = 9:1。train_percent用于指定(训练集+验证集)中训练集与验证集的比例,默认情况下 训练集:验证集 = 9:1。
4、设置根目录下的yolo.py里的一些参数。第一次评估可以仅修改model_path以及classes_path。训练自己的数据集必须要修改。model_path指向训练好的权值文件,在logs文件夹里。classes_path指向检测类别所对应的txt。
5、设置根目录下的get_map.py里的一些参数。第一次评估可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,评估自己的数据集必须要修改。与yolo.py中分开设置的原因是可以让使用者自己选择评估什么类别,而非所有类别。
6、运行get_map.py即可获得评估结果,评估结果会保存在map_out文件夹中。

Reference

https://github.com/Megvii-BaseDetection/YOLOX

You might also like...
Comments
  • using yolox_keras as backbone/pretrained model wanted

    using yolox_keras as backbone/pretrained model wanted

    many will try using different backbones like yolo/VGG/ENet for their models and it will be usefull if you have a pretrained model that doesn't include top layers for them. I also tried to find the saved model as it has to be in this directory :'model_data/yolox_s.h5' but it isn't if you still have the model I'll appreciate to upload it again. because I tried to train the model but found some issues with utils and loading data. Thanks

    opened by asparsa 1
  • tf与keras兼容问题报错

    tf与keras兼容问题报错

    !python predict.py 报错如下: model_data/yolox_s.h5 model, and classes loaded. Traceback (most recent call last): File "predict.py", line 14, in yolo = YOLO() File "/content/yolox-keras/yolo.py", line 82, in init self.boxes, self.scores, self.classes = self.generate() File "/content/yolox-keras/yolo.py", line 106, in generate letterbox_image = self.letterbox_image File "/content/yolox-keras/utils/utils_bbox.py", line 73, in DecodeBox grid_x, grid_y = tf.meshgrid(K.arange(hw[i][1]), K.arange(hw[i][0])) File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler raise e.with_traceback(filtered_tb) from None File "/usr/local/lib/python3.7/dist-packages/keras/layers/core/tf_op_layer.py", line 107, in handle return TFOpLambda(op)(*args, **kwargs) File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler raise e.with_traceback(filtered_tb) from None File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 3510, in arange if stop is None and start < 0: tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: Exception encountered when calling layer "tf.keras.backend.arange" (type TFOpLambda).

    using a tf.Tensor as a Python bool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

    Call arguments received: • start=tf.Tensor(shape=(), dtype=int32) • stop=None • step=1 • dtype=int32

    修改:utils.utils_bbox.DecodeBox如下: @tf.function def DecodeBox()

    接着报错: model_data/yolox_s.h5 model, and classes loaded. Traceback (most recent call last): File "predict.py", line 14, in yolo = YOLO() File "/content/yolox-keras/yolo.py", line 82, in init self.boxes, self.scores, self.classes = self.generate() File "/content/yolox-keras/yolo.py", line 106, in generate letterbox_image = self.letterbox_image File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler raise e.with_traceback(filtered_tb) from None File "/usr/local/lib/python3.7/dist-packages/keras/engine/keras_tensor.py", line 256, in array f'You are passing {self}, an intermediate Keras symbolic input/output, ' TypeError: You are passing KerasTensor(type_spec=TensorSpec(shape=(None, None, None, 85), dtype=tf.float32, name=None), name='concatenate_13/concat:0', description="created by layer 'concatenate_13'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as tf.cond, tf.function, gradient tapes, or tf.map_fn. Keras Functional model construction only supports TF API calls that do support dispatching, such as tf.math.add or tf.reshape. Other APIs cannot be called directly on symbolic Kerasinputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer call and calling that layer on this symbolic input/output.

    opened by cyning911 17
Releases(v2.1)
Owner
Bubbliiiing
Bubbliiiing
This package implements the algorithms introduced in Smucler, Sapienza, and Rotnitzky (2020) to compute optimal adjustment sets in causal graphical models.

optimaladj: A library for computing optimal adjustment sets in causal graphical models This package implements the algorithms introduced in Smucler, S

Facundo Sapienza 6 Aug 04, 2022
A really easy-to-use and powerful sudoku solver.

SodukuSolver This is a really useful sudoku solver with a Qt gui. USAGE Enter the numbers in and click "RUN"! If you don't want to wait, simply press

Ujhhgtg Teams 11 Jun 02, 2022
The authors' official PyTorch SigWGAN implementation

The authors' official PyTorch SigWGAN implementation This repository is the official implementation of [Sig-Wasserstein GANs for Time Series Generatio

9 Jun 16, 2022
TJU Deep Learning & Neural Network

Deep_Learning & Neural_Network_Lab 实验环境 Python 3.9 Anaconda3(官网下载或清华镜像都行) PyTorch 1.10.1(安装代码如下) conda install pytorch torchvision torchaudio cudatool

St3ve Lee 1 Jan 19, 2022
A simple program for training and testing vit

Vit This is a simple program for training and testing vit. Key requirements: torch, torchvision and timm. Dataset I put 5 categories of the cub classi

xiezhenyu 2 Oct 11, 2022
Code for the CVPR2022 paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity"

Introduction This is an official release of the paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity" (arxiv link). Abstrac

Leo 21 Nov 23, 2022
Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

TANG, shixiang 6 Nov 25, 2022
A general python framework for visual object tracking and video object segmentation, based on PyTorch

PyTracking A general python framework for visual object tracking and video object segmentation, based on PyTorch. 📣 Two tracking/VOS papers accepted

2.6k Jan 04, 2023
HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands Oral Presentation, 3DV 2021 Korrawe Karunratanakul, Adrian Spurr, Zicong

Korrawe Karunratanakul 43 Oct 07, 2022
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

NNI Doc | 简体中文 NNI (Neural Network Intelligence) is a lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture

Microsoft 12.4k Dec 31, 2022
we propose a novel deep network, named feature aggregation and refinement network (FARNet), for the automatic detection of anatomical landmarks.

Feature Aggregation and Refinement Network for 2D Anatomical Landmark Detection Overview Localization of anatomical landmarks is essential for clinica

aoyueyuan 0 Aug 28, 2022
VOS: Learning What You Don’t Know by Virtual Outlier Synthesis

VOS This is the source code accompanying the paper VOS: Learning What You Don’t

248 Dec 25, 2022
LSTM model trained on a small dataset of 3000 names written in PyTorch

LSTM model trained on a small dataset of 3000 names. Model generates names from model by selecting one out of top 3 letters suggested by model at a time until an EOS (End Of Sentence) character is no

Sahil Lamba 1 Dec 20, 2021
This is a Image aid classification software based on python TK library development

This is a Image aid classification software based on python TK library development.

EasonChan 1 Jan 17, 2022
Deep learning model for EEG artifact removal

DeepSeparator Introduction Electroencephalogram (EEG) recordings are often contaminated with artifacts. Various methods have been developed to elimina

23 Dec 21, 2022
3.8% and 18.3% on CIFAR-10 and CIFAR-100

Wide Residual Networks This code was used for experiments with Wide Residual Networks (BMVC 2016) http://arxiv.org/abs/1605.07146 by Sergey Zagoruyko

Sergey Zagoruyko 1.2k Dec 29, 2022
Construct a neural network frame by Numpy

本项目的CSDN博客链接:https://blog.csdn.net/weixin_41578567/article/details/111482022 1. 概览 本项目主要用于神经网络的学习,通过基于numpy的实现,了解神经网络底层前向传播、反向传播以及各类优化器的原理。 该项目目前已实现的功

24 Jan 22, 2022
Disturbing Target Values for Neural Network regularization: attacking the loss layer to prevent overfitting

Disturbing Target Values for Neural Network regularization: attacking the loss layer to prevent overfitting 1. Classification Task PyTorch implementat

Yongho Kim 0 Apr 24, 2022
End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p

ESPnet 5.9k Jan 04, 2023
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting

Official code of APHYNITY Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting (ICLR 2021, Oral) Yuan Yin*, Vincent Le Guen*

Yuan Yin 24 Oct 24, 2022