Py-faster-rcnn - Faster R-CNN (Python implementation)

Overview

py-faster-rcnn has been deprecated. Please see Detectron, which includes an implementation of Mask R-CNN.

Disclaimer

The official Faster R-CNN code (written in MATLAB) is available here. If your goal is to reproduce the results in our NIPS 2015 paper, please use the official code.

This repository contains a Python reimplementation of the MATLAB code. This Python implementation is built on a fork of Fast R-CNN. There are slight differences between the two implementations. In particular, this Python port

  • is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
  • gives similar, but not exactly the same, mAP as the MATLAB version
  • is not compatible with models trained using the MATLAB code due to the minor implementation differences
  • includes approximate joint training that is 1.5x faster than alternating optimization (for VGG16) -- see these slides for more information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.

Please see the official README.md for more details.

Faster R-CNN was initially described in an arXiv tech report and was subsequently published in NIPS 2015.

License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

@inproceedings{renNIPS15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {Faster {R-CNN}: Towards Real-Time Object Detection
             with Region Proposal Networks},
    Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
    Year = {2015}
}

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Basic installation
  4. Demo
  5. Beyond the demo: training and testing
  6. Usage

Requirements: software

NOTE If you are having issues compiling and you are using a recent version of CUDA/cuDNN, please consult this issue for a workaround

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference. 2. Python packages you might not have: cython, python-opencv, easydict 3. [Optional] MATLAB is required for official PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.

Requirements: hardware

  1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
  2. For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)
  3. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)

Installation (sufficient for the demo)

  1. Clone the Faster R-CNN repository
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
  1. We'll call the directory that you cloned Faster R-CNN into FRCN_ROOT

    Ignore notes 1 and 2 if you followed step 1 above.

    Note 1: If you didn't clone Faster R-CNN with the --recursive flag, then you'll need to manually clone the caffe-fast-rcnn submodule:

    git submodule update --init --recursive

    Note 2: The caffe-fast-rcnn submodule needs to be on the faster-rcnn branch (or equivalent detached state). This will happen automatically if you followed step 1 instructions.

  2. Build the Cython modules

    cd $FRCN_ROOT/lib
    make
  3. Build Caffe and pycaffe

    cd $FRCN_ROOT/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe
  4. Download pre-computed Faster R-CNN detectors

    cd $FRCN_ROOT
    ./data/scripts/fetch_faster_rcnn_models.sh

    This will populate the $FRCN_ROOT/data folder with faster_rcnn_models. See data/README.md for details. These models were trained on VOC 2007 trainval.

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $FRCN_ROOT
./tools/demo.py

The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

Beyond the demo: installation for training and testing models

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $FRCN_ROOT/data
    ln -s $VOCdevkit VOCdevkit2007

    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.

  5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012

  6. [Optional] If you want to use COCO, please see some notes under data/README.md

  7. Follow the next sections to download pre-trained ImageNet models

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the three networks described in the paper: ZF and VGG16.

cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh

VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

Usage

To train and test a Faster R-CNN detector using the alternating optimization algorithm from our NIPS 2015 paper, use experiments/scripts/faster_rcnn_alt_opt.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)

To train and test a Faster R-CNN detector using the approximate joint training method, use experiments/scripts/faster_rcnn_end2end.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these slides for more details.

Artifacts generated by the scripts in tools are written in this directory.

Trained Fast R-CNN networks are saved under:

output/
   
    /
    
     /

    
   

Test outputs are saved under:

output/
   
    /
    
     /
     
      /

     
    
   
Owner
Ross Girshick
Ross Girshick
FairMOT for Multi-Class MOT using YOLOX as Detector

FairMOT-X Project Overview FairMOT-X is a multi-class multi object tracker, which has been tailored for training on the BDD100K MOT Dataset. It makes

Jonathan Tan 33 Dec 28, 2022
This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

Microsoft 1.3k Dec 29, 2022
Auto-updating data to assist in investment to NEPSE

Symbol Ratios Summary Sector LTP Undervalued Bonus % MEGA Strong Commercial Banks 368 5 10 JBBL Strong Development Banks 568 5 10 SIFC Strong Finance

Amit Chaudhary 16 Nov 01, 2022
Mmdet benchmark with python

mmdet_benchmark 本项目是为了研究 mmdet 推断性能瓶颈,并且对其进行优化。 配置与环境 机器配置 CPU:Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz GPU:NVIDIA GeForce RTX 3080 10GB 内存:64G 硬盘:1T

杨培文 (Yang Peiwen) 24 May 21, 2022
Decorators for maximizing memory utilization with PyTorch & CUDA

torch-max-mem This package provides decorators for memory utilization maximization with PyTorch and CUDA by starting with a maximum parameter size and

Max Berrendorf 10 May 02, 2022
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
Everything about being a TA for ITP/AP course!

تی‌ای بودن! تی‌ای یا دستیار استاد از نقش‌های رایج بین دانشجویان مهندسی است، این ریپوزیتوری قرار است نکات مهم درمورد تی‌ای بودن و تی ای شدن را به ما نش

<a href=[email protected]"> 14 Sep 10, 2022
Python implementation of Bayesian optimization over permutation spaces.

Bayesian Optimization over Permutation Spaces This repository contains the source code and the resources related to the paper "Bayesian Optimization o

Aryan Deshwal 9 Dec 23, 2022
Posterior predictive distributions quantify uncertainties ignored by point estimates.

Posterior predictive distributions quantify uncertainties ignored by point estimates.

DeepMind 177 Dec 06, 2022
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability an

Liang HUANG 87 Dec 28, 2022
Pytorch implementation of Nueral Style transfer

Nueral Style Transfer Pytorch implementation of Nueral style transfer algorithm , it is used to apply artistic styles to content images . Content is t

Abhinav 9 Oct 15, 2022
Pytorch implementation of face attention network

Face Attention Network Pytorch implementation of face attention network as described in Face Attention Network: An Effective Face Detector for the Occ

Hooks 312 Dec 09, 2022
Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch

[AAAI 2021]DropLoss for Long-Tail Instance Segmentation [AAAI 2021] DropLoss for Long-Tail Instance Segmentation Ting-I Hsieh*, Esther Robb*, Hwann-Tz

Tim 37 Dec 02, 2022
Probabilistic Gradient Boosting Machines

PGBM Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch/Numba, developed by Air

Olivier Sprangers 112 Dec 28, 2022
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
PyTorch implementation of DeepUME: Learning the Universal Manifold Embedding for Robust Point Cloud Registration (BMVC 2021)

DeepUME: Learning the Universal Manifold Embedding for Robust Point Cloud Registration [video] [paper] [supplementary] [data] [thesis] Introduction De

Natalie Lang 10 Dec 14, 2022
Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks.

pyradiomics v3.0.1 Build Status Linux macOS Windows Radiomics feature extraction in Python This is an open-source python package for the extraction of

Artificial Intelligence in Medicine (AIM) Program 842 Dec 28, 2022
GANTheftAuto is a fork of the Nvidia's GameGAN

Description GANTheftAuto is a fork of the Nvidia's GameGAN, which is research focused on emulating dynamic game environments. The early research done

Harrison 801 Dec 27, 2022
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Environments Effi

Weirui Ye 671 Jan 03, 2023
Pytorch implementation of SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation

SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation Efficient Self-Ensemble Framework for Semantic Segmentation by Walid Bousselham

61 Dec 26, 2022