TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

@misc{you2019torchcv,
    author = {Ansheng You and Xiangtai Li and Zhen Zhu and Yunhai Tong},
    title = {TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision},
    howpublished = {\url{https://github.com/donnyyou/torchcv}},
    year = {2019}
}

This repository provides source code for most deep-learning-based computer vision problems. We will do our best to keep it up to date. If you find a problem with this repository, please raise an issue or submit a pull request.

- Semantic Flow for Fast and Accurate Scene Parsing
- Code and models: https://github.com/lxtGH/SFSegNets

Implemented Papers

  • Image Classification

    • VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
    • ResNet: Deep Residual Learning for Image Recognition
    • DenseNet: Densely Connected Convolutional Networks
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
    • Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
  • Semantic Segmentation

    • DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
    • PSPNet: Pyramid Scene Parsing Network
    • DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
    • Asymmetric Non-local Neural Networks for Semantic Segmentation
    • Semantic Flow for Fast and Accurate Scene Parsing
  • Object Detection

    • SSD: Single Shot MultiBox Detector
    • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
    • YOLOv3: An Incremental Improvement
    • FPN: Feature Pyramid Networks for Object Detection
  • Pose Estimation

    • CPM: Convolutional Pose Machines
    • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
  • Instance Segmentation

    • Mask R-CNN
  • Generative Adversarial Networks

    • Pix2pix: Image-to-Image Translation with Conditional Adversarial Nets
    • CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

QuickStart with TorchCV

TorchCV currently supports only Python 3.x and PyTorch 1.3.

pip3 install -r requirements.txt
cd lib/exts
sh make.sh
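Before building the extensions, it can help to confirm that the interpreter and PyTorch versions match the ones stated above. The following is a minimal sketch, not part of TorchCV: the helper names `parse_major_minor` and `check_versions` are hypothetical, and the check assumes that "Python 3.x, PyTorch 1.3" means a Python major version of 3 and any PyTorch 1.3.* release.

```python
def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.3.1' or '1.3.0+cu100'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def check_versions(python_version, torch_version):
    """Return a list of problems; an empty list means the versions look compatible.

    Assumption (not from TorchCV): Python major must be 3, PyTorch must be 1.3.*.
    """
    problems = []
    if parse_major_minor(python_version)[0] != 3:
        problems.append("Python 3.x is required, found " + python_version)
    if parse_major_minor(torch_version) != (1, 3):
        problems.append("PyTorch 1.3 is expected, found " + torch_version)
    return problems
```

In a real environment the two arguments would typically come from `platform.python_version()` and `torch.__version__`.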

Performances with TorchCV

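All results below were obtained with TorchCV and fully reproduce those reported in the original papers. In the Iters columns, W denotes 10,000 iterations (for example, 30W = 300,000 iterations); BS is the batch size.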

Image Classification

  • ImageNet (Center Crop Test): 224x224
| Model | Train | Test | Top-1 | Top-5 | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|
| ResNet50 | train | val | 77.54 | 93.59 | 512 | 30W | ResNet50 |
| ResNet101 | train | val | 78.94 | 94.56 | 512 | 30W | ResNet101 |
| ShuffleNetV2x0.5 | train | val | 60.90 | 82.54 | 1024 | 40W | ShuffleNetV2x0.5 |
| ShuffleNetV2x1.0 | train | val | 69.71 | 88.91 | 1024 | 40W | ShuffleNetV2x1.0 |
| DFNetV1 | train | val | 70.99 | 89.68 | 1024 | 40W | DFNetV1 |
| DFNetV2 | train | val | 74.22 | 91.61 | 1024 | 40W | DFNetV2 |

Semantic Segmentation

  • Cityscapes (Single Scale Whole Image Test): Base LR 0.01, Crop Size 769
| Model | Backbone | Train | Test | mIOU | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res101 | train | val | 78.20 | 8 | 4W | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | 79.13 | 8 | 4W | DeepLabV3 |
  • ADE20K (Single Scale Whole Image Test): Base LR 0.02, Crop Size 520
| Model | Backbone | Train | Test | mIOU | PixelACC | BS | Iters | Scripts |
|---|---|---|---|---|---|---|---|---|
| PSPNet | 3x3-Res50 | train | val | 41.52 | 80.09 | 16 | 15W | PSPNet |
| DeepLabV3 | 3x3-Res50 | train | val | 42.16 | 80.36 | 16 | 15W | DeepLabV3 |
| PSPNet | 3x3-Res101 | train | val | 43.60 | 81.30 | 16 | 15W | PSPNet |
| DeepLabV3 | 3x3-Res101 | train | val | 44.13 | 81.42 | 16 | 15W | DeepLabV3 |
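
For readers unfamiliar with the mIOU and PixelACC columns, the sketch below shows one common way to compute both from a confusion matrix. It is an illustration only and not TorchCV's evaluation code: the function names are hypothetical, predictions and labels are assumed to be flattened lists of class indices, and the `ignore_index=255` default is an assumption borrowed from a widespread labelling convention.

```python
def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Count pixels per (ground truth, prediction) pair; rows are ground truth."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, label):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of labeled pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of per-class intersection over union, skipping classes that never occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With `label = [0, 0, 1, 1, 255]` and `pred = [0, 1, 1, 1, 0]`, the last pixel is ignored, three of the four remaining pixels are correct, and the per-class IoUs are 1/2 and 2/3.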

Object Detection

  • Pascal VOC2007/2012 (Single Scale Test): 20 Classes
| Model | Backbone | Train | Test | mAP | BS | Epochs | Scripts |
|---|---|---|---|---|---|---|---|
| SSD300 | VGG16 | 07+12_trainval | 07_test | 0.786 | 32 | 235 | SSD300 |
| SSD512 | VGG16 | 07+12_trainval | 07_test | 0.808 | 32 | 235 | SSD512 |
| Faster R-CNN | VGG16 | 07_trainval | 07_test | 0.706 | 1 | 15 | Faster R-CNN |

Pose Estimation

  • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation

  • Mask R-CNN

Generative Adversarial Networks

  • Pix2pix
  • CycleGAN

DataSets with TorchCV

TorchCV defines a dataset format for every task; see the subdirectories of data for details. Below is an example directory tree for training semantic segmentation. You can preprocess public datasets with the scripts in the folder data/seg/preprocess.

Dataset
    train
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
    val
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
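
A quick sanity check of a dataset laid out as above can catch missing files before training starts. The sketch below is not part of TorchCV; the helper name `find_unpaired` is hypothetical, and it assumes that every image and its label share the same file stem, as the tree suggests (for example, `00001.jpg` pairs with `00001.png`).

```python
import os

IMAGE_EXTS = (".jpg", ".png")  # image extensions shown in the tree above


def find_unpaired(root, split):
    """Return (images_without_label, labels_without_image) as sorted lists of file stems."""
    image_dir = os.path.join(root, split, "image")
    label_dir = os.path.join(root, split, "label")
    images = {os.path.splitext(name)[0] for name in os.listdir(image_dir)
              if name.lower().endswith(IMAGE_EXTS)}
    labels = {os.path.splitext(name)[0] for name in os.listdir(label_dir)
              if name.lower().endswith(".png")}
    return sorted(images - labels), sorted(labels - images)
```

Calling `find_unpaired("Dataset", "train")` and `find_unpaired("Dataset", "val")` should both return two empty lists for a complete dataset.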

Commands with TorchCV

Take PSPNet as an example. ("tag" can be any string, including an empty one.)

  • Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Resume Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Validate
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag
  • Testing
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
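
When several phases are run in sequence, the commands above can be wrapped in a small driver. This is only a sketch under stated assumptions: `build_command` and `run_phase` are hypothetical helpers, not part of TorchCV, and the accepted phases are limited to the three that appear in the commands above.

```python
import os
import subprocess

SCRIPT_DIR = os.path.join("scripts", "seg", "cityscapes")
SCRIPT = "run_fs_pspnet_cityscapes_seg.sh"
PHASES = ("train", "val", "test")  # only the phases shown in the commands above


def build_command(phase, tag=""):
    """Return the argument list for one phase; the tag may be an empty string."""
    if phase not in PHASES:
        raise ValueError("unknown phase: " + phase)
    return ["bash", SCRIPT, phase, tag]


def run_phase(phase, tag="", repo_root="."):
    """Run the script from its own directory, mirroring the `cd` in the commands above."""
    return subprocess.run(build_command(phase, tag),
                          cwd=os.path.join(repo_root, SCRIPT_DIR),
                          check=True)
```

For example, `run_phase("train", "exp1")` followed by `run_phase("val", "exp1")` corresponds to the Training and Validate commands with the tag `exp1`.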

Demos with TorchCV

Example output of VGG19-OpenPose

