Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Overview

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

Tensorflow code and models for the paper:

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
CVPR 2018

This repository contains code and pre-trained models used in the paper and 2 demos to demonstrate: 1) the importance of pre-training data on transfer learning; 2) how to calculate domain similarity between source domain and target domain.

Notice that we used a mini validation set (./inat_minival.txt) contains 9,697 images that are randomly selected from the original iNaturalist 2017 validation set. The rest of valdiation images were combined with the original training set to train our model in the paper. There are 665,473 training images in total.

Dependencies:

Preparation:

  • Clone the repo with recursive:
git clone --recursive https://github.com/richardaecn/cvpr18-inaturalist-transfer.git
  • Install dependencies. Please refer to TensorFlow, pyemd, scikit-learn and scikit-image official websites for installation guide.
  • Download data and feature and unzip them into the same directory as the cloned repo. You should have two folders './data' and './feature' in the repo's directory.

Datasets (optional):

In the paper, we used data from 9 publicly available datasets:

We provide a download link that includes the entire CUB-200-2011 dataset and data splits for the rest of 8 datasets. The provided link contains sufficient data for this repo. If you would like to use other 8 datasets, please download them from the official websites and put them in the corresponding subfolders under './data'.

Pre-trained Models (optional):

The models were trained using TensorFlow-Slim. We implemented Squeeze-and-Excitation Networks (SENet) under './slim'. The pre-trained models can be downloaded from the following links:

Network Pre-trained Data Input Size Download Link
Inception-V3 ImageNet 299 link
Inception-V3 iNat2017 299 link
Inception-V3 iNat2017 448 link
Inception-V3 iNat2017 299 -> 560 FT1 link
Inception-V3 ImageNet + iNat2017 299 link
Inception-V3 SE ImageNet + iNat2017 299 link
Inception-V4 iNat2017 448 link
Inception-V4 iNat2017 448 -> 560 FT2 link
Inception-ResNet-V2 ImageNet + iNat2017 299 link
Inception-ResNet-V2 SE ImageNet + iNat2017 299 link
ResNet-V2 50 ImageNet + iNat2017 299 link
ResNet-V2 101 ImageNet + iNat2017 299 link
ResNet-V2 152 ImageNet + iNat2017 299 link

1 This model was trained with 299 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

2 This model was trained with 448 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

TensorFlow Hub also provides a pre-trained Inception-V3 299 on iNat2017 original training set here.

Featrue Extraction (optional):

Run the following Python script to extract feature:

python feature_extraction.py

To run this script, you need to download the checkpoint of Inception-V3 299 trained on iNat2017. The dataset and pre-trained model can be modified in the script.

We provide a download link that includes features used in the domos of this repo.

Demos

  1. Linear logistic regression on extracted features:

This demo shows the importance of pre-training data on transfer learning. Based on features extracted from an Inception-V3 pre-trained on iNat2017, we are able to achieve 89.9% classification accuracy on CUB-200-2011 with the simple logistic regression, outperforming most state-of-the-art methods.

LinearClassifierDemo.ipynb
  1. Calculating domain similarity by Earth Mover's Distance (EMD): This demo gives an example to calculate the domain similarity proposed in the paper. Results correspond to part of the Fig. 5 in the original paper.
DomainSimilarityDemo.ipynb

Training and Evaluation

  • Convert dataset into '.tfrecord':
python convert_dataset.py --dataset_name=cub_200 --num_shards=10
  • Train (fine-tune) the model on 1 GPU:
CUDA_VISIBLE_DEVICES=0 ./train.sh
  • Evaluate the model on another GPU simultaneously:
CUDA_VISIBLE_DEVICES=1 ./eval.sh
  • Run Tensorboard for visualization:
tensorboard --logdir=./checkpoints/cub_200/ --port=6006

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{Cui2018iNatTransfer,
  title = {Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning},
  author = {Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie},
  booktitle={CVPR},
  year={2018}
}
Owner
Yin Cui
Research Scientist at Google
Yin Cui
PyTorch Implementation of CycleGAN and SSGAN for Domain Transfer (Minimal)

MNIST-to-SVHN and SVHN-to-MNIST PyTorch Implementation of CycleGAN and Semi-Supervised GAN for Domain Transfer. Prerequites Python 3.5 PyTorch 0.1.12

Yunjey Choi 401 Dec 30, 2022
Ağ tarayıcı.Gönderdiği paketler ile ağa bağlı olan cihazların IP adreslerini gösterir.

NetScanner.py Ağ tarayıcı.Gönderdiği paketler ile ağa bağlı olan cihazların IP adreslerini gösterir. Linux'da Kullanımı: git clone https://github.com/

4 Aug 23, 2021
Deep Anomaly Detection with Outlier Exposure (ICLR 2019)

Outlier Exposure This repository contains the essential code for the paper Deep Anomaly Detection with Outlier Exposure (ICLR 2019). Requires Python 3

Dan Hendrycks 464 Dec 27, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 08, 2023
Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight)

[NeurIPS 2021 Spotlight] HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning [Paper] This is Official PyTorch implementatio

42 Nov 01, 2022
AirLoop: Lifelong Loop Closure Detection

AirLoop This repo contains the source code for paper: Dasong Gao, Chen Wang, Sebastian Scherer. "AirLoop: Lifelong Loop Closure Detection." arXiv prep

Chen Wang 53 Jan 03, 2023
An automated facial recognition based attendance system (desktop application)

Facial_Recognition_based_Attendance_System An automated facial recognition based attendance system (desktop application) Made using Python, Tkinter an

1 Jun 21, 2022
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
Towards End-to-end Video-based Eye Tracking

Towards End-to-end Video-based Eye Tracking The code accompanying our ECCV 2020 publication and dataset, EVE. Authors: Seonwook Park, Emre Aksan, Xuco

Seonwook Park 76 Dec 12, 2022
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral

Generative Image Inpainting An open source framework for generative image inpainting task, with the support of Contextual Attention (CVPR 2018) and Ga

2.9k Dec 16, 2022
A heterogeneous entity-augmented academic language model based on Open Academic Graph (OAG)

Library | Paper | Slack We released two versions of OAG-BERT in CogDL package. OAG-BERT is a heterogeneous entity-augmented academic language model wh

THUDM 58 Dec 17, 2022
EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

EgonNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale Paper: EgoNN: Egocentric Neural Network for Point Cloud

19 Sep 20, 2022
[NeurIPS 2020] Official repository for the project "Listening to Sound of Silence for Speech Denoising"

Listening to Sounds of Silence for Speech Denoising Introduction This is the repository of the "Listening to Sounds of Silence for Speech Denoising" p

Henry Xu 40 Dec 20, 2022
HashNeRF-pytorch - Pure PyTorch Implementation of NVIDIA paper on Instant Training of Neural Graphics primitives

HashNeRF-pytorch Instant-NGP recently introduced a Multi-resolution Hash Encodin

Yash Sanjay Bhalgat 616 Jan 06, 2023
A small tool to joint picture including gif

README 做设计的时候遇到拼接长图的情况,但是发现没有什么好用的能拼接gif的工具。 于是自己写了个gif拼接小工具。 可以自动拼接gif、png和jpg等常见格式。 效果 从上至下 从下至上 从左至右 从右至左 使用 克隆仓库 git clone https://github.com/Dels

3 Dec 15, 2021
An end-to-end PyTorch framework for image and video classification

What's New: March 2021: Added RegNetZ models November 2020: Vision Transformers now available, with training recipes! 2020-11-20: Classy Vision v0.5 R

Facebook Research 1.5k Dec 31, 2022
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
Matthew Colbrook 1 Apr 08, 2022
atmaCup #11 の Public 4th / Pricvate 5th Solution のリポジトリです。

#11 atmaCup 2021-07-09 ~ 2020-07-21 に行われた #11 [初心者歓迎! / 画像編] atmaCup のリポジトリです。結果は Public 4th / Private 5th でした。 フレームワークは PyTorch で、実装は pytorch-image-m

Tawara 12 Apr 07, 2022
A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

faceswap-GAN Adding Adversarial loss and perceptual loss (VGGface) to deepfakes'(reddit user) auto-encoder architecture. Updates Date Update 2018-08-2

3.2k Dec 30, 2022