PyTorch implementation of the paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

SegSwap

[PDF] [Project page]

If our project is helpful for your research, please consider citing:

@article{shen2021learning,
  title={Learning Co-segmentation by Segment Swapping for Retrieval and Discovery},
  author={Shen, Xi and Efros, Alexei A and Joulin, Armand and Aubry, Mathieu},
  journal={arXiv},
  year={2021}
}

1. Installation

1.1. Dependencies

Our model can be trained on a single Tesla V100 GPU with 16GB of memory. The code has been tested with PyTorch 1.7.1 + CUDA 10.2.

The other dependencies (tqdm, kornia, opencv-python, scipy) can be installed via:

bash requirement.sh
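
After the installation, a quick check that the listed packages are importable can save a failed run later. The snippet below is a minimal sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and it assumes the usual import names for these packages (`cv2` for opencv-python, `torch` for PyTorch). It only tests that each module can be located, not that its version matches the tested setup.

```python
import importlib.util

# Import names for the dependencies listed above.
# Assumption: the pip package opencv-python is imported as cv2.
REQUIRED = ["torch", "tqdm", "kornia", "cv2", "scipy"]


def missing_modules(names):
    """Return the subset of `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```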

1.2. Pre-trained MocoV2-resnet50 + cross-transformer (~300M)

Quick download:

cd model/pretrained
bash download_model.sh

2. Training Data Generation

2.1. Download COCO (~20G)

This command will download the COCO 2017 training set and its annotations (~20G).

cd data/COCO2017
bash download_coco.sh

2.2. Image Pairs with One Repeated Object

2.2.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_1obj.py --out-dir pairs_1obj_100k 
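
To make the notion of a pair "with one repeated object" concrete, here is a toy sketch of pasting a masked object from a source image onto a different background. It is an illustration only and not the code in generate_1obj.py: the function `paste_object` and the 3x3 arrays are hypothetical, images are grayscale nested lists for brevity, and the hard copy used here stands in for whatever blending and stylisation the real pipeline applies.

```python
def paste_object(background, source, mask):
    """Copy pixels of `source` onto `background` wherever `mask` is 1.

    All three arguments are equally sized 2-D lists (grayscale for brevity).
    """
    if not (len(background) == len(source) == len(mask)):
        raise ValueError("inputs must have the same height")
    composite = []
    for bg_row, src_row, mask_row in zip(background, source, mask):
        if not (len(bg_row) == len(src_row) == len(mask_row)):
            raise ValueError("inputs must have the same width")
        composite.append(
            [src if m else bg for bg, src, m in zip(bg_row, src_row, mask_row)]
        )
    return composite


# Toy example: the object is the centre pixel of the source image.
source = [[9, 9, 9], [9, 5, 9], [9, 9, 9]]
background = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

# The pair shares exactly one object: the masked region of the source.
pair = (source, paste_object(background, source, mask))
```

In this toy pair, the mask also serves as the ground-truth region that the two images have in common.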

2.2.2 Examples of image pairs

Columns: Source | Blended Obj + Background | Stylised Source | Stylised Background

2.2.3 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

The generated pairs can be viewed by opening vis10_1obj/vis.html.

2.3. Image Pairs with Two Repeated Objects

2.3.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with two repeated objects.

cd data/
python generate_2obj.py --out-dir pairs_2obj_100k 
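
Since the training data is large, it can be worth checking free disk space before downloading and generating it. The sketch below adds up the approximate sizes quoted in this README (COCO ~20G, each set of pairs ~18G) and compares the total with the free space at a given path. The helper names are hypothetical, and treating "G" as 10^9 bytes is an assumption.

```python
import shutil

# Approximate sizes quoted in this README, in gigabytes.
TRAINING_DATA_GB = {
    "COCO 2017 training set + annotations": 20,
    "pairs_1obj_100k": 18,
    "pairs_2obj_100k": 18,
}


def required_gb(sizes):
    """Total disk space, in GB, needed for the listed items."""
    return sum(sizes.values())


def has_enough_space(path, needed_gb):
    """True if the filesystem holding `path` has at least `needed_gb` GB free."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= needed_gb


if __name__ == "__main__":
    needed = required_gb(TRAINING_DATA_GB)
    print(f"Training data needs roughly {needed} GB")
    print("Enough free space here:", has_enough_space(".", needed))
```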

2.3.2 Examples of image pairs

Columns: Source | Blended Obj + Background | Stylised Source | Stylised Background

2.3.3 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

The generated pairs can be viewed by opening vis10_2obj/vis.html.

3. Evaluation

3.1 One-shot Art Detail Detection on Brueghel Dataset

3.1.1 Visual results: top-3 retrieved images

3.1.2 Data

The Brueghel dataset is included in this repo.

3.1.3 Quantitative results

The following command runs the evaluation on Brueghel with the pre-trained cross-transformer:

cd evalBrueghel
python evalBrueghel.py --out-coarse out_brueghel.json --resume-pth ../model/hard_mining_neg5.pth --label-pth ../data/Brueghel/brueghelTest.json

Note that this command will save the Brueghel features to disk (~10G).

3.2 Place Recognition on Tokyo247 Dataset

3.2.1 Visual results: top-3 retrieved images

3.2.2 Data

Download Tokyo247 from its project page.

Download the top-100 results used by patchVlad (~1G).

The data needs to be organised as follows:

./SegSwap/data/Tokyo247
                    ├── query/
                        ├── 247query_subset_v2/
                    ├── database/
...

./SegSwap/evalTokyo
                    ├── top100_patchVlad.npy
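
Before launching the evaluation, the layout above can be verified with a few lines of Python. This is a sketch under stated assumptions: the helper `missing_paths` is hypothetical, it is meant to be run from the SegSwap root, and it checks only the entries explicitly listed above (whatever the "..." stands for is not checked).

```python
from pathlib import Path

# Paths taken from the layout above, relative to the SegSwap root.
EXPECTED_TOKYO = [
    "data/Tokyo247/query/247query_subset_v2",
    "data/Tokyo247/database",
    "evalTokyo/top100_patchVlad.npy",
]


def missing_paths(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths(".", EXPECTED_TOKYO):
        print("Missing:", rel)
```

The same helper can be reused for the other datasets by passing a different list of expected paths.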

3.2.3 Quantitative results

The following command runs the evaluation on Tokyo247 with the pre-trained cross-transformer:

cd evalTokyo
python evalTokyo.py --qry-dir ../data/Tokyo247/query/247query_subset_v2 --db-dir ../data/Tokyo247/database --resume-pth ../model/hard_mining_neg5.pth

3.3 Place Recognition on Pitts30K Dataset

3.3.1 Visual results: top-3 retrieved images

3.3.2 Data

Download the Pittsburgh dataset from its project page.

Download the top-100 results used by patchVlad (~4G).

The data needs to be organised as follows:

./SegSwap/data/Pitts
                ├── queries_real/
...

./SegSwap/evalPitts
                    ├── top100_patchVlad.npy

3.3.3 Quantitative results

The following command runs the evaluation on Pittsburgh30K with the pre-trained cross-transformer:

cd evalPitts
python evalPitts.py --qry-dir ../data/Pitts/queries_real --db-dir ../data/Pitts --resume-pth ../model/hard_mining_neg5.pth

3.4 Discovery on Internet Dataset

3.4.1 Visual results

3.4.2 Data

Download the Internet dataset from its project page.

We provide a script to quickly download and preprocess the data (~400M):

cd data/Internet
bash download_int.sh

The data needs to be organised as follows:

./SegSwap/data/Internet
                ├── Airplane100
                    ├── GroundTruth                
                ├── Horse100
                    ├── GroundTruth                
                ├── Car100
                    ├── GroundTruth                                

3.4.3 Quantitative results

The following commands run the evaluation on Internet with the pre-trained cross-transformer:

cd evalInt
bash run_pair_480p.sh
bash run_best_only_cycle.sh

4. Training

Stage 1: standard training

We assume that the generated pairs are saved in ./SegSwap/data/pairs_1obj_100k and ./SegSwap/data/pairs_2obj_100k.

The training command can be found in ./SegSwap/train/run.sh.

This command should run on a single GPU with 16G of memory.

cd train
bash run.sh

Stage 2: hard mining

In train/run_hardmining.sh, set --resume-pth to the model trained in the first stage, then run:

cd train
bash run_hardmining.sh
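
The --resume-pth edit can also be done programmatically instead of by hand. The sketch below rewrites the value that follows the flag in a piece of script text. It assumes the flag appears as `--resume-pth <value>` separated by whitespace, which may not match the actual contents of run_hardmining.sh; the function name, the example command line and the checkpoint path are all hypothetical.

```python
import re


def set_resume_path(script_text, checkpoint):
    """Replace the value following --resume-pth with `checkpoint`.

    Assumes the flag is written as `--resume-pth <value>`; raises
    ValueError if the flag does not occur in the text.
    """
    pattern = re.compile(r"(--resume-pth\s+)\S+")
    # A function replacement avoids backslash escapes in `checkpoint`
    # being interpreted by the regex engine.
    new_text, count = pattern.subn(lambda m: m.group(1) + checkpoint, script_text)
    if count == 0:
        raise ValueError("--resume-pth not found")
    return new_text


# Hypothetical command line and checkpoint path, for illustration only.
example = "python some_script.py --resume-pth OLD.pth --other-flag 1"
updated = set_resume_path(example, "PATH/TO/stage1_model.pth")
```

To apply it to the real script, read train/run_hardmining.sh as text, pass it through `set_resume_path`, and write the result back after reviewing the change.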

5. Acknowledgement

We appreciate help from:

Part of the code is borrowed from our previous projects: ArtMiner and Watermark.

6. ChangeLog

  • 21/10/21, model, evaluation + training released

7. License

This code is distributed under the MIT License.

Note that our code depends on other libraries, including Kornia and PyTorch, and uses datasets, each of which has its own license that must also be followed.
