Instance-level Image Retrieval using Reranking Transformers

Fuwen Tan, Jiangbo Yuan, Vicente Ordonez, ICCV 2021.

Abstract

Instance-level image retrieval is the task of searching in a large database for images that match an object in a query image. To address this task, systems usually rely on a retrieval step that uses global image descriptors, and a subsequent step that performs domain-specific refinements or reranking by leveraging operations such as geometric verification based on local features. In this work, we propose Reranking Transformers (RRTs) as a general model to incorporate both local and global features to rerank the matching images in a supervised fashion and thus replace the relatively expensive process of geometric verification. RRTs are lightweight and can be easily parallelized so that reranking a set of top matching results can be performed in a single forward pass. We perform extensive experiments on the Revisited Oxford and Paris datasets, and the Google Landmarks v2 dataset, showing that RRTs outperform previous reranking approaches while using far fewer local descriptors. Moreover, we demonstrate that, unlike existing approaches, RRTs can be optimized jointly with the feature extractor, which can lead to feature representations tailored to downstream tasks and further accuracy improvements.
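The two-stage pipeline described in the abstract (global-descriptor retrieval followed by reranking of the top matches) can be sketched roughly as below. This is an illustrative sketch only, not the code in this repository: the function names, the dictionary layout with "global" and "local" keys, and the `top_k` parameter are all assumptions, and `placeholder_rerank_score` is a hand-written stand-in for the pairwise score that a trained RRT would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def placeholder_rerank_score(query, candidate):
    """Hypothetical stand-in for a learned reranker.

    Averages, over the query's local descriptors, the best cosine match
    among the candidate's local descriptors. A trained model would
    replace this function.
    """
    if not query["local"] or not candidate["local"]:
        return 0.0
    best = [max(cosine(q, c) for c in candidate["local"]) for q in query["local"]]
    return sum(best) / len(best)


def retrieve_then_rerank(query, database, rerank_score, top_k=100):
    """Return database indices ordered by a two-stage ranking.

    Stage 1 ranks every image by global-descriptor similarity.
    Stage 2 reorders only the first top_k results with rerank_score;
    the remaining results keep their stage 1 order.
    """
    order = sorted(
        range(len(database)),
        key=lambda i: cosine(query["global"], database[i]["global"]),
        reverse=True,
    )
    head, tail = order[:top_k], order[top_k:]
    head = sorted(head, key=lambda i: rerank_score(query, database[i]), reverse=True)
    return head + tail
```

Calling `retrieve_then_rerank(query, database, placeholder_rerank_score, top_k=100)` returns a list of database indices. Because only the first `top_k` candidates are rescored, the cost of the second stage stays bounded regardless of database size.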

Software required

The code has only been tested on 64-bit Linux:

  conda create -n rrt python=3.6
  conda activate rrt
  pip install -r requirements.txt

Organization

To use the code for experiments on Google Landmarks v2 and Revisited Oxford/Paris, please refer to the folder RRT_GLD.

To use the code for experiments on Stanford Online Products, please refer to the folder RRT_SOP.

To use the code for evaluating SuperGlue on Revisited Oxford/Paris and Stanford Online Products, please refer to the repo SuperGlue.

Citing

If you find our paper/code useful, please consider citing:

@inproceedings{fwtan-instance-2021,
    author = {Fuwen Tan and Jiangbo Yuan and Vicente Ordonez},
    title = {Instance-level Image Retrieval using Reranking Transformers},
    year = {2021},
    booktitle = {International Conference on Computer Vision (ICCV)}
}


