Awesome Long-Tailed Learning

This repo pays special attention to long-tailed distributions, where labels follow a long-tailed or power-law distribution in the training and/or test dataset. Related papers are summarized, covering applications in computer vision (in particular image classification) and extreme multi-label learning (XML, in particular text categorization).
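To make the notion of a power-law label distribution concrete, here is a minimal sketch that builds per-class sample counts decaying as a power of the class rank and reports the resulting imbalance ratio. The function names `power_law_counts` and `imbalance_ratio`, the exponent `alpha`, and the numbers used are illustrative assumptions, not taken from any paper listed in this repo.

```python
def power_law_counts(num_classes, head_count, alpha=1.0):
    """Return per-class sample counts following count(rank) = head_count / rank**alpha.

    Hypothetical helper for illustration: rank 1 is the most frequent (head)
    class, and larger alpha gives a heavier imbalance.
    """
    return [
        max(1, round(head_count / (rank ** alpha)))  # keep at least one sample per class
        for rank in range(1, num_classes + 1)
    ]


def imbalance_ratio(counts):
    """Ratio between the largest and the smallest class size."""
    return max(counts) / min(counts)


if __name__ == "__main__":
    counts = power_law_counts(num_classes=10, head_count=1000, alpha=1.0)
    print(counts)
    print(imbalance_ratio(counts))
```

Benchmarks in the long-tailed literature are often described by this kind of head-to-tail ratio, although the exact decay profile differs from dataset to dataset.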

🔆 Updated 2021-09-27

Long-tailed Learning in Computer Vision

Type of Long-Tailed Learning Methods

Type  Meaning
TST   Two-Stage Training
IS    Instance Sampling
CBS   Class-Balanced Sampling
CLW   Class-Level Weighting
NC    Normalized Classifier
ENS   Ensemble
DA    Data Augmentation
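As a rough illustration of how three of these method types differ (IS, CBS, and CLW), the sketch below computes per-class sampling probabilities and per-class loss weights from a list of labels. It uses one common formulation, in which class j is sampled with probability proportional to n_j^q: q = 1 gives instance sampling and q = 0 gives class-balanced sampling. The helper names `sampling_probs` and `inverse_frequency_weights` are hypothetical, and inverse-frequency weighting is only one of several class-level weighting schemes.

```python
from collections import Counter


def sampling_probs(labels, q):
    """Probability of drawing each class when p_j is proportional to n_j ** q.

    q = 1.0 reproduces instance sampling (IS): classes are drawn in
    proportion to their size. q = 0.0 gives class-balanced sampling (CBS):
    every class is equally likely.
    """
    counts = Counter(labels)
    total = sum(n ** q for n in counts.values())
    return {c: (n ** q) / total for c, n in counts.items()}


def inverse_frequency_weights(labels):
    """Class-level weights (CLW) proportional to 1 / n_j.

    Scaled so that the average weight over all samples equals 1.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    num_samples = len(labels)
    return {c: num_samples / (num_classes * n) for c, n in counts.items()}


if __name__ == "__main__":
    # Toy long-tailed label set: one head class, one medium class, one tail class.
    labels = ["head"] * 90 + ["medium"] * 9 + ["tail"] * 1
    print(sampling_probs(labels, q=1.0))   # instance sampling
    print(sampling_probs(labels, q=0.0))   # class-balanced sampling
    print(inverse_frequency_weights(labels))
```

Intermediate values such as q = 0.5 interpolate between the two sampling regimes, which is one reason the exponent form is convenient.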

Long-Tailed Learning Workshops

Year Venue Title Remark
2021 CVPR Open World Vision long-tail, open-set, streaming labels
2021 CVPR Learning from Limited and Imperfect Data (L2ID) label noise, SSL, long-tail

Long-Tailed Learning Papers

Year Venue Title Remark
2021 Arxiv LEARNING FROM LONG-TAILED DATA WITH NOISY LABELS
2021 ICCV Self Supervision to Distillation for Long-Tailed Visual Recognition
2021 ICCV Distilling Virtual Examples for Long-tailed Recognition
2021 CVPR Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification
2021 CVPR MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition
2021 CVPR Disentangling Label Distribution for Long-tailed Visual Recognition
2021 CVPR Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-Balanced Samplings
2021 CVPR Seesaw Loss for Long-Tailed Instance Segmentation
2021 ICLR IS LABEL SMOOTHING TRULY INCOMPATIBLE WITH KNOWLEDGE DISTILLATION: AN EMPIRICAL STUDY
2021 Arxiv Improving Long-Tailed Classification from Instance Level
2021 Arxiv DISTRIBUTION-AWARE SEMANTICS-ORIENTED PSEUDO-LABEL FOR IMBALANCED SEMI-SUPERVISED LEARNING SSL, Code
2021 Arxiv ResLT: Residual Learning for Long-tailed Recognition
2021 Arxiv Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces by Google
2021 Arxiv Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition
2021 Arxiv Procrustean Training for Imbalanced Deep Learning
2021 Arxiv Balanced Knowledge Distillation for Long-tailed Learning CBS+IS, Code
2021 Arxiv Class-Balanced Distillation for Long-Tailed Visual Recognition ENS+DA+IS, by Google Research
2021 Arxiv Distributional Robustness Loss for Long-tail Learning TST+CBS
2021 CVPR Improving Calibration for Long-Tailed Recognition DA+TST, Code
2021 CVPR Distribution Alignment: A Unified Framework for Long-tail Visual Recognition TST
2021 CVPR Adversarial Robustness under Long-Tailed Distribution
2021 CVPR CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning by Google, Code, Tensorflow
2021 ICLR HETEROSKEDASTIC AND IMBALANCED DEEP LEARNING WITH ADAPTIVE REGULARIZATION Code
2021 ICLR LONG-TAILED RECOGNITION BY ROUTING DIVERSE DISTRIBUTION-AWARE EXPERTS ENS+NC, Code, by Zi-Wei Liu
2021 ICLR Long-Tail Learning via Logit Adjustment by Google
2021 AAAI Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks
2021 Arxiv Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
2020 Arxiv ELF: An Early-Exiting Framework for Long-Tailed Classification
2020 CVPR Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective
2020 CVPR Equalization Loss for Long-Tailed Object Recognition
2020 CVPR Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective
2020 ICLR Decoupling representation and classifier for long-tailed recognition Code
2020 NeurIPS Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning Code
2020 NeurIPS Rethinking the Value of Labels for Improving Class-Imbalanced Learning Code
2020 CVPR BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition Code
2019 NeurIPS Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss Code
2019 CVPR Large-Scale Long-Tailed Recognition in an Open World Code, bibtex, by CUHK
2018 - iNaturalist: The iNaturalist 2018 Competition Dataset long-tailed dataset
2017 Arxiv The Devil is in the Tails: Fine-grained Classification in the Wild
2017 NeurIPS Learning to model the tail

eXtreme Multi-label Learning for Information Retrieval

Binary Relevance

Year Venue Title Remark
2019 Machine learning Data Scarcity, Robustness and Extreme Multi-label Classification
2019 WSDM Slice: Scalable linear extreme classifiers trained on 100 million labels for related searches
2017 KDD PPDSparse: A Parallel Primal-Dual Sparse Method for Extreme Classification
2017 AISTATS Label Filters for Large Scale Multilabel Classification
2016 WSDM DiSMEC - Distributed Sparse Machines for Extreme Multi-label Classification
2016 ICML PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

Tree-based Methods

Year Venue Title Remark
2021 KDD Extreme Multi-label Learning for Semantic Matching in Product Search by Amazon, code
2020 arXiv Probabilistic Label Trees for Extreme Multi-label Classification PLT survey, code
2020 arXiv Online probabilistic label trees
2020 AISTATS LdSM: Logarithm-depth Streaming Multi-label Decision Trees Instance tree, C++ code
2019 NeurIPS AttentionXML: Extreme Multi-Label Text Classification with Multi-Label Attention Based Recurrent Neural Networks Label tree
2019 arXiv Bonsai - Diverse and Shallow Trees for Extreme Multi-label Classification Label tree
2018 ICML CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning Instance tree
2018 WWW Parabel: Partitioned Label Trees for Extreme Classification with Application to Dynamic Search Advertising Label tree...by Manik Varma
2016 ICML Extreme F-Measure Maximization using Sparse Probability Estimates Label tree
2016 KDD Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications Instance tree
2014 KDD A Fast, Accurate and Stable Tree-classifier for eXtreme Multi-label Learning Instance tree, python implementation
2013 ICML Label Partitioning For Sublinear Ranking Label tree
2013 WWW Multi-Label Learning with Millions of Labels: Recommending Advertiser Bid Phrases for Web Pages Instance tree, Random Forest, Gini Index
2011 NeurIPS Efficient label tree learning for large scale object recognition Label tree, multi-class
2010 NeurIPS Label embedding trees for large multi-class tasks Label tree, multi-class
2008 ECML Workshop Effective and Efficient Multilabel Classification in Domains with Large Number of Labels Label tree

Embedding-based Methods

Year Venue Title Remark
2019 AAAI Distributional Semantics Meets Multi-Label Learning bibtex
2019 arXiv Ranking-Based Autoencoder for Extreme Multi-label Classification
2019 NeurIPS Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces by Google Research
2017 KDD AnnexML: Approximate Nearest Neighbor Search for Extreme Multi-label Classification
2015 NeurIPS Sparse Local Embeddings for Extreme Multi-label Classification
2014 ICML Large-scale Multi-label Learning with Missing Labels
2014 ICML Multi-label Classification via Feature-aware Implicit Label Space Encoding
2013 ICML Efficient Multi-label Classification with Many Labels
2012 NeurIPS Feature-aware Label Space Dimension Reduction for Multi-label Classification
2011 IJCAI WSABIE: Scaling Up To Large Vocabulary Image Annotation bibtex
2009 NeurIPS Multi-Label Prediction via Compressed Sensing
2008 KDD Extracting Shared Subspaces for Multi-label Classification

Speed-up and Compression

Year Venue Title Remark
2020 KDD Large-Scale Training System for 100-Million Classification at Alibaba Applied Data Science Track
2020 arXiv SOLAR: Sparse Orthogonal Learned and Random Embeddings
2020 ICLR EXTREME CLASSIFICATION VIA ADVERSARIAL SOFTMAX APPROXIMATION
2019 AISTATS Stochastic Negative Mining for Learning with Large Output Spaces by Google
2019 NeurIPS Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products Rice University, bibtex
2019 arXiv An Embarrassingly Simple Baseline for eXtreme Multi-label Prediction
2019 arXiv Accelerating Extreme Classification via Adaptive Feature Agglomeration bibtex, authors from IIT
2019 SDM Fast Training for Large-Scale One-versus-All Linear Classifiers using Tree-Structured Initialization code, bibtex

Novel XML Settings

Year Venue Title Remark
2020 arXiv Extreme Multi-label Classification from Aggregated Labels by Inderjit Dhillon. This paper considers multi-instance learning in XML
2020 arXiv Unbiased Loss Functions for Extreme Classification With Missing Labels by Rohit Babbar. Missing labels
2020 ICML Deep Streaming Label Learning code, by Dacheng Tao, streaming multi-label learning
2016 arXiv Streaming Label Learning for Modeling Labels on the Fly by Dacheng Tao, streaming multi-label learning

Theoretical Studies

Year Venue Title Remark
2019 ICML Sparse Extreme Multi-label Learning with Oracle Property Code, by Weiwei Liu
2019 NeurIPS Multilabel reductions: what is my loss optimising? bibtex, by Google

Text Classification

Year Venue Title Remark
2021 ICML SiameseXML: Siamese Networks meet Extreme Classifiers with 100M Labels
2020 KDD Correlation Networks for Extreme Multi-label Text Classification code
2020 arXiv GNN-XML: Graph Neural Networks for Extreme Multi-label Text Classification
2020 ICML Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification code
2019 ACL Large-Scale Multi-Label Text Classification on EU Legislation Eur-Lex 4.3K, bibtex
2019 arXiv X-BERT: eXtreme Multi-label Text Classification with BERT code by Yiming Yang, Inderjit Dhillon
2019 NeurIPS AttentionXML: Extreme Multi-Label Text Classification with Multi-Label Attention Based Recurrent Neural Networks
2018 EMNLP Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces few-shot, zero-shot, evaluation metric
2018 NeurIPS A no-regret generalization of hierarchical softmax to extreme multi-label classification code, PLT code
2017 SIGIR Deep Learning for Extreme Multi-label Text Classification by Yiming Yang at CMU, bibtex

Others

Label Correlation

Year Venue Title Remark
2019 ICML DL2: Training and Querying Neural Networks with Logic
2015 KDD Discovering and Exploiting Deterministic Label Relationships in Multi-Label Learning
2010 KDD Multi-Label Learning by Exploiting Label Dependency

Long-tailed Continual Learning

Year Venue Title Remark
2020 ECCV Imbalanced Continual Learning with Partitioning Reservoir Sampling

Train/Test Split

Year Venue Title Remark
2021 Arxiv Stratified Sampling for Extreme Multi-Label Data

XML Seminar

Year Venue Title Remark
2019 Dagstuhl Seminar 18291 Extreme Classification

Survey References:

  1. https://arxiv.org/pdf/1901.00248.pdf
  2. http://www.iith.ac.in/~saketha/research/AkshatMTP2018.pdf
  3. http://manikvarma.org/pubs/bengio19.pdf
  4. The Emerging Trends of Multi-Label Learning

XML Datasets link

Extreme Classification Workshops link
