Manifold-Mixup implementation for fastai V2

Last update: Jul 25, 2022

Overview

Manifold Mixup

Unofficial implementation of ManifoldMixup (Proceedings of ICML 19) for fast.ai (V2), based on Shivam Saboo's PyTorch implementation of manifold mixup and fastai's input mixup implementation, plus some improvements and variants that I developed with lessw2020.

This package provides four additional callbacks to the fastai learner:

ManifoldMixup which implements ManifoldMixup
OutputMixup which implements a variant that does the mixup only on the output of the last layer (this variant performed better in a benchmark and in an independent blog post)
DynamicManifoldMixup which lets you use manifold mixup with a schedule to increase difficulty progressively
DynamicOutputMixup which lets you use output mixup with a schedule to increase difficulty progressively

Usage

For a minimal demonstration of the various callbacks and their parameters, see the Demo notebook.

Mixup

To use manifold mixup, you need to import manifold_mixup and pass the corresponding callback to the cbs argument of your learner:

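from manifold_mixup import ManifoldMixup
# data and model are your own fastai DataLoaders and PyTorch model
learner = Learner(data, model, cbs=ManifoldMixup())
learner.fit(8)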

The ManifoldMixup callback takes three parameters:

alpha=0.4: parameter of the Beta distribution used to sample the interpolation weight
use_input_mixup=True: whether mixup may also be applied to the inputs
module_list=None: can be used to pass an explicit list of target modules

The OutputMixup variant takes only the alpha parameter.
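To make the role of alpha concrete, here is a minimal standalone sketch of the sampling and interpolation step. It uses only the Python standard library and is not the code of this package: the function names sample_lambda and mixup are hypothetical, and the real callbacks operate on PyTorch tensors.

```python
import random

def sample_lambda(alpha, rng):
    """Draw an interpolation weight in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)

def mixup(a, b, lam):
    """Blend two equally sized vectors: lam * a + (1 - lam) * b."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
lam = sample_lambda(0.4, rng)
mixed = mixup([1.0, 0.0], [0.0, 1.0], lam)
```

With alpha below 1, the Beta distribution puts most of its mass near 0 and 1, so most mixed samples stay close to one of the two originals. Larger values of alpha produce heavier blending.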

Dynamic mixup

Dynamic callbacks, which are available via dynamic_mixup, take three parameters instead of the single alpha parameter:

alpha_min=0.0: the initial, minimum value for the parameter of the Beta distribution used to sample the interpolation weight (we recommend keeping it at 0)
alpha_max=0.6: the final, maximum value for the parameter of the Beta distribution used to sample the interpolation weight
scheduler=SchedCos: the scheduling function describing the evolution of alpha from alpha_min to alpha_max

The default schedulers are SchedLin, SchedCos, SchedNo, SchedExp and SchedPoly. See the Annealing section of fastai2's documentation for more information on the available schedulers, how to combine them, and how to provide your own.
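As an illustration of what such a schedule does, the following standalone sketch computes alpha with a cosine interpolation between alpha_min and alpha_max. It is a re-implementation for demonstration only, assuming the training position is given as a fraction between 0 and 1. The names cosine_schedule and alpha_at are hypothetical and are not part of fastai or of this package.

```python
import math

def cosine_schedule(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1.0 + math.cos(math.pi * (1.0 - pos))) * (end - start) / 2.0

def alpha_at(pos, alpha_min=0.0, alpha_max=0.6):
    """Value of alpha after a fraction pos of training has elapsed."""
    return cosine_schedule(alpha_min, alpha_max, pos)

# alpha at 0%, 25%, 50%, 75% and 100% of training
alphas = [alpha_at(p / 4) for p in range(5)]
```

Starting from alpha_min=0 means training begins with almost no mixing, and the amount of mixing grows smoothly as alpha approaches alpha_max.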

Notes

Which modules will be instrumented by ManifoldMixup?

ManifoldMixup tries to establish a sensible list of modules on which to apply mixup:

it uses a user-provided module_list if possible
otherwise it uses only the modules wrapped with ManifoldMixupModule
if none are found, it defaults to modules with Block or Bottleneck in their name (targeting mostly resblocks)
finally, if needed, it defaults to all modules that are not included in the non_mixable_module_types list

The non_mixable_module_types list contains mostly recurrent layers, but you can add elements to it to define module classes that should not be used for mixup. Do not hesitate to create an issue or start a PR to add common modules to the default list.
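The fallback order above can be sketched as a plain Python function. This is only an illustration under stated assumptions: select_modules and its arguments are hypothetical, matching is done on the class name, and the dummy classes stand in for real PyTorch layers. The package's actual selection code may differ.

```python
def select_modules(modules, module_list=None, wrapper_type=None, non_mixable_types=()):
    """Return candidate modules following the fallback order described above."""
    # 1. An explicit user-provided list always wins.
    if module_list:
        return list(module_list)
    # 2. Otherwise, use modules explicitly wrapped by the user.
    if wrapper_type is not None:
        wrapped = [m for m in modules if isinstance(m, wrapper_type)]
        if wrapped:
            return wrapped
    # 3. Otherwise, match class names that look like residual blocks.
    blocks = [m for m in modules
              if any(key in type(m).__name__ for key in ("Block", "Bottleneck"))]
    if blocks:
        return blocks
    # 4. Last resort: everything except the excluded module types.
    return [m for m in modules if not isinstance(m, tuple(non_mixable_types))]

# Hypothetical stand-ins for real layers, only for demonstration.
class ResBlock: pass
class Linear: pass
class LSTM: pass

layers = [Linear(), ResBlock(), LSTM()]
picked_blocks = select_modules(layers)
picked_fallback = select_modules([layers[0], layers[2]], non_mixable_types=[LSTM])
```

In the first call only the ResBlock instance is selected. In the second call no block-like module exists, so the function falls back to every module that is not of an excluded type.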

When can I use OutputMixup?

OutputMixup applies the mixup directly to the output of the last layer. This only works if the loss function applies something like a softmax to that output, and not when the output is used directly, as it is for regression.

Thus, OutputMixup cannot be used for regression.
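The following standalone sketch shows the idea for a classification loss. It assumes the usual mixup convention of blending the losses against both targets with the same weight. The function names are hypothetical and the code works on plain lists rather than tensors, so it is an illustration, not this package's implementation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample and an integer class index."""
    return -log_softmax(logits)[target]

def output_mixup_loss(logits_a, target_a, logits_b, target_b, lam):
    """Mix the raw outputs of two samples, then blend the losses for both targets."""
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(logits_a, logits_b)]
    return (lam * cross_entropy(mixed, target_a)
            + (1.0 - lam) * cross_entropy(mixed, target_b))

loss = output_mixup_loss([2.0, 0.0], 0, [0.0, 2.0], 1, lam=0.5)
```

The mixing happens on the raw outputs, before the softmax inside the loss turns them into class probabilities.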

A note on skip-connections / residual-blocks

The performance of ManifoldMixup (this does not apply to OutputMixup) degrades sharply when it is applied inside a residual block. This is because the mixed-up values become incoherent with the output of the skip connection, which has not been mixed.

While this implementation is equipped to work around the problem for U-Net-like and ResNet-like architectures, you might run into problems (negligible improvements over the baseline) with other network structures. In that case, the best way to apply manifold mixup is to manually select the modules to be instrumented.
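A toy scalar example can make the incoherence visible. It is only a sketch: f is a hypothetical residual branch, and the numbers are chosen for illustration. Mixing the output of the whole block keeps the skip path and the branch consistent, whereas mixing only the branch leaves the skip path carrying the unmixed input.

```python
def f(x):
    """Hypothetical non-linear residual branch."""
    return x * x

def residual_block(x):
    """Plain residual block: skip connection plus branch output."""
    return x + f(x)

def mix(a, b, lam):
    """Interpolate between two values with weight lam."""
    return lam * a + (1.0 - lam) * b

x1, x2, lam = 1.0, 3.0, 0.5

# Mixing the output of the whole block: skip path and branch are mixed together.
after_block = mix(residual_block(x1), residual_block(x2), lam)

# Mixing inside the block: the branch is mixed, the skip path still carries x1.
inside_block = x1 + mix(f(x1), f(x2), lam)

# The difference is exactly the part of the skip connection that was not mixed.
skip_gap = after_block - inside_block
```

Here the gap equals (1 - lam) * (x2 - x1), which is the contribution of the second sample that never reached the skip connection.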

For more unofficial fastai extensions, see the Fastai Extensions Repository.

Owner

Nestor Demeure
