iftopt

An Implicit Function Theorem (IFT) optimizer for bi-level optimizations.

Requirements

Python 3.7+
PyTorch 1.x

Installation

$ pip install git+https://github.com/money-shredder/iftopt.git

Usage

Assuming a bi-level optimization of the form:

y* = argmin_{y} val_loss(x*, y), where x* = argmin_{x} train_loss(x, y).
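For reference, the hyper-gradient that an IFT-based method targets can be written out as below. This is the standard derivation, not a description of iftopt's internals. The shorthand f = train_loss and g = val_loss is introduced here only for brevity, and the derivation assumes that the Hessian of f with respect to x is invertible at x*.

```latex
\begin{align}
  % Stationarity of the inner problem at x^*(y)
  \nabla_x f\bigl(x^*(y), y\bigr) &= 0 \\
  % Differentiate the stationarity condition with respect to y
  \nabla^2_{xx} f \,\frac{\partial x^*}{\partial y} + \nabla^2_{xy} f &= 0
  \quad\Longrightarrow\quad
  \frac{\partial x^*}{\partial y} = -\bigl[\nabla^2_{xx} f\bigr]^{-1} \nabla^2_{xy} f \\
  % Chain rule for the outer objective
  \frac{\mathrm{d}}{\mathrm{d}y}\, g\bigl(x^*(y), y\bigr)
  &= \nabla_y g + \Bigl(\frac{\partial x^*}{\partial y}\Bigr)^{\!\top} \nabla_x g
   = \nabla_y g - \bigl(\nabla^2_{xy} f\bigr)^{\!\top} \bigl[\nabla^2_{xx} f\bigr]^{-1} \nabla_x g
\end{align}
```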

To solve for the optimal x* and y* in the optimization problem, we can implement the following with iftopt:

import torch
from iftopt import HyperOptimizer
train_lr = val_lr = 0.1
# parameter to minimize the training loss
x = torch.nn.Parameter(...)
# hyper-parameter to minimize the validation loss
y = torch.nn.Parameter(...)
# training loss optimizer
opt = torch.optim.SGD([x], lr=train_lr)
# validation loss optimizer
hopt = HyperOptimizer(
    [y], torch.optim.SGD([y], lr=val_lr), vih_lr=0.1, vih_iterations=5)
# outer optimization loop for y
for _ in range(...):
    # inner optimization loop for x
    for _ in range(...):
        z = train_loss(x, y)
        # inner optimization step for x
        opt.zero_grad()
        z.backward()
        opt.step()
    # outer optimization step for y
    hopt.set_train_parameters([x])
    z = train_loss(x, y)
    hopt.train_step(z)
    v = val_loss(x, y)
    hopt.val_step(v)
    hopt.grad()
    hopt.step()

For a concrete simple example, please check out and run demo.py, where

train_loss = lambda x, y: (x + y) ** 2
val_loss = lambda x, y: x ** 2

with x = y = 1.0 initially. Running it generates a video, demo.mp4, showing the optimization trajectory. Note that although the validation loss has no direct gradient with respect to the hyper-parameter y, iftopt can still minimize it by computing the hyper-gradient via the implicit function theorem.
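To make the last point concrete, here is a minimal sketch that computes the hyper-gradient for these two losses by hand with plain PyTorch autograd. It does not use iftopt and is not meant to mirror its implementation: the helper names ift_hypergradient and run_demo are hypothetical, the exact scalar division by the Hessian only works because x and y are scalars here, and the step sizes and iteration counts are arbitrary choices for illustration.

```python
import torch


def train_loss(x, y):
    return (x + y) ** 2


def val_loss(x, y):
    return x ** 2


def ift_hypergradient(x, y):
    """Hyper-gradient of val_loss w.r.t. y for scalar x and y.

    Hypothetical helper for illustration only; assumes x is (close to)
    a minimizer of train_loss for the given y.
    """
    # Gradient of the training loss w.r.t. x, kept differentiable.
    (df_dx,) = torch.autograd.grad(train_loss(x, y), x, create_graph=True)
    # Second derivatives: Hessian in x and mixed partial in (x, y).
    (d2f_dxx,) = torch.autograd.grad(df_dx, x, retain_graph=True)
    (d2f_dxy,) = torch.autograd.grad(df_dx, y)
    # Direct gradients of the validation loss; y is unused, so it yields None.
    dg_dx, dg_dy = torch.autograd.grad(val_loss(x, y), (x, y), allow_unused=True)
    if dg_dy is None:
        dg_dy = torch.zeros_like(y)
    # IFT: dx*/dy = -d2f_dxy / d2f_dxx, then apply the chain rule.
    return dg_dy - dg_dx * d2f_dxy / d2f_dxx


def run_demo(outer_steps=50, inner_steps=20, lr=0.1):
    """Alternate inner gradient descent on x with hyper-gradient steps on y."""
    x = torch.tensor(1.0, requires_grad=True)
    y = torch.tensor(1.0, requires_grad=True)
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            (g,) = torch.autograd.grad(train_loss(x, y), x)
            with torch.no_grad():
                x -= lr * g
        h = ift_hypergradient(x, y)
        with torch.no_grad():
            y -= lr * h
    return x, y


if __name__ == "__main__":
    x, y = run_demo()
    print(x.item(), y.item(), val_loss(x, y).item())
```

For this problem the inner optimum is x* = -y, so the validation loss along the optimum is y ** 2 and its derivative is 2 * y. The sketch recovers that value even though val_loss never touches y directly, because the dependence enters through the mixed second derivative of train_loss.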

Owner

The Money Shredder Lab
