This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Overview

Wide-Networks

This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameterizations of NNs as defined by (Yang & Hu 2021). Although an equivalent description can be given using only ac-parameterizations, we keep the 3 scales (a, b and c) in the code to allow more flexibility depending on how we want to approach the problem of dealing with infinitely wide NNs.

Structure of the code

The BaseModel class

All the code related to neural networks is in the directory pytorch. The different models we have implemented are in this directory along with the base class found in the file base_model.py which implements the generic attributes and methods all our NNs classes will share.

The BaseModel class inherits from the Pytorch Lightning module, and essentially defines the necessary attributes for any NN to work properly, namely the architecture (which is defined in the _build_model() method), the activation function (we consider the same activation function at each layer), the loss function, the optimizer and the initializer for the parameters of the network.

Optionally, the BaseModel class can define attributes for the normalization (e.g. BatchNorm, LayerNorm, etc) and the scheduler, and any of the aforementioned attributes (optional or not) can be customized depending on the needs (see examples for the scheduler of ipllr and the initializer of abc_param).

The ModelConfig class

All the hyper-parameters which define the model (depth, width, activation function name, loss name, optimizer name, etc) have to be passed as argument to _init_() as an object of the class ModelConfig (pytorch/configs/model.py). This class reads from a yaml config file which defines all the necessary objects for a NN (see examples in pytorch/configs). Essentially, the class ModelConfig is here so that one only has to set the yaml config file properly and then the attributes are correctly populated in BaseModel via the class ModelConfig.

abc-parameterizations

The code for abc-parameterizations (Yang & Hu 2021) can be found in pytorch/abc_params. There we define the base class for abc-parameterizations, mainly setting the layer, init and lr scales from the values of a,b,c, as well as defining the initial parameters through Gaussians of appropriate variance depending on the value of b and the activation function.

Everything that is architecture specific (fully-connected, conv, residual, etc) is left out of this base class and has to be implemented in the _build_model() method of the child class (see examples in pytorch/abc_params/fully_connected). We also define there the base classes for the ntk, muP (Yang & Hu 2021), ip and ipllr parameterizations, and there fully-connected implementations in pytorch/abc_params/fully_connected.

Experiment runs

Setup

Before running any experiment, make sure you first install all the necessary packages:

pip3 install -r requirements.txt

You can optionally create a virtual environment through

python3 -m venv your_env_dir

then activate it with

source your_env_dir/bin/activate

and then install the requirements once the environment is activated. Now, if you haven't installed the wide-networks library in site-packages, before running the command for your experiment, make sure you first add the wide-networks library to the PYTHONPATH by running the command

export PYTHONPATH=$PYTHONPATH:"$PWD"

from the root directory (wide-networks/.) of where the wide-networks library is located.

Python jobs

We define python jobs which can be run with arguments from the command line in the directory jobs. Mainly, those jobs launch a training / val / test pipeline for a given model using the Lightning module, and the results are collected in a dictionary which is saved to a pickle file a the end of training for later examination. Additionally, metrics are logged in TensorBoard and can be visualized during training with the command

tensorboard --logdir=`your_experiment_dir`

We have written jobs to launch experiments on MNIST and CIFAR-10 with the fully connected version of different models such as muP (Yang & Hu 2021), IP-LLR, Naive-IP which can be found in jobs/abc_parameterizations. Arguments can be passed to those Python scripts through the command line, but they are optional and the default values will be used if the parameters of the script are not manually set. For example, the command

python3 jobs/abc_parameterizations/fc_muP_run.py --activation="relu" --n_steps=600 --dataset="mnist"

will launch a training / val / test pipeline with ReLU as the activation function, 600 SGD steps and the MNIST dataset. The other parameters of the run (e.g. the base learning rate and batch size) will have their default values. The jobs will automatically create a directory (and potentially subdirectories) for the experiment and save there the python logs, the tensorboard events and the results dictionary saved to a pickle file as well as the checkpoints saved for the network.

Visualizing results

To visualize the results after training for a given experiment, one can launch the notebook experiments-results.ipynb located in pytorch/notebooks/training/abc_parameterizations, and simply change the arguments in the "Set variables" cell to load the results from the corresponding experiment. Then running all the cells will produce (and save) some figures related to the training phase (e.g. loss vs. steps).

Owner
Karl Hajjar
PhD student at Laboratoire de Mathématiques d'Orsay
Karl Hajjar
Cross-media Structured Common Space for Multimedia Event Extraction (ACL2020)

Cross-media Structured Common Space for Multimedia Event Extraction Table of Contents Overview Requirements Data Quickstart Citation Overview The code

Manling Li 49 Nov 21, 2022
Car Price Predictor App used to predict the price of the car based on certain input parameters created using python's scikit-learn, fastapi, numpy and joblib packages.

Pricefy Car Price Predictor App used to predict the price of the car based on certain input parameters created using python's scikit-learn, fastapi, n

Siva Prakash 1 May 10, 2022
Official implementation of Rethinking Graph Neural Architecture Search from Message-passing (CVPR2021)

Rethinking Graph Neural Architecture Search from Message-passing Intro The GNAS can automatically learn better architecture with the optimal depth of

Shaofei Cai 48 Sep 30, 2022
This is the PyTorch implementation of GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation

Official PyTorch repo for GAN's N' Roses. Diverse im2im and vid2vid selfie to anime translation.

1.1k Jan 01, 2023
LSTMs (Long Short Term Memory) RNN for prediction of price trends

Price Prediction with Recurrent Neural Networks LSTMs BTC-USD price prediction with deep learning algorithm. Artificial Neural Networks specifically L

5 Nov 12, 2021
A script depending on VASP output for calculating Fermi-Softness.

Fermi softness calculation for Vienna Ab initio Simulation Package (VASP) Update 1.1.0: Big update: Rewrote the code. Use Bader atomic division instea

qslin 11 Nov 08, 2022
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch

Lie Transformer - Pytorch (wip) Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch. Only the SE3 version will be present in thi

Phil Wang 78 Oct 26, 2022
Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows

Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows This is the official implementation of the ICCV 2021 Paper "Probabilistic Mono

62 Nov 23, 2022
An Open-Source Tool for Automatic Disease Diagnosis..

OpenMedicalChatbox An Open-Source Package for Automatic Disease Diagnosis. Overview Due to the lack of open source for existing RL-base automated diag

8 Nov 08, 2022
HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep.

HODEmu HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep. and emulates satellite abundance as a function of co

Antonio Ragagnin 1 Oct 13, 2021
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Back to the Feature with PixLoc We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via

Computer Vision and Geometry Lab 610 Jan 05, 2023
PAthological QUpath Obsession - QuPath and Python conversations

PAQUO: PAthological QUpath Obsession Welcome to paquo 👋 , a library for interacting with QuPath from Python. paquo's goal is to provide a pythonic in

Bayer AG 60 Dec 31, 2022
(CVPR 2022 Oral) Official implementation for "Surface Representation for Point Clouds"

RepSurf - Surface Representation for Point Clouds [CVPR 2022 Oral] By Haoxi Ran* , Jun Liu, Chengjie Wang ( * : corresponding contact) The pytorch off

Haoxi Ran 264 Dec 23, 2022
AttentionGAN for Unpaired Image-to-Image Translation & Multi-Domain Image-to-Image Translation

AttentionGAN-v2 for Unpaired Image-to-Image Translation AttentionGAN-v2 Framework The proposed generator learns both foreground and background attenti

Hao Tang 530 Dec 27, 2022
High performance distributed framework for training deep learning recommendation models based on PyTorch.

High performance distributed framework for training deep learning recommendation models based on PyTorch.

340 Dec 30, 2022
AlphaNet Improved Training of Supernet with Alpha-Divergence

AlphaNet: Improved Training of Supernet with Alpha-Divergence This repository contains our PyTorch training code, evaluation code and pretrained model

Facebook Research 87 Oct 10, 2022
A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising (CVPR 2020 Oral & TPAMI 2021)

ELD The implementation of CVPR 2020 (Oral) paper "A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising" and its journal (TPAMI) v

Kaixuan Wei 359 Jan 01, 2023
PyTorch implementation of Federated Learning with Non-IID Data, and federated learning algorithms, including FedAvg, FedProx.

Federated Learning with Non-IID Data This is an implementation of the following paper: Yue Zhao, Meng Li, Liangzhen Lai, Naveen Suda, Damon Civin, Vik

Youngjoon Lee 48 Dec 29, 2022
Exploring Versatile Prior for Human Motion via Motion Frequency Guidance (3DV2021)

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance [Video Demo] [Paper] Installation Requirements Python 3.6 PyTorch 1.1.0 Pleas

Jiachen Xu 19 Oct 28, 2022
Lepard: Learning Partial point cloud matching in Rigid and Deformable scenes

Lepard: Learning Partial point cloud matching in Rigid and Deformable scenes [Paper] Method overview 4DMatch Benchmark 4DMatch is a benchmark for matc

103 Jan 06, 2023