Overview

Bayesian-Torch: Bayesian neural network layers for uncertainty estimation

Get started | Example usage | Documentation | License | Citing

Bayesian layers and utilities to perform stochastic variational inference in PyTorch

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.
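
For illustration, here is a minimal sketch of what that replacement looks like for a small fully connected network. It assumes LinearReparameterization can be imported from bayesian_torch.layers (the layer also appears in the issues below); unlike nn.Linear, its forward call returns both the output and the layer's KL-divergence contribution.

    import torch.nn as nn
    import torch.nn.functional as F
    from bayesian_torch.layers import LinearReparameterization

    # Deterministic two-layer network ...
    class DeterministicMLP(nn.Module):
        def __init__(self, in_dim, hidden_dim, out_dim):
            super().__init__()
            self.fc1 = nn.Linear(in_dim, hidden_dim)
            self.fc2 = nn.Linear(hidden_dim, out_dim)

        def forward(self, x):
            return self.fc2(F.relu(self.fc1(x)))

    # ... and its Bayesian counterpart: each nn.Linear is swapped for a
    # LinearReparameterization layer, whose forward returns (output, kl).
    class BayesianMLP(nn.Module):
        def __init__(self, in_dim, hidden_dim, out_dim):
            super().__init__()
            self.fc1 = LinearReparameterization(in_dim, hidden_dim)
            self.fc2 = LinearReparameterization(hidden_dim, out_dim)

        def forward(self, x):
            kl_sum = 0
            x, kl = self.fc1(x)
            kl_sum += kl
            x, kl = self.fc2(F.relu(x))
            kl_sum += kl
            return x, kl_sum

The accumulated KL term is then added to the task loss during training (see the training sketch further below).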

The repository has implementations for the following Bayesian layers:

  • Variational layers with reparameterized Monte Carlo estimators (Blundell et al. 2015)

    LinearReparameterization
    Conv1dReparameterization, Conv2dReparameterization, Conv3dReparameterization, ConvTranspose1dReparameterization, ConvTranspose2dReparameterization, ConvTranspose3dReparameterization
    LSTMReparameterization

  • Variational layers with Flipout Monte Carlo estimators (Wen et al. 2018)

    LinearFlipout
    Conv1dFlipout, Conv2dFlipout, Conv3dFlipout, ConvTranspose1dFlipout, ConvTranspose2dFlipout, ConvTranspose3dFlipout
    LSTMFlipout

  • Variational layers with Gaussian mixture model (GMM) posteriors using reparameterized Monte Carlo estimators (in pre-alpha)

    LinearMixture
    Conv1dMixture, Conv2dMixture, Conv3dMixture, ConvTranspose1dMixture, ConvTranspose2dMixture, ConvTranspose3dMixture
    LSTMMixture
    

Please refer to the documentation of the Bayesian layers for details.

Other features include the dnn_to_bnn API (introduced in release v0.2.0; see Releases below), which converts a deterministic deep neural network model of any architecture into its Bayesian counterpart by drop-in replacement of Convolutional, Linear, and LSTM layers; a minimal usage sketch follows.
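
The sketch below is based on the v0.2.0 release notes and on the get_kl_loss helper referenced in the issues below; the module path, the prior/posterior parameter keys, and the assumption that the converted model keeps its original forward signature (with the KL term collected separately via get_kl_loss) may differ in your installed version.

    import torch
    import torchvision
    from bayesian_torch.models.dnn_to_bnn import dnn_to_bnn, get_kl_loss

    # Hypothetical prior/posterior settings for the converted Bayesian layers.
    const_bnn_prior_parameters = {
        "prior_mu": 0.0,
        "prior_sigma": 1.0,
        "posterior_mu_init": 0.0,
        "posterior_rho_init": -3.0,
        "type": "Reparameterization",   # or "Flipout"
        "moped_enable": False,          # True would initialize from pretrained weights
        "moped_delta": 0.5,
    }

    model = torchvision.models.resnet18()
    dnn_to_bnn(model, const_bnn_prior_parameters)   # replaces supported layers in place

    x = torch.randn(2, 3, 224, 224)
    logits = model(x)          # same forward signature as the deterministic model
    kl = get_kl_loss(model)    # total KL divergence accumulated by the Bayesian layers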

Installation

Install from source:

git clone https://github.com/IntelLabs/bayesian-torch
cd bayesian-torch
pip install .

This code has been tested on PyTorch v1.6.0 and torchvision v0.7.0 with Python 3.7.7.

Dependencies:

  • Create conda environment with python=3.7
  • Install PyTorch and torchvision packages within conda environment following instructions from PyTorch install guide
  • conda install -c conda-forge accimage
  • pip install tensorboard
  • pip install scikit-learn

Example usage

We have provided example model implementations using the Bayesian layers.

We also provide example usage and scripts to train and evaluate the models. Instructions for the CIFAR10 examples are provided below; similar scripts for ImageNet and MNIST are available.

cd bayesian_torch

Training

To train Bayesian ResNet on CIFAR10, run this command:

Mean-field variational inference (Reparameterized Monte Carlo estimator)

sh scripts/train_bayesian_cifar.sh

Mean-field variational inference (Flipout Monte Carlo estimator)

sh scripts/train_bayesian_flipout_cifar.sh

To train deterministic ResNet on CIFAR10, run this command:

Vanilla

sh scripts/train_deterministic_cifar.sh
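
For reference, the Bayesian training scripts above minimize a Monte Carlo estimate of the ELBO, i.e. the task loss plus a scaled KL term collected from the Bayesian layers. Below is a minimal sketch of one optimization step; it is not taken from the provided scripts, it assumes a model whose forward returns (output, kl) as in the overview example, and num_mc and kl_weight are hypothetical knobs (kl_weight is typically one over the number of mini-batches).

    import torch.nn.functional as F

    def elbo_step(model, optimizer, x, target, num_mc=1, kl_weight=1.0):
        """One optimization step: average the cross-entropy and KL terms
        over num_mc Monte Carlo weight samples, then backpropagate."""
        optimizer.zero_grad()
        ce_loss, kl_loss = 0.0, 0.0
        for _ in range(num_mc):
            output, kl = model(x)                     # fresh weight sample per pass
            ce_loss = ce_loss + F.cross_entropy(output, target)
            kl_loss = kl_loss + kl
        loss = ce_loss / num_mc + kl_weight * kl_loss / num_mc
        loss.backward()
        optimizer.step()
        return loss.item()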

Evaluation

To evaluate Bayesian ResNet on CIFAR10, run this command:

Mean-field variational inference (Reparameterized Monte Carlo estimator)

sh scripts/test_bayesian_cifar.sh

Mean-field variational inference (Flipout Monte Carlo estimator)

sh scripts/test_bayesian_flipout_cifar.sh

To evaluate deterministic ResNet on CIFAR10, run this command:

Vanilla

sh scripts/test_deterministic_cifar.sh
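
For reference, predictive uncertainty at test time comes from drawing several stochastic forward passes through the Bayesian layers and looking at the spread of the predictions. Below is a minimal sketch; it is not part of the provided scripts and again assumes a model whose forward returns (output, kl).

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def predict_with_uncertainty(model, x, num_mc=20):
        """Return the predictive mean and the per-class standard deviation
        across num_mc stochastic forward passes."""
        model.eval()
        probs = []
        for _ in range(num_mc):
            output, _ = model(x)                  # weights are re-sampled on every pass
            probs.append(F.softmax(output, dim=1))
        probs = torch.stack(probs)                # shape: (num_mc, batch, num_classes)
        return probs.mean(dim=0), probs.std(dim=0)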

Citing

If you use this code, please cite as:

@misc{krishnan2020bayesiantorch,
    author = {Ranganath Krishnan and Piero Esposito},
    title = {Bayesian-Torch: Bayesian neural network layers for uncertainty estimation},
    year = {2020},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/IntelLabs/bayesian-torch}}
}

Please also cite the weight sampling methods: Blundell et al. 2015 (reparameterization) and Wen et al. 2018 (Flipout).

Contributors

  • Ranganath Krishnan
  • Piero Esposito

This code is intended for researchers and developers; it enables quantifying principled uncertainty estimates from deep neural network predictions using stochastic variational inference in Bayesian neural networks. Feedback, issues, and contributions are welcome. Email [email protected] with any questions.

Comments
  • The average should be taken over log probability rather than logits

    https://github.com/IntelLabs/bayesian-torch/blob/7abcfe7ff3811c6a5be6326ab91a8d5cb1e8619f/bayesian_torch/examples/main_bayesian_cifar.py#L363-L367 I think the average across the MC runs should be taken over the log probability. However, the output here is the logits before the softmax operation. I think we may first run output = F.log_softmax(output, dim=1) and then take the average.

    There are two equivalent ways to do this, both of which I think are more reasonable than averaging the logits. The first is:

    output_ = []
    kl_ = []
    for mc_run in range(args.num_mc):
        output, kl = model(input_var)
        output = F.log_softmax(output, dim=1)
        output_.append(output)
        kl_.append(kl)
    output = torch.mean(torch.stack(output_), dim=0)
    loss = F.nll_loss(output, target_var)  # this replaces the original cross_entropy loss
    

    Or equivalently, we can first take the cross-entropy loss for each MC run, and average the losses at the end:

    kl_ = []
    loss = 0
    for mc_run in range(args.num_mc):
        output, kl = model(input_var)
        loss = loss + F.cross_entropy(output, target_var)
        kl_.append(kl)
    loss = loss / args.num_mc  # this replaces the original cross_entropy loss
    
    question 
    opened by Nebularaid2000 5
  • KL divergence not changing during training

    Hello,

    I am trying to make a single-layer BNN using the LinearReparameterization layer. I am unable to get it to give reasonable uncertainty estimates, so I started monitoring the KL term from the layer and noticed that it does not change at all from epoch to epoch. Even when I scale up the KL term in the loss, it remains unchanged.

    I am not sure if this is a bug, or if I am not doing the training correctly.

    My model

    class BNN(nn.Module):
        def __init__(self, input_dim, hidden_dim, output_dim=1):
            super().__init__()
            self.layer1 = LinearReparameterization(input_dim, hidden_dim)
            self.layerf = nn.Linear(hidden_dim, output_dim)
            
        def forward(self, x):
            kl_sum = 0
            x, kl = self.layer1(x)
            kl_sum += kl
            x = F.relu(x)
            x = self.layerf(x)
            return x, kl_sum
    

    and my training loop

    model = BNN(X_train.shape[-1], 100).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    criterion = torch.nn.MSELoss()
    
    for epoch in pbar:
            running_kld_loss = 0
            running_mse_loss = 0
            running_loss = 0
            for datapoints, labels in dataloader_train:
                optimizer.zero_grad()
                
                output, kl = model(datapoints)
                kl = get_kl_loss(model)
                
                # calculate loss with kl term for Bayesian layers
                mse_loss = criterion(output, labels)
                loss = mse_loss + kl * kld_beta / batch_size
                
                loss.backward()
                optimizer.step()
                
                running_mse_loss += mse_loss.detach().numpy()
                running_kld_loss += kl.detach().numpy()
                running_loss += loss.detach().numpy()
                
            status.update({
                'Epoch': epoch, 
                'loss': running_loss/len(dataloader_train),
                'kl': running_kld_loss/len(dataloader_train),
                'mse': running_mse_loss/len(dataloader_train)
            })
    

    When I print the KL loss, it starts at ~5.0 and does not decrease at all.

    opened by gkwt 4
  • Pytorch version issue

    Hi, it seems that this software only supports PyTorch 1.8.1 LTS or newer. When I try to use this code in my own project (which is mainly based on PyTorch 1.4.0), some conflicts occur and seem unavoidable. Could you please update this software so that it also supports older PyTorch versions, for example 1.4.0? Many thanks!

    opened by hebowei2000 4
  • conv_transpose2d parameter order problem

    https://github.com/IntelLabs/bayesian-torch/blob/f6f516e9b3721466aa0036c735475a421cc3ce80/bayesian_torch/layers/variational_layers/conv_variational.py#L784-L786

    When I try to convert a deterministic autoencoder with a ConvTranspose2d layer into a Bayesian one, I get the error "Exception has occurred: TypeError conv_transpose2d(): argument 'groups' (position 7) must be int, not tuple", which I suspect comes from self.dilation and self.groups being swapped.

    opened by pierreosselin 3
  • Let the models return prediction only, saving KL Divergence as an attribute

    Closes #7 .

    Lets the user, if they want, return predictions only from the forward method, while saving the KL divergence as an attribute. This is important for making it easier to integrate into PyTorch models.

    Also, it does not break the lib as it is: we added a new parameter to the forward method that defaults to True and, if manually set to False, makes the layer return predictions only.

    Performed the following changes, on all layers:

    • Added return_kl to all forward methods, defaulting to True. If set to False, the layer won't return kl.
    • Added a new kl attribute to each layer, updated at every forward step. Useful when integrating with already-built PyTorch models.

    That should help with integrating into PyTorch experiments while keeping backward compatibility for this lib.

    enhancement 
    opened by piEsposito 3
  • Fix KL computation

    Hello,

    I think there might be a few problems in your model definitions.

    In particular:

    • in resnet_flipout.py and resnet_variational.py you only sum the kl of the last block inside self.layerN
    • in resnet_flipout_large.py and resnet_variational_large.py you check for is None while you probably want is not None, or actually no check at all, since it can't be None in any reasonable setting. Also, the str(layer) check is odd since layer is a BasicBlock or BottleNeck object (you're looping over an nn.Sequential of blocks); in fact that string check is very likely superfluous (I didn't test this, but I did include it in this PR as an example)

    I hope you can confirm and perhaps fix these issues, which will help me (and maybe others) in building on your nice codebase :)

    opened by y0ast 3
  • Enable the Bayesian layer to freeze the parameters to their mean values

    I think it would be good to provide an option to freeze the weights and biases to their mean values at inference time. The forward function would then look something like this:

    def forward(self, input, sample=True):
        if sample:
            # do sampling and forward as in the current code
        else:
            # set weight=self.mu_weight
            # set bias=self.mu_bias
            # (optional) set kl=0, since it is useless in this case
        return out, kl
    
    opened by Nebularaid2000 2
  • FR: Enable forward method of Bayesian Layers to return value only for smoother integration with PyTorch

    It would be nice if we could store the KL divergence value as an attribute of the Bayesian layers and return it from the forward method only if needed.

    With that, there is less friction when integrating with PyTorch: users can "plug and play" bayesian-torch layers in deterministic models.

    It would be something like that:

    def forward(self, x, return_kl=False):
        ...
        self.kl = kl
    
        if return_kl:
            return out, kl
        return out   
    

    We can then get it from the Bayesian layers when calculating the loss, with no harm or hard changes to the code, which might encourage users to try the lib.

    I can work on that also.

    enhancement 
    opened by piEsposito 2
  • Modifying the scripts for any arbitrary output size

    I am trying to use this repo for classification with 3 output nodes. However, I get a size-mismatch error in the fully connected layer. Can you help me with this issue?

    question 
    opened by narminGhaffari 2
  • Add mu_kernel to delta_kernel in flipout layers?

    Hi,

    Shouldn't mu_kernel be added to delta_kernel in the code below?

    Best, Lewis

    diff --git a/bayesian_torch/layers/flipout_layers/conv_flipout.py b/bayesian_torch/layers/flipout_layers/conv_flipout.py
    index 4b3e88d..719cfdc 100644
    --- a/bayesian_torch/layers/flipout_layers/conv_flipout.py
    +++ b/bayesian_torch/layers/flipout_layers/conv_flipout.py
    @@ -165,7 +165,7 @@ class Conv1dFlipout(BaseVariationalLayer_):
             sigma_weight = torch.log1p(torch.exp(self.rho_kernel))
             eps_kernel = self.eps_kernel.data.normal_()
     
    -        delta_kernel = (sigma_weight * eps_kernel)
    +        delta_kernel = (sigma_weight * eps_kernel) + self.mu_kernel 
     
             kl = self.kl_div(self.mu_kernel, sigma_weight, self.prior_weight_mu,
                              self.prior_weight_sigma)
    
    opened by burntcobalt 2
  • Kernel_size

    Conv2dReparameterization only allows square kernels (e.g., 2×2). However, some CNN models have non-square kernels (e.g., in InceptionResNetV2, the kernel size in block17 is 1×7).

    So, I modified the code in conv_variational.py from:

            self.mu_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size))
            self.rho_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size))
            self.register_buffer(
                'eps_kernel',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
            self.register_buffer(
                'prior_weight_mu',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
            self.register_buffer(
                'prior_weight_sigma',
                torch.Tensor(out_channels, in_channels // groups, kernel_size,
                             kernel_size),
                persistent=False)
    

    to

            self.mu_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]))
            self.rho_kernel = Parameter(
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]))
            self.register_buffer(
                'eps_kernel',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
            self.register_buffer(
                'prior_weight_mu',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
            self.register_buffer(
                'prior_weight_sigma',
                torch.Tensor(out_channels, in_channels // groups, kernel_size[0],
                             kernel_size[1]),
                persistent=False)
    

    Also, kernel_size=d.kernel_size[0] was changed to kernel_size=d.kernel_size in dnn_to_cnn.py.

    enhancement 
    opened by flydephone 1
  • How can a Bayesian network be used for quantization?

    @peteriz @jpablomch @ranganathkrishnan We can get the $\mu$ and $\sigma$ of each weight, so how can they be used for quantization, i.e., mapping $w_{float32}$ to $w_{int8}$?

    enhancement 
    opened by LeopoldACC 1
Releases(v0.3.0)
  • v0.3.0(Dec 14, 2022)

  • v0.2.0-alpha(Jan 27, 2022)

    Includes the new dnn_to_bnn feature: an API to convert a deterministic deep neural network (DNN) model of any architecture to a Bayesian deep neural network (BNN) model, simplifying the model definition via drop-in replacement of Convolutional, Linear, and LSTM layers with the corresponding Bayesian layers. This enables seamless conversion of existing topologies of larger models to Bayesian deep neural network models for uncertainty-aware applications.

    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Jan 27, 2022)

    Includes the new dnn_to_bnn feature: an API to convert a deterministic deep neural network (DNN) model of any architecture to a Bayesian deep neural network (BNN) model, simplifying the model definition via drop-in replacement of Convolutional, Linear, and LSTM layers with the corresponding Bayesian layers. This enables seamless conversion of existing topologies of larger models to Bayesian deep neural network models for uncertainty-aware applications.

    Full Changelog: https://github.com/IntelLabs/bayesian-torch/compare/v0.1...v0.2.0

    Source code(tar.gz)
    Source code(zip)
Owner
Intel Labs