Implementation of the HMAX model of vision in PyTorch

Overview

PyTorch implementation of HMAX

A PyTorch implementation of the HMAX model that closely follows the MATLAB implementation by the Laboratory for Computational Cognitive Neuroscience:

http://maxlab.neuro.georgetown.edu/hmax.html

The S and C units of the HMAX model map almost directly onto PyTorch's Conv2d and MaxPool2d layers, with channels storing the filters for the different orientations. However, HMAX also operates at multiple scales, which does not map neatly onto the existing functionality. Therefore, each scale has its own Conv2d layer, and these layers are executed in parallel.
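As a rough illustration of the per-scale layout described above, the sketch below builds one Conv2d per filter size, with one output channel per orientation, and applies every layer to the same input. The class name `MultiScaleS1`, the filter sizes and the randomly initialised weights are assumptions made for illustration; the layers in this repository may be organised differently.

```python
import torch
from torch import nn


class MultiScaleS1(nn.Module):
    """Hypothetical S1-like layer: one Conv2d per scale, one channel per orientation."""

    def __init__(self, sizes=(7, 9, 11, 13), num_orientations=4):
        super().__init__()
        # Odd kernel sizes with padding=size // 2 keep the output the same
        # height and width as the input at every scale.
        self.convs = nn.ModuleList([
            nn.Conv2d(1, num_orientations, kernel_size=size,
                      padding=size // 2, bias=False)
            for size in sizes
        ])

    def forward(self, img):
        # One output tensor per scale, each (batch, orientations, H, W).
        return [conv(img) for conv in self.convs]


s1 = MultiScaleS1()
outputs = s1(torch.rand(2, 1, 64, 48))
print([tuple(o.shape) for o in outputs])
```

Keeping the scales in a list rather than stacking them into one tensor leaves room for each scale to have a different filter size.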

Here is a schematic overview of the network architecture:

layers consisting of units with increasing scale
S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1
 \ /   \ /   \ /   \ /   \ /   \ /   \ /   \ /
  C1    C1    C1    C1    C1    C1    C1    C1
   \     \     \    |     /     /     /     /
           ALL-TO-ALL CONNECTIVITY
   /     /     /    |     \     \     \     \
  S2    S2    S2    S2    S2    S2    S2    S2
   |     |     |     |     |     |     |     |
  C2    C2    C2    C2    C2    C2    C2    C2
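The pairing at the top of the schematic, where two neighbouring S1 scales feed one C1 unit, can be sketched as a spatial max-pool on each scale followed by an element-wise maximum over the pair. The function name `c1_pool`, the pooling size and the assumption that all scales share the same spatial size are illustrative choices, not taken from the repository.

```python
import torch
import torch.nn.functional as F


def c1_pool(s1_outputs, pool_size=8):
    """Hypothetical C1 step: max over local positions and pairs of adjacent scales."""
    c1_outputs = []
    for small, large in zip(s1_outputs[0::2], s1_outputs[1::2]):
        # Max over local spatial neighbourhoods within each scale.
        small = F.max_pool2d(small, pool_size, stride=pool_size // 2)
        large = F.max_pool2d(large, pool_size, stride=pool_size // 2)
        # Max over the two scales of the pair.
        c1_outputs.append(torch.max(small, large))
    return c1_outputs


# 16 S1 scales, each (batch, orientations, height, width).
s1_outputs = [torch.rand(1, 4, 64, 64) for _ in range(16)]
c1_outputs = c1_pool(s1_outputs)
print(len(c1_outputs), tuple(c1_outputs[0].shape))
```

With 16 S1 scales this yields 8 C1 outputs, matching the 16-to-8 reduction drawn in the schematic.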

Installation

This script depends on the NumPy, SciPy, PyTorch and TorchVision packages.

Clone the repository somewhere, change into it, and run the example.py script:

git clone https://github.com/wmvanvliet/pytorch_hmax
cd pytorch_hmax
python example.py

Usage

See the example.py script for how to run the model on 10 example images.

Comments
  • Provide direct (not nested) path to stimuli

    Hi,

    Great repo and effort. I really admire your courage to write HMAX in Python. I have a question about loading data, namely about this part of the code: https://github.com/wmvanvliet/pytorch_hmax/blob/master/example.py#L18

    I know that, by default, ImageFolder expects nested folders (as stated in the docs and mentioned in this issue), but that is quite clumsy in this case. For example, in your own example, having a subfolder for each photo just doesn't look good. Is there a way around this? Any suggestion on how to provide a single path to all images instead of this nested structure? I have read some discussions but haven't figured out how to implement it.


    One more question (I didn't want to open an extra issue for it): shouldn't https://github.com/wmvanvliet/pytorch_hmax/blob/master/example.py#L28 use batch_size=len(images) instead of batch_size=10 (written symbolically)?

    Thanks.

    opened by jankaWIS 5
  • Input of non-square images fails

    Hi again, I was playing around a bit and discovered that the model fails for non-square images, i.e. where height != width. Maybe I was looking in the wrong place or missed something, but I haven't found this mentioned anywhere, and the docs kind of suggest that the input can have any height and any width. The same goes for the description of the layers (e.g. s1). In the other issue, you mentioned that

    One thing you may want to add to this transformer pipeline is a transforms.Resize followed by a transforms.CenterCrop to ensure all images end up having the same height and width

    but didn't mention why. Why is it not possible to use non-square images? Is there any workaround if one doesn't want to crop? Maybe padding, like in this post*?

    To demonstrate the issue:

    import os
    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms
    import pickle
    
    import hmax
    
    path_hmax = './'
    # Initialize the model with the universal patch set
    print('Constructing model')
    model = hmax.HMAX(os.path.join(path_hmax,'universal_patch_set.mat'))
    
    # A folder with example images
    example_images = datasets.ImageFolder(
        os.path.join(path_hmax,'example_images'),
        transform=transforms.Compose([
            transforms.Resize((400, 500)),
            transforms.CenterCrop((400, 500)),
            transforms.Grayscale(),
            transforms.ToTensor(),
            transforms.Lambda(lambda x: x * 255),
        ])
    )
    
    # A dataloader that will run through all example images in one batch
    dataloader = DataLoader(example_images, batch_size=10)
    
    # Determine whether there is a compatible GPU available
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    
    # Run the model on the example images
    print('Running model on', device)
    model = model.to(device)
    for X, y in dataloader:
        s1, c1, s2, c2 = model.get_all_layers(X.to(device))
    
    print('[done]')
    

    will give an error in the forward function:

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-7-a6bab15d9571> in <module>()
         33 model = model.to(device)
         34 for X, y in dataloader:
    ---> 35     s1, c1, s2, c2 = model.get_all_layers(X.to(device))
         36 
         37 # print('Saving output of all layers to: output.pkl')
    
    4 frames
    /gdrive/MyDrive/Colab Notebooks/data_HMAX/pytorch_hmax/hmax.py in forward(self, c1_outputs)
        285             conv_output = conv_output.view(
        286                 -1, self.num_orientations, self.num_patches, conv_output_size,
    --> 287                 conv_output_size)
        288 
        289             # Pool over orientations
    
    RuntimeError: shape '[-1, 4, 400, 126, 126]' is invalid for input of size 203616000
    

    *Code for that:

    import torchvision.transforms.functional as F
    from torchvision import transforms
    
    class SquarePad:
        def __call__(self, image):
            max_wh = max(image.size)
            p_left, p_top = [(max_wh - s) // 2 for s in image.size]
            p_right, p_bottom = [max_wh - (s+pad) for s, pad in zip(image.size, [p_left, p_top])]
            padding = (p_left, p_top, p_right, p_bottom)
            return F.pad(image, padding, 0, 'constant')
    
    target_image_size = (224, 224)  # as an example
    # now use it as the replacement of transforms.Pad class
    transform=transforms.Compose([
        SquarePad(),
        transforms.Resize(target_image_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    
    opened by jankaWIS 1
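The numbers in the RuntimeError above can be checked by hand. Assuming the batch holds 10 images (batch_size=10 in the reproduction script), the tensor size 203616000 factors as 10 x 4 x 400 x 101 x 126, so the convolution output is 101 by 126 rather than 126 by 126, and a view that reuses a single `conv_output_size` for both spatial dimensions cannot match. The sketch below reproduces this with a smaller, made-up number of patches and shows that a view with separate height and width succeeds. It only illustrates the shape arithmetic and is not a patch taken from the repository.

```python
import torch

# Figures from the traceback; the batch size of 10 is an assumption
# based on batch_size=10 in the reproduction script.
total = 203616000
batch, num_orientations, num_patches = 10, 4, 400
positions = total // (batch * num_orientations * num_patches)
print(positions, 101 * 126, 126 * 126)  # 12726 12726 15876

# Same situation at a smaller scale (8 patches instead of 400).
height, width, small_patches = 101, 126, 8
conv_output = torch.zeros(batch, num_orientations * small_patches, height, width)

try:
    # Mirrors the failing call: one size used for both spatial dimensions.
    conv_output.view(-1, num_orientations, small_patches, width, width)
    square_view_ok = True
except RuntimeError:
    square_view_ok = False

# Keeping height and width separate gives a valid shape.
reshaped = conv_output.view(-1, num_orientations, small_patches, height, width)
print(square_view_ok, tuple(reshaped.shape))
```

Whether the rest of the model can handle non-square feature maps is a separate question that this sketch does not answer.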
Releases(v0.2)
  • v0.2(Jul 7, 2022)

    For this version, I've modified the HMAX code a bit to exactly match the original MATLAB code by Maximilian Riesenhuber. This version is a bit slower and consumes a bit more memory, as the code needs to work around some subtle differences between the MATLAB and PyTorch functions. Perhaps in the future, we could add an "optimized" model that is allowed to deviate from the reference implementation for increased efficiency, but for now I feel it is more important to follow the reference implementation to the letter.

    Major change: default C2 activation function is now 'euclidean' instead of 'gaussian'.

  • v0.1(Jul 7, 2022)

Owner
Marijn van Vliet
Research Software Engineer.