RealFormer-Pytorch Implementation of RealFormer using pytorch

Last update: Dec 08, 2022

Related tags

Overview

RealFormer-Pytorch

Implementation of RealFormer using pytorch. Includes comparison with classical Transformer on image classification task (ViT) wrt CIFAR-10 dataset.

Original Paper of the model : https://arxiv.org/abs/2012.11747

So how are RealFormers at vision tasks?

Run the train.py with

model = ViR(
        image_pix = 32,
        patch_pix = 4,
        class_cnt = 10,
        layer_cnt = 4
    )

to Test how RealFormer works on CIFAR-10 dataset compared to just classical ViT, which is

model = ViT(
        image_pix = 32,
        patch_pix = 4,
        class_cnt = 10,
        layer_cnt = 4
    )

... which is of course, much, much smaller version of ViT compared to the origianl ones ().

Results

Model : layers = 4, hidden_dim = 128, feedforward_dim = 512, head_cnt = 4

Trained 10 epochs

After 10'th epoch, Realformer achieves 65.45% while Transformer achieves 64.59% RealFormer seems to consistently have about 1% greater accuracy, which seems reasonable (as the papaer suggested simillar result)

Model : layers = 8, hidden_dim = 128, feedforward_dim = 512, head_cnt = 4

Having 4 more layers obviously improves in general, and still, RealFormer consistently wins in terms of accuracy (68.3% vs 66.3%). Notice that larger the model, bigger the difference seems to follow here too. (I wonder how much of difference it would make on ViT-Large)

When it comes to computation time, there was almost zero difference. (I guess adding residual attention score is O(L^2) operation, compared to matrix multiplication in softmax which is O(L^2 * D))

Conclusion

Use RealFormer. It benifits with almost zero additional resource!

To make a custom RealFormer for other tasks

Its not a pip package, but you can use the ResEncoderBlock module in the models.py to make a Encoder Only Transformer like the following :

import ResEncoderBlock from models

def RealFormer(nn.Module):
...
  def __init__(self, ...):
  ...
    self.mains = nn.Sequential(*[ResEncoderBlock(emb_s = 32, head_cnt = 8, dp1 = 0.1, dp2 = 0.1) for _ in range(layer_cnt)])
  ...
  def forward(self, x):
  ...
    prev = None
    for resencoder in self.mains:
        x, prev = resencoder(x, prev = prev)
  ...
    return x

If you're not really clear what is going on or what to do, request me to make this a pip package.

RealFormer-Pytorch Implementation of RealFormer using pytorch

Related tags

Overview

RealFormer-Pytorch

So how are RealFormers at vision tasks?

Results

Conclusion

To make a custom RealFormer for other tasks

Owner

Simo Ryu

Face uncertainty quantification or estimation using PyTorch.

A visualization tool to show a TensorFlow's graph like TensorBoard

The code is the training example of AAAI2022 Security AI Challenger Program Phase 8: Data Centric Robot Learning on ML models.

[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

existing and custom freqtrade strategies supporting the new hyperstrategy format.

Locally Constrained Self-Attentive Sequential Recommendation

Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021

[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

This project contains an implemented version of Face Detection using OpenCV and Mediapipe. This is a code snippet and can be used in projects.

Experimental Python implementation of OpenVINO Inference Engine (very slow, limited functionality). All codes are written in Python. Easy to read and modify.

StellarGraph - Machine Learning on Graphs

particle tracking model, works with the ROMS output file(qck.nc, his.nc)

This git repo contains the implementation of my ML project on Heart Disease Prediction

Near-Duplicate Video Retrieval with Deep Metric Learning

Implementation for "Manga Filling Style Conversion with Screentone Variational Autoencoder" (SIGGRAPH ASIA 2020 issue)

Evaluation framework for testing segmentation networks in PyTorch

Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

Tensorflow implementation of "Learning Deep Features for Discriminative Localization"

Official Code Implementation of the paper : XAI for Transformers: Better Explanations through Conservative Propagation

LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021