Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Last update: Dec 21, 2022

Overview

Towards End-to-End Image Compression and Analysis with Transformers

Source code of our AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Usage

The code is run with Python 3.7, Pytorch 1.8.1, Timm 0.4.9 and Compressai 1.1.4.

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for the torchvision datasets.ImageFolder, and the training and validation data is expected to be in the train folder and val folder respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg

Pretrained model

The ./pretrained_model provides the pretrained model without compression.

Test

Please adjust --data-path and run sh test.sh:

python main.py --eval --resume ./pretrain_s/checkpoint.pth --model pretrained_model --data-path /path/to/imagenet/ --output_dir ./eval

The ./pretrain_s/checkpoint.pth can be downloaded from Baidu Netdisk, with access code aaai.

Train

Please adjust --data-path and run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model pretrained_model --no-model-ema --clip-grad 1.0 --batch-size 128 --num_workers 16 --data-path /path/to/imagenet/ --output_dir ./ckp_pretrain

Full model

The ./full_model provides the full model with compression.

Test

Please adjust --data-path and --resume, respectively. Run sh test.sh:

python main.py --eval --resume ./ckp_s_q1/checkpoint.pth --model full_model --no-pretrained --data-path /path/to/imagenet/ --output_dir ./eval

The ./ckp_s_q1/checkpoint.pth, ./ckp_s_q2/checkpoint.pth and ./ckp_s_q3/checkpoint.pth can be downloaded from Baidu Netdisk, with access code aaai.

Train

Please download ./pretrain_s/checkpoint.pth from Baidu Netdisk with access code aaai, adjust --data-path and --quality, respectively.

quality	alpha	beta
1	0.1	0.001
2	0.3	0.003
3	0.6	0.006

Run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model full_model --batch-size 128 --num_workers 16 --clip-grad 1.0 --quality 1 --data-path /path/to/imagenet/ --output_dir ./ckp_full

Citation

@InProceedings{Bai2022AAAI,
  title={Towards End-to-End Image Compression and Analysis with Transformers},
  author={Bai, Yuanchao and Yang, Xu and Liu, Xianming and Jiang, Junjun and Wang, Yaowei and Ji, Xiangyang and Gao, Wen},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}

Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Related tags

Overview

Towards End-to-End Image Compression and Analysis with Transformers

Usage

Data preparation

Pretrained model

Full model

Citation

Owner

The project covers common metrics for super-resolution performance evaluation.

Simple keras FCN Encoder/Decoder model for MS-COCO (food subset) segmentation

Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

EqGAN - Improving GAN Equilibrium by Raising Spatial Awareness

Optimising chemical reactions using machine learning

pytorch implementation of GPV-Pose

The code for SAG-DTA: Prediction of Drug–Target Affinity Using Self-Attention Graph Network.

Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)

Multi agent DDPG algorithm written in Python + Pytorch

Kaggle | 9th place single model solution for TGS Salt Identification Challenge

Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction".

My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot

This reposityory contains the PyTorch implementation of our paper "Generative Dynamic Patch Attack".

"Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback"

PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Python implementation of the multistate Bennett acceptance ratio (MBAR)

Pytorch cuda extension of grid_sample1d

Semi-supervised semantic segmentation needs strong, varied perturbations

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Code for Recurrent Mask Refinement for Few-Shot Medical Image Segmentation (ICCV 2021).