CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

Last update: Dec 28, 2022


CvT2DistilGPT2

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

This repository houses the implementation of CvT2DistilGPT2 from [1].
CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.
Checkpoints for CvT2DistilGPT2 on MIMIC-CXR and IU X-Ray are available.
This implementation could be adapted for any image captioning task by modifying the datamodule.


CvT2DistilGPT2 for MIMIC-CXR. Q, K, and V are the queries, keys, and values, respectively, for multi-head attention. * indicates that the linear layers for Q, K, and V are replaced with the convolutional layers depicted below the multi-head attention module. `[BOS]` is the beginning-of-sentence special token. `N_l` is the number of layers for each stage, where `N_l=1`, `N_l=4`, and `N_l=16` for the first, second, and third stage, respectively. The head for DistilGPT2 is the same as that used for language modelling. Subwords produced by DistilGPT2 are separated by a vertical bar.


Installation

The required packages are listed in requirements.txt. Installing them in a virtualenv is recommended:

python3 -m venv --system-site-packages venv
source venv/bin/activate
pip install --upgrade pip
pip install --upgrade -r requirements.txt --no-cache-dir

Datasets

For MIMIC-CXR:

Download MIMIC-CXR-JPG from:

https://physionet.org/content/mimic-cxr-jpg/2.0.0/

Place it in dataset/mimic_cxr_jpg such that the directory dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files exists.

Download the Chen et al. labels for MIMIC-CXR from:

https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing

Place annotations.json in dataset/mimic_cxr_chen.

For IU X-Ray:

Download the Chen et al. labels and the chest X-rays in PNG format for IU X-Ray from:
```
https://drive.google.com/file/d/1c0BXEuDy8Cmm2jfN0YYGkQxFZd2ZIoLg/view
```
Place the files in dataset/iu_x-ray_chen such that dataset/iu_x-ray_chen/annotations.json and dataset/iu_x-ray_chen/images both exist.

Note: the dataset directory can be changed for each task with the variable dataset_dir in that task's paths.yaml, e.g., task/mimic_cxr_jpg_chen/paths.yaml.
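The placement instructions above can be checked before a job is started. The following is a minimal sketch that assumes the default dataset directory name `dataset`; the function name `missing_dataset_paths` and the `EXPECTED` table are hypothetical helpers written for this illustration and are not part of the repository.

```python
from pathlib import Path

# Paths relative to the dataset directory, copied from the placement
# instructions above. The dictionary keys are labels chosen for this sketch.
EXPECTED = {
    "MIMIC-CXR-JPG images": ["mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files"],
    "MIMIC-CXR Chen labels": ["mimic_cxr_chen/annotations.json"],
    "IU X-Ray Chen data": ["iu_x-ray_chen/annotations.json", "iu_x-ray_chen/images"],
}


def missing_dataset_paths(dataset_dir):
    """Return the expected relative paths that do not exist under dataset_dir."""
    root = Path(dataset_dir)
    missing = []
    for rel_paths in EXPECTED.values():
        for rel in rel_paths:
            if not (root / rel).exists():
                missing.append(rel)
    return missing


if __name__ == "__main__":
    # "dataset" is the default directory name used in the instructions above.
    for rel in missing_dataset_paths("dataset"):
        print(f"missing: dataset/{rel}")
```

If dataset_dir has been changed in a task's paths.yaml, pass that directory to the function instead.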

Checkpoints

The checkpoints for MIMIC-CXR and IU X-Ray can be found at https://doi.org/10.25919/hbqx-2p71 (the download link is located at the top right). Place the checkpoints in the experiment directory for each version of each task, e.g., experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/epoch=0-val_chen_cider=0.410965.ckpt

Note: the experiment directory can be changed for each task with the variable exp_dir in that task's paths.yaml, e.g., task/mimic_cxr_jpg_chen/paths.yaml.
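The example checkpoint filename above appears to encode the epoch and a validation metric. The sketch below shows one way to read those values back out, which could help when comparing several downloaded checkpoints. The pattern is inferred from that single example, so other checkpoints may be named differently; `parse_checkpoint_name` is a hypothetical helper, not a function from the repository.

```python
import re
from pathlib import Path

# Pattern inferred from the one example filename in this README (assumption).
CKPT_PATTERN = re.compile(
    r"epoch=(?P<epoch>\d+)-(?P<metric>\w+)=(?P<value>\d+(?:\.\d+)?)\.ckpt$"
)


def parse_checkpoint_name(path):
    """Split a checkpoint filename into (epoch, metric name, metric value)."""
    match = CKPT_PATTERN.search(Path(path).name)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {path}")
    return int(match["epoch"]), match["metric"], float(match["value"])


if __name__ == "__main__":
    example = (
        "experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/"
        "epoch=0-val_chen_cider=0.410965.ckpt"
    )
    print(parse_checkpoint_name(example))
```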

Instructions

The model configurations for each task can be found in its config directory, e.g. task/mimic_cxr_jpg_chen/config.
A job for a model is described in the task's jobs.yaml file, e.g. task/mimic_cxr_jpg_chen/jobs.yaml.

To test the CvT2DistilGPT2 + SCST checkpoint, set task/mimic_cxr_jpg_chen/jobs.yaml to (default):

cvt_21_to_distilgpt2_scst:
    train: 0
    test: 1
    debug: 0
    num_nodes: 1
    num_gpus: 1
    num_workers: 5
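A job entry like the one above can be sanity-checked before launching. The sketch below represents the entry as a plain dictionary with the same keys and flags a few likely mistakes. The rules it applies (all six keys present, flags limited to 0 or 1, at least one of train or test enabled) are assumptions drawn from the examples in this README, not documented requirements, and `job_problems` is a hypothetical helper.

```python
# Keys copied from the jobs.yaml examples in this README.
FLAGS = ("train", "test", "debug")
COUNTS = ("num_nodes", "num_gpus", "num_workers")


def job_problems(job):
    """Return a list of human-readable problems found in a job entry."""
    problems = []
    # Assumption: every key shown in the examples is expected to be present.
    for key in FLAGS + COUNTS:
        if key not in job:
            problems.append(f"missing key: {key}")
    # Assumption: flags are 0 or 1, as in every example.
    for key in FLAGS:
        if key in job and job[key] not in (0, 1):
            problems.append(f"{key} should be 0 or 1")
    # Assumption: a job that neither trains nor tests has nothing to do.
    if not (job.get("train") or job.get("test")):
        problems.append("neither train nor test is enabled")
    return problems


if __name__ == "__main__":
    # Mirrors the cvt_21_to_distilgpt2_scst entry above.
    test_only = {"train": 0, "test": 1, "debug": 0,
                 "num_nodes": 1, "num_gpus": 1, "num_workers": 5}
    print(job_problems(test_only))
```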

To train CvT2DistilGPT2 with teacher forcing and then test, set task/mimic_cxr_jpg_chen/jobs.yaml to:

cvt_21_to_distilgpt2:
    train: 1
    test: 1
    debug: 0
    num_nodes: 1
    num_gpus: 1
    num_workers: 5

or with Slurm:

cvt_21_to_distilgpt2:
    train: 1
    test: 1
    debug: 0
    num_nodes: 1
    num_gpus: 1
    num_workers: 5
    resumable: 1
    sbatch: 1
    time_limit: 1-00:00:00
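The time_limit value above resembles Slurm's days-hours:minutes:seconds notation, so 1-00:00:00 would correspond to one day. The sketch below converts such a string into a `timedelta`, assuming that notation is what the repository expects; only this one form is handled, and `parse_time_limit` is a hypothetical helper rather than repository code.

```python
import re
from datetime import timedelta

# Matches only the D-HH:MM:SS form used in the example above.
_TIME_LIMIT = re.compile(r"^(\d+)-(\d{1,2}):(\d{2}):(\d{2})$")


def parse_time_limit(text):
    """Convert a 'D-HH:MM:SS' string into a datetime.timedelta."""
    match = _TIME_LIMIT.match(text)
    if match is None:
        raise ValueError(f"expected D-HH:MM:SS, got: {text!r}")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    limit = parse_time_limit("1-00:00:00")
    print(limit.total_seconds())
```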

To run the job:

python3 main.py --task mimic_cxr_jpg_chen

Note: data from the job will be saved in the experiment directory.

Reference

[1] Aaron Nicolson, Jason Dowling, and Bevan Koopman, Improving Chest X-Ray Report Generation by Leveraging Warm-Starting, Under review (January 2022)


Owner

The Australian e-Health Research Centre
