CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

Overview

CvT2DistilGPT2

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

  • This repository houses the implementation of CvT2DistilGPT2 from [1].
  • CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.
  • Checkpoints for CvT2DistilGPT2 on MIMIC-CXR and IU X-Ray are available.
  • This implementation could be adapted for any image captioning task by modifying the datamodule.

CvT2DistilGPT2 for MIMIC-CXR. Q, K, and V are the queries, keys, and values, respectively, for multi-head attention. * indicates that the linear layers for Q, K, and V are replaced with the convolutional layers depicted below the multi-head attention module. [BOS] is the beginning-of-sentence special token. N_l is the number of layers for each stage, where N_l=1, N_l=4, and N_l=16 for the first, second, and third stage, respectively. The head for DistilGPT2 is the same used for language modelling. Subwords produced by DistilGPT2 are separated by a vertical bar.

Installation

The required packages are located in requirements.txt. It is recommended that these are installed in a virtualenv:

python3 -m venv --system-site-packages venv
source venv/bin/activate
pip install --upgrade pip
pip install --upgrade -r requirements.txt --no-cache-dir

Datasets

For MIMIC-CXR:

  1. Download MIMIC-CXR-JPG from:

    https://physionet.org/content/mimic-cxr-jpg/2.0.0/
    
  2. Place in dataset/mimic_cxr_jpg such that dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files.

  3. Download the Chen et al. labels for MIMIC-CXR from:

    https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing
    
  4. Place annotations.json in dataset/mimic_cxr_chen

For IU X-Ray:

  1. Download the Chen et al. labels and the chest X-rays in png format for IU X-Ray from:
    https://drive.google.com/file/d/1c0BXEuDy8Cmm2jfN0YYGkQxFZd2ZIoLg/view
    
  2. Place files into dataset/iu_x-ray_chen such that dataset/iu_x-ray_chen/annotations.json and dataset/iu_x-ray_chen/images.

#####Note: the dataset directory can be changed for each task with the variable dataset_dir in task/mimic_cxr_jpg_chen/paths.yaml and task/mimic_cxr_jpg_chen/paths.yaml

Checkpoints

The checkpoints for MIMIC-CXR and IU X-Ray can be found at (the download link is located at the top right): https://doi.org/10.25919/hbqx-2p71. Place the checkpoints in the experiment directory for each version of each task, e.g., experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/epoch=0-val_chen_cider=0.410965.ckpt #####Note: the experiment directory can be changed for each task with the variable exp_dir in task/mimic_cxr_jpg_chen/paths.yaml and task/mimic_cxr_jpg_chen/paths.yaml

Instructions

  • The model configurations for each task can be found in its config directory, e.g. task/mimic_cxr_jpg_chen/config.

  • A job for a model is described in the tasks jobs.yaml file, e.g. task/mimic_cxr_jpg_chen/jobs.yaml.

  • To test the CvT2DistilGPT2 + SCST checkpoint, set task/mimic_cxr_jpg_chen/jobs.yaml to (default):

    cvt_21_to_distilgpt2_scst:
        train: 0
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    
  • To train CvT2DistilGPT2 with teacher forcing and then test, set task/mimic_cxr_jpg_chen/jobs.yaml to:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    

    or with Slurm:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
        resumable: 1
        sbatch: 1
        time_limit: 1-00:00:00
    
  • To run the job:

    python3 main.py --task mimic_cxr_jpg_chen

#####Note: data from the job will be saved in the experiment directory.

Reference

[1] Aaron Nicolson, Jason Dowling, and Aaron Nicolson, Improving Chest X-Ray Report Generation by Leveraging Warm-Starting, Under review (January 2022)

Owner
The Australian e-Health Research Centre
The Australian e-Health Research Centre
JstDoS - HTTP Protocol Stack Remote Code Execution Vulnerability

jstDoS If you are going to skid that, please give credits ! ^^ ¿How works? This

apolo 4 Feb 11, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

Swin Transformer for Semantic Segmentation of satellite images This repo contains the supported code and configuration files to reproduce semantic seg

23 Oct 10, 2022
code for paper"A High-precision Semantic Segmentation Method Combining Adversarial Learning and Attention Mechanism"

PyTorch implementation of UAGAN(U-net Attention Generative Adversarial Networks) This repository contains the source code for the paper "A High-precis

Tong 8 Apr 25, 2022
Do you like Quick, Draw? Well what if you could train/predict doodles drawn inside Streamlit? Also draws lines, circles and boxes over background images for annotation.

Streamlit - Drawable Canvas Streamlit component which provides a sketching canvas using Fabric.js. Features Draw freely, lines, circles, boxes and pol

Fanilo Andrianasolo 325 Dec 28, 2022
An unreferenced image captioning metric (ACL-21)

UMIC This repository provides an unferenced image captioning metric from our ACL 2021 paper UMIC: An Unreferenced Metric for Image Captioning via Cont

hwanheelee 14 Nov 20, 2022
POPPY (Physical Optics Propagation in Python) is a Python package that simulates physical optical propagation including diffraction

POPPY: Physical Optics Propagation in Python POPPY (Physical Optics Propagation in Python) is a Python package that simulates physical optical propaga

Space Telescope Science Institute 132 Dec 15, 2022
Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning

Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning. Circuit Training is an open-s

Google Research 479 Dec 25, 2022
Code for Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks Under construction. Description Code for Phase diagram of S

Rodrigo Veiga 3 Nov 24, 2022
Conversational text Analysis using various NLP techniques

PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb

Rita Anjana 158 Dec 25, 2022
Predicting the duration of arrival delays for commercial flights.

Flight Delay Prediction Our objective is to predict arrival delays of commercial flights. According to the US Department of Transportation, about 21%

Jordan Silke 1 Jan 11, 2022
Official Pytorch Implementation of: "Semantic Diversity Learning for Zero-Shot Multi-label Classification"(2021) paper

Semantic Diversity Learning for Zero-Shot Multi-label Classification Paper Official PyTorch Implementation Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Bar

28 Aug 29, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI 2022)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf] The official repository for TransReID: Transformer-based Object Re-Identificati

DamoCV 569 Dec 30, 2022
Official Implementation of HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation by Lukas Hoyer, Dengxin Dai, and Luc Van Gool [Arxiv] [Paper] Overview Unsup

Lukas Hoyer 149 Dec 28, 2022
Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices Abstract For practical deep neural network design on mobile devices, it is e

11 Dec 30, 2022
A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196

img_sussifier A python script to convert images to animated sus among us crewmate twerk jifs as seen on r/196 Examples How to use install python pip i

41 Sep 30, 2022
This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).

TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta

Raymond 247 Dec 28, 2022
SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from UAV Images.

SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from UAV Images (IEEE GRSL 2021) Code (based on mmdetection) for SSPNet: Scale Selec

Italian Cannon 37 Dec 28, 2022
Notification Triggers for Python

Notipyer Notification triggers for Python Send async email notifications via Python. Get updates/crashlogs from your scripts with ease. Installation p

Chirag Jain 17 May 16, 2022
[NeurIPS '21] Adversarial Attacks on Graph Classification via Bayesian Optimisation (GRABNEL)

Adversarial Attacks on Graph Classification via Bayesian Optimisation @ NeurIPS 2021 This repository contains the official implementation of GRABNEL,

Xingchen Wan 12 Dec 23, 2022