Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

Last update: Dec 05, 2022

Overview

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv

MMCoref_cleaned

Code for the MMCoref task of the SIMMC 2.0 dataset.
Pretrained vision-language models adapted from Transformers-VQA.
Zero-shot visual feature extraction using CLIP and BUTD.
Zero-shot extraction of non-visual prefab features (flattened into strings) using BERT and SBERT.
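The "flattened into strings" step above can be pictured with a minimal sketch. The attribute names and the `key: value` layout below are hypothetical; the actual string format produced by the repository's scripts may differ. The resulting string is what would be handed to a text encoder such as BERT or SBERT.

```python
def flatten_prefab(metadata, keys=None):
    """Flatten one object's non-visual attributes into a single string.

    `metadata` is a plain dict; the field names used below are
    illustrative assumptions, not the dataset's real schema.
    """
    keys = sorted(metadata) if keys is None else keys
    parts = []
    for key in keys:
        value = metadata.get(key)
        if value is None:
            continue  # skip attributes the object does not have
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{key}: {value}")
    return "; ".join(parts)


# Hypothetical prefab entry, for illustration only.
example = {"brand": "ExampleBrand", "price": 24.99, "available_sizes": ["S", "M"]}
print(flatten_prefab(example))
```

Sorting the keys keeps the string layout stable across objects, so the same attribute always appears in the same position.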

Dependencies

requirements.txt

Download the data and pretrained/trained model checkpoints

Data: Put the data in ./data. Unpack all images into ./data/all_images and all scene jsons (including the teststd split) into ./data/simmc2_scene_jsons_dstc10_public/public.
Pretrained models: Checkpoints go in ./pretrained and ./model/Transformers-VQA-master/models/pretrained. Download links are in the placeholder.txt file in each of these folders.
Trained models: Checkpoints go in ./trained. Download links are in ./trained/placeholder.txt.
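Before preprocessing or training, it can help to confirm that the folders named above exist. The following is a small sketch, not part of the repository; it only checks for the directories listed in this section and assumes it is run from the repository root.

```python
from pathlib import Path

# Directories named in this README; nothing else is assumed about their contents.
EXPECTED_DIRS = [
    "data/all_images",
    "data/simmc2_scene_jsons_dstc10_public/public",
    "pretrained",
    "model/Transformers-VQA-master/models/pretrained",
    "trained",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dirs():
        print(f"missing: {d}")
```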

Preprocess

Convert the json files. The ./scripts/converter.py script for this step is currently not working (the latest version was lost), so download the processed data instead.
Get BERT/SBERT embeddings of non-visual prefab features using ./scripts/{get_KB_embedding, get_KB_embedding_SBERT, get_KB_embedding_no_duplicate}.py
Get CLIP/BUTD embeddings for images using ./scripts/get-visual-features-{CLIP, RCNN}.ipynb
Or download everything using the links in ./processed/placeholder.txt

Train

Training scripts are under ./sh/train. The arguments in each script show which inputs it uses.

Inference and evaluate

Scripts are under ./sh/infer_eval (devtest split) and ./sh/infer_eval_dev (dev split).
Outputs are written to ./output, in the same format as the original dialogue json.
Logits are written to ./output/logit as {dialogue_idx: {round_idx: [[logit, label], ...]}}.
Run ./scripts/output_filter_error.py to select and reformat error cases.
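As a rough illustration of how the logit files can be consumed, the sketch below walks the {dialogue_idx: {round_idx: [[logit, label], ...]}} structure and computes precision, recall and F1. It assumes that a label of 1 marks a referred object and that a logit above 0 counts as a positive prediction; neither assumption is stated in this README, and the repository's own evaluation may differ.

```python
import json


def load_logits(path):
    """Load a logit file; `path` is whichever file was written under ./output/logit."""
    with open(path) as f:
        return json.load(f)


def score_logits(logits, threshold=0.0):
    """Compute (precision, recall, f1) over all [logit, label] pairs.

    The threshold of 0.0 and the meaning of label == 1 are assumptions.
    """
    tp = fp = fn = 0
    for rounds in logits.values():
        for pairs in rounds.values():
            for logit, label in pairs:
                predicted = logit > threshold
                if predicted and label:
                    tp += 1
                elif predicted:
                    fp += 1
                elif label:
                    fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Tiny made-up example in the documented format.
toy = {"0": {"0": [[2.1, 1], [-0.5, 0]], "1": [[0.3, 0], [-1.2, 1]]}}
print(score_logits(toy))
```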

Ensemble

cd script
python ensemble --method optuna

The blended output is saved to output/logit/blended_devtest.json
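One common way to blend per-model logits is a weighted average, with the weights chosen by a search procedure such as Optuna. The sketch below shows only the averaging step on the documented logit format; whether the ensemble script blends exactly this way is an assumption, and `blend_logits` is a hypothetical helper, not a function from the repository.

```python
def blend_logits(model_outputs, weights):
    """Weighted average of logits from several models.

    Each element of `model_outputs` follows the documented format
    {dialogue_idx: {round_idx: [[logit, label], ...]}} and is assumed to
    cover the same dialogues, rounds and objects in the same order.
    """
    if len(model_outputs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    blended = {}
    for d_idx, rounds in model_outputs[0].items():
        blended[d_idx] = {}
        for r_idx, pairs in rounds.items():
            merged = []
            for i, (_, label) in enumerate(pairs):
                score = sum(
                    w * out[d_idx][r_idx][i][0]
                    for w, out in zip(weights, model_outputs)
                ) / total
                merged.append([score, label])
            blended[d_idx][r_idx] = merged
    return blended


# Two made-up model outputs for a single dialogue round.
model_a = {"0": {"0": [[2.0, 1], [-1.0, 0]]}}
model_b = {"0": {"0": [[0.0, 1], [1.0, 0]]}}
print(blend_logits([model_a, model_b], [3, 1]))
```

A search over the weights would call a function like this repeatedly and keep the weights that score best on a held-out split.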

Owner

Yichen (William) Huang
