VLG-Net: Video-Language Graph Matching Networks for Video Grounding

Introduction

Official repository for VLG-Net: Video-Language Graph Matching Networks for Video Grounding. [ArXiv Preprint]

The paper was accepted to the first edition of the ICCV workshop: AI for Creative Video Editing and Understanding (CVEU).

Installation

Clone the repository and move into its folder:

git clone https://github.com/Soldelli/VLG-Net.git
cd VLG-Net

Install the environment:

conda env create -f environment.yml

If the installation fails, follow the instructions in doc/environment.md.

Data

Download the following resources and extract each one into its destination folder, as listed in the table.

Resource                  Download Link   File Size   Destination Folder
StandfordCoreNLP-4.0.0    link            ~0.5GB      ./datasets/
TACoS                     link            ~0.5GB      ./datasets/
ActivityNet-Captions      link            ~29GB       ./datasets/
DiDeMo                    link            ~13GB       ./datasets/
GCNeXt warmup             link            ~0.1GB      ./datasets/
Pretrained Models         link            ~0.1GB      ./models/

The folder structure should be as follows:

.
├── configs
│
├── datasets
│   ├── activitynet1.3
│   │    ├── annotations
│   │    └── features
│   ├── didemo
│   │    ├── annotations
│   │    └── features
│   ├── tacos
│   │    ├── annotations
│   │    └── features
│   ├── gcnext_warmup
│   └── standford-corenlp-4.0.0
│
├── doc
│
├── lib
│   ├── config
│   ├── data
│   ├── engine
│   ├── modeling
│   ├── structures
│   └── utils
│
├── models
│   ├── activitynet
│   └── tacos
│
├── outputs
│
└── scripts
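
The layout above can be checked before training with a short script. This is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and the list covers only the dataset and model folders shown in the tree.

```python
from pathlib import Path

# Directories copied from the folder structure listed above
# (only the ones filled by the downloaded resources).
EXPECTED_DIRS = [
    "datasets/activitynet1.3/annotations",
    "datasets/activitynet1.3/features",
    "datasets/didemo/annotations",
    "datasets/didemo/features",
    "datasets/tacos/annotations",
    "datasets/tacos/features",
    "datasets/gcnext_warmup",
    "datasets/standford-corenlp-4.0.0",
    "models/activitynet",
    "models/tacos",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected directories are present.")
```

Run it from the repository root; an empty result means every listed folder is in place.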

Training

Copy and paste the following commands into the terminal.

Load environment:

conda activate vlg

  • For the ActivityNet-Captions dataset, run:
    python train_net.py --config-file configs/activitynet.yml OUTPUT_DIR outputs/activitynet
  • For the TACoS dataset, run:
    python train_net.py --config-file configs/tacos.yml OUTPUT_DIR outputs/tacos
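
Both training commands follow the same pattern: a config file under configs/ and an OUTPUT_DIR under outputs/, each named after the dataset. The sketch below builds the command from a dataset name; the helper `build_train_command` is hypothetical, and only the two dataset names shown above are known to have matching config files.

```python
import shlex


def build_train_command(dataset):
    """Build the training command for a dataset name such as 'tacos'."""
    return [
        "python", "train_net.py",
        "--config-file", f"configs/{dataset}.yml",
        "OUTPUT_DIR", f"outputs/{dataset}",
    ]


if __name__ == "__main__":
    # Only these two dataset names appear in the commands above.
    for dataset in ("activitynet", "tacos"):
        print(shlex.join(build_train_command(dataset)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from a shell where the `vlg` environment is already active.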

Evaluation

For convenience, we provide scripts that run inference with the pretrained models. To run inference with a different model, see the scripts for details.

Load environment:

conda activate vlg

Then run one of the following scripts to launch the evaluation.

  • For the ActivityNet-Captions dataset, run:
    bash scripts/activitynet.sh
  • For the TACoS dataset, run:
    bash scripts/tacos.sh

Expected results:

After cleaning the code and fixing a couple of minor bugs, performance changed slightly with respect to the numbers reported in the paper. See the tables below.

ActivityNet   Rank1@0.5   Rank1@0.7   Rank5@0.5   Rank5@0.7
Paper         46.32       29.82       77.15       63.33
Current       46.32       29.79       77.19       63.36

TACoS     Rank1@0.1   Rank1@0.3   Rank1@0.5   Rank5@0.1   Rank5@0.3   Rank5@0.5
Paper     57.21       45.46       34.19       81.80       70.38       56.56
Current   57.16       45.56       34.14       81.48       70.13       56.34
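
To see how small the changes are, the per-column differences between the "Current" and "Paper" rows can be computed directly. This is an illustrative sketch using only the numbers in the two tables, kept in column order; the function names are hypothetical.

```python
# Values copied from the tables above, in column order.
PAPER = {
    "ActivityNet": [46.32, 29.82, 77.15, 63.33],
    "TACoS": [57.21, 45.46, 34.19, 81.80, 70.38, 56.56],
}
CURRENT = {
    "ActivityNet": [46.32, 29.79, 77.19, 63.36],
    "TACoS": [57.16, 45.56, 34.14, 81.48, 70.13, 56.34],
}


def deltas(dataset):
    """Current minus paper, per column, rounded to two decimals."""
    return [round(c - p, 2) for p, c in zip(PAPER[dataset], CURRENT[dataset])]


def max_abs_delta(dataset):
    """Largest absolute change across the columns of one table."""
    return max(abs(d) for d in deltas(dataset))


if __name__ == "__main__":
    for name in PAPER:
        print(name, deltas(name), "max |delta| =", max_abs_delta(name))
```

The largest change is 0.04 points on ActivityNet and 0.32 points on TACoS.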

Citation

If any part of our paper or code is helpful to your work, please cite:

@inproceedings{soldan2021vlg,
  title={VLG-Net: Video-Language Graph Matching Network for Video Grounding},
  author={Soldan, Mattia and Xu, Mengmeng and Qu, Sisi and Tegner, Jesper and Ghanem, Bernard},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3224--3234},
  year={2021}
}