This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines".

Overview

Wizard of Search Engine: Access to Information Through Conversations with Search Engines

by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen, Zhaochun Ren and Maarten de Rijke

@inproceedings{ren2021wizard,
  title={Wizard of Search Engine: Access to Information Through Conversations with Search Engines},
  author={Ren, Pengjie and Liu, Zhongkun and Song, Xiaomeng and Tian, Hongtao and Chen, Zhumin and Ren, Zhaochun and de Rijke, Maarten},
  booktitle={Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2021}
}

Paper summary

Figure: Task pipeline for conversational information seeking (CIS)
Figure: Model pipeline for conversational information seeking (CIS)

In this work, we facilitate research on conversational information seeking (CIS) from three angles: (1) We formulate a pipeline for CIS with six sub-tasks: intent detection, keyphrase extraction, action prediction, query selection, passage selection, and response generation. (2) We release a benchmark dataset, called Wizard of Search Engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS. (3) We design a neural architecture that can be trained and evaluated on the six sub-tasks both jointly and separately, and devise a pre-train/fine-tune learning scheme that reduces the amount of WISE data required by making full use of other available data.

Running experiments

Requirements

This code is written in PyTorch. Any version later than 1.6 is expected to work with the provided code. Please refer to the official website for an installation guide.
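A quick way to confirm that an installed PyTorch meets this requirement is sketched below. It is only an illustration: the helper names are hypothetical, and it reads "later than 1.6" strictly, as a major.minor version above 1.6, which is an assumption about the intended meaning.

```python
import re

# Assumption: "later than 1.6" is read as major.minor strictly greater than 1.6.
MIN_EXCLUSIVE = (1, 6)


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '1.7.1+cu110'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version):
    """Return True if the version is later than the 1.6 series."""
    return parse_major_minor(version) > MIN_EXCLUSIVE


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print(torch.__version__, "supported:", is_supported(torch.__version__))
```

The version string is parsed by hand so that the check itself has no dependencies beyond the standard library.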

We recommend using conda to install the requirements. If you haven't installed conda yet, you can find instructions here. The steps for installing the requirements are:

  • Create a new environment

    conda create -n WISE

    Use a Python version later than 3.6 in the environment, for example by appending python=3.7 to the command above.

  • Activate the environment

    conda activate WISE
    
  • Install the requirements within the environment via pip:

    pip install -r requirements.txt
    

Datasets

We use the WebQA, DuReader, KdConv and DuConv datasets for pretraining. You can get them from the provided links and put them in the corresponding folders in ./data/. For example, the WebQA dataset goes in ./data/WebQA, the DuReader dataset in ./data/Dureader, and so on. We use the WISE dataset to fine-tune the model; it is available in ./data/WISE. Details about the WISE dataset can be found here.
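Before preprocessing, it can help to check that the folders are in place. The sketch below is a hypothetical helper, not part of the repository. The folder names for WebQA, Dureader and WISE come from the text above; the names for KdConv and DuConv are assumed to follow the same pattern.

```python
from pathlib import Path

# WebQA, Dureader and WISE are named in the README text;
# KdConv and DuConv are assumed to follow the same naming pattern.
EXPECTED_FOLDERS = ["WebQA", "Dureader", "KdConv", "DuConv", "WISE"]


def missing_dataset_folders(data_root="./data"):
    """Return the expected dataset folders that are absent or empty."""
    root = Path(data_root)
    missing = []
    for name in EXPECTED_FOLDERS:
        folder = root / name
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_dataset_folders()
    print("Missing or empty:", ", ".join(missing) if missing else "none")
```

The check only looks at folder presence; it does not validate the files inside each dataset.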

Training

  • Run the following command to convert the pretraining datasets into the required format:

    python ./Run.py --mode='data'

  • Then run the following commands sequentially:

    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='pretrain'
    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='finetune'

Note that before fine-tuning you should select the appropriate pretrained models from the folder ./output/pretrained and place them in ./output/pretrained_ready, a folder you need to create yourself. The hyperparameters are set to the default values used in our experiments. For an overview of all hyperparameters, please refer to ./Run.py.
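The staging step between pretraining and fine-tuning could be scripted roughly as follows. This is a sketch under assumptions: the helper is hypothetical, it assumes each pretrained model is stored as a single file, and the checkpoint file names have to be supplied by you, since they are not specified here.

```python
import shutil
from pathlib import Path


def stage_checkpoints(names, src="./output/pretrained", dst="./output/pretrained_ready"):
    """Copy the selected checkpoint files from src into dst, creating dst if needed."""
    src_dir, dst_dir = Path(src), Path(dst)
    dst_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in names:
        source = src_dir / name
        # Assumption: every checkpoint is a regular file, not a directory.
        if not source.is_file():
            raise FileNotFoundError(f"no such checkpoint: {source}")
        staged.append(Path(shutil.copy2(source, dst_dir / name)))
    return staged
```

Copying rather than moving keeps the full set of pretrained checkpoints available in ./output/pretrained if you later want to fine-tune from a different one.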

Evaluating

  • Run the following commands:

    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='infer-valid'
    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='eval-valid'
    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='infer-test'
    python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='eval-test'
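If you prefer to drive the four evaluation commands from one script, a minimal wrapper might look like the sketch below. The helper functions are hypothetical. The launcher module, the --nproc_per_node flag and the --mode values are taken from the commands above.

```python
import subprocess
import sys

# Modes taken from the evaluation commands above, in the order listed.
EVAL_MODES = ["infer-valid", "eval-valid", "infer-test", "eval-test"]


def build_command(mode, nproc_per_node=4, script="./Run.py"):
    """Build the launcher command for one Run.py mode as an argument list."""
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        script, f"--mode={mode}",
    ]


def run_modes(modes, nproc_per_node=4, dry_run=False):
    """Run the modes in order; check=True stops at the first failing command."""
    for mode in modes:
        command = build_command(mode, nproc_per_node)
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # dry_run=True only prints the commands; set it to False to execute them.
    run_modes(EVAL_MODES, dry_run=True)
```

Passing the command as an argument list avoids shell quoting, which is why the mode values appear without the single quotes used on the command line.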