Code for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators"

Last update: Nov 20, 2022

Overview

Query Variation Generators

This repository contains the code and annotation data for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators".

Setup

Install the requirements with:

pip install -r requirements.txt

Steps to reproduce the results

First, we need to generate weak supervision for the desired test sets. We can do that with examples/generate_weak_supervision.py. In the paper we test on TREC-DL ('msmarco-passage/trec-dl-2019/judged') and ANTIQUE ('antique/train/split200-valid'), but any dataset from ir_datasets (https://ir-datasets.com/index.html) can be used here (as TASK).

python ${REPO_DIR}/examples/generate_weak_supervision.py \
    --task $TASK \
    --output_dir $OUT_DIR

This generates one query variation per method for each original query. We then manually annotated the generated query variations in order to keep only the valid ones for analysis. For that we use analyze_weak_supervision.py (prepares the data for manual annotation) and analyze_auto_query_generation_labeling.py (combines the automatic labels and the annotations).
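To make "one query variation per method" concrete, here is a minimal sketch with two toy methods. The function names and the methods themselves are hypothetical illustrations; they are not the generators implemented in disentangled_information_needs/transformations.

```python
import random


def swap_adjacent_chars(query, rng):
    """Hypothetical misspelling-style method: swap two neighbouring characters in one word."""
    words = query.split()
    candidates = [i for i, word in enumerate(words) if len(word) > 1]
    if not candidates:
        return query  # nothing to swap, return the query unchanged
    i = rng.choice(candidates)
    word = words[i]
    pos = rng.randrange(len(word) - 1)
    words[i] = word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]
    return " ".join(words)


def shuffle_word_order(query, rng):
    """Hypothetical ordering-style method: randomly permute the query words."""
    words = query.split()
    rng.shuffle(words)
    return " ".join(words)


# Hypothetical registry: one entry per variation method.
METHODS = {
    "swap_adjacent_chars": swap_adjacent_chars,
    "shuffle_word_order": shuffle_word_order,
}


def generate_variations(query, seed=0):
    """Return one variation of `query` for each registered method."""
    rng = random.Random(seed)
    return {name: method(query, rng) for name, method in METHODS.items()}


example_query = "what is information retrieval"
example_variations = generate_variations(example_query, seed=0)
for method_name, variation in example_variations.items():
    print(f"{method_name}: {variation}")
```

Automatic methods like these can return the query unchanged or produce a variation that no longer expresses the same information need, which is the kind of case the manual annotation step is meant to filter out.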

However, to reproduce the results, we can directly use the annotated query set to test the robustness of neural ranking models (RQ1):

python ${REPO_DIR}/disentangled_information_needs/evaluation/query_rewriting.py \
        --task 'irds:msmarco-passage/trec-dl-2019/judged' \
        --output_dir $OUT_DIR/ \
        --variations_file $OUT_DIR/$VARIATIONS_FILE_TREC_DL \
        --retrieval_model_name "BM25+KNRM" \
        --train_dataset "irds:msmarco-passage/train" \
        --max_iter $MAX_ITER

Here, --variations_file points directly to the annotated variations file, "$OUT_DIR/$VARIATIONS_FILE_TREC_DL". The same can be done to run rank fusion (RQ2) by replacing query_rewriting.py with rank_fusion.py.
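For readers unfamiliar with rank fusion, the sketch below shows one common technique, reciprocal rank fusion (RRF), applied to the rankings retrieved for an original query and its variations. This README does not state which fusion method rank_fusion.py uses, so treat RRF, the function name, and the constant k=60 as assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are then sorted by their summed score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Toy rankings: one for the original query and two for its variations.
rankings_per_query = [
    ["d1", "d2", "d3"],  # original query
    ["d2", "d1", "d4"],  # variation A
    ["d2", "d3", "d1"],  # variation B
]
fused_ranking = reciprocal_rank_fusion(rankings_per_query)
print(fused_ranking)
```

Documents ranked highly across several variations accumulate more score than documents ranked highly for only one of them, which is the intuition behind fusing the results of query variations.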

The scripts evaluate_weak_supervision.sh and evaluate_rank_fusion.sh run all models and datasets for both research questions. The first generates the main table of results (Table 4 in the paper), and the second generates the tables for the rank fusion experiments (only available in the arXiv version of the paper).
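As a rough illustration of what a robustness comparison can look like, the sketch below computes the relative change in effectiveness between an original query and one of its variations. It assumes nDCG@10 with linear gains as the metric, which this README does not specify; the helper names and the toy data are hypothetical and are not taken from the evaluation scripts.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a list of relevance gains, in rank order."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains, start=1))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k with linear gains; qrels maps doc id -> graded relevance."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def relative_change(original_score, variation_score):
    """Relative effectiveness change; negative values mean the variation hurt."""
    if original_score == 0:
        return 0.0
    return (variation_score - original_score) / original_score


# Toy relevance judgments and rankings for a single query.
toy_qrels = {"d1": 2, "d2": 1}
ranking_original = ["d1", "d2", "d3"]
ranking_variation = ["d3", "d2", "d1"]

ndcg_original = ndcg_at_k(ranking_original, toy_qrels)
ndcg_variation = ndcg_at_k(ranking_variation, toy_qrels)
effectiveness_change = relative_change(ndcg_original, ndcg_variation)
print(f"original={ndcg_original:.4f} variation={ndcg_variation:.4f} change={effectiveness_change:+.2%}")
```

Averaging such per-query changes over a test set gives one number per model and variation method, which is one possible way to summarize robustness.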

Modules and Folders

scripts: Contains most of the analysis scripts, as well as the commands to run entire experiments.
examples: Contains an example of how to generate query variations.
disentangled_information_needs/evaluation: Scripts to evaluate the robustness of models to query variations and to evaluate rank fusion of query variations.
disentangled_information_needs/transformations: Methods to generate query variations.

Owner

Gustavo Penha
