MEND: Model Editing Networks using Gradient Decomposition

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Valid experiments include fc (FEVER fact checking) and qa (zsRE question-answering). Splits and rephrases for both come from De Cao et al. Check config/model for options for editable models. Note that not every model works with every experiment: GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc.

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training uses Natural Questions data (not zsRE data) to compute drawdown in the qa experiment, and OpenWebText data in the gen experiment. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, this behavior should be disabled with data.wiki_webtext=False or data.zsre_nq=False, respectively.
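The compatibility rules and locality overrides described above can be sketched as a small command builder. This is only an illustrative helper, not part of the MEND codebase: the function name build_command, the family labels ("gpt", "seq2seq", "bert"), and the lookup table are assumptions made for this example. Only the alg names, experiment names, and data.* flags come from this README.

```python
# Hypothetical helper (not part of the MEND codebase): builds the launch
# command described above and rejects incompatible model/experiment pairs.

# Compatibility rules as stated in this README.
EXPERIMENT_FOR_FAMILY = {
    "gpt": "gen",      # GPT-style models only work with gen
    "seq2seq": "qa",   # seq2seq models only work with qa
    "bert": "fc",      # BERT only works with fc
}
VALID_ALGS = {"mend", "efk", "enn"}


def build_command(alg, experiment, model, family, overrides=()):
    """Return the `python -m run` command line as a string.

    `family` is an assumed label ("gpt", "seq2seq", or "bert") supplied by
    the caller; the configs in config/model do not necessarily carry it.
    """
    if alg not in VALID_ALGS:
        raise ValueError(f"unknown alg: {alg}")
    expected = EXPERIMENT_FOR_FAMILY[family]
    if experiment != expected:
        raise ValueError(
            f"{family} models only work with the {expected} experiment"
        )
    parts = ["python", "-m", "run", f"+alg={alg}",
             f"+experiment={experiment}", f"+model={model}", *overrides]
    return " ".join(parts)


# distilgpt2 was fine-tuned on wikitext, so disable OpenWebText locality data.
print(build_command("mend", "gen", "distilgpt2", "gpt",
                    overrides=["data.wiki_webtext=False"]))
```

The same pattern would apply to a BART-base qa run with data.zsre_nq=False passed as the override. The exact model config name to use there should be taken from config/model rather than guessed.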

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris},
    year={2021}
}

Owner

Eric Mitchell
