Learning to compose soft prompts for compositional zero-shot learning.

Overview

Compositional Soft Prompting (CSP)

Compositional soft prompting (CSP), a parameter-efficient learning technique to improve the zero-shot compositionality of large-scale pretrained vision-language models (VLMs) without the overhead of fine-tuning the entire model.

Reference Paper: Learning to Compose Soft Prompts for Compositional Zero-Shot Learning

alt text

If you find CSP helpful, please cite our paper:

@article{csp2022,
  author = {Nayak, Nihal V. and Yu, Peilin and Bach, Stephen H.},
  title = {Learning to Compose Soft Prompts for Compositional Zero-Shot Learning},
  volume = {arXiv:2204.03574 [cs.LG]},
  year = {2022},
}

Setup

conda create --name clip python=3.7
conda activate clip
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
pip3 install ftfy regex tqdm scipy pandas
pip3 install git+https://github.com/openai/CLIP.git

Alternatively, you can use pip install -r requirements.txt to install all the dependencies.

Download Dataset

We experiment with three datasets: MIT-States, UT-Zappos, and C-GQA.

sh download_data.sh

If you already have setup the datasets, you can use symlink and ensure the following paths exist: data/<dataset> where <datasets> = {'mit-states', 'ut-zappos', 'cgqa'}.

Training

python -u train.py \
  --dataset mit-states \
  --model ViT-L/14 \
  --experiment_name csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-05 \
  --attr_dropout 0.3 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --gradient_accumulation_steps 2 \
  --context_length 8 \
  --save_path data/model/mit-states/sample_model \
  --save_every_n 1

You can replace --dataset with {mit-states, ut-zappos, cgqa}. The best hyperparameters are included in the paper.

Evaluation

We evaluate our models in two settings: closed-world and open-world.

Closed-World Evaluation

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 16 \
  --experiment_name csp

Open-World Evaluation

For our open-world evaluation, we compute the feasbility calibration and then evaluate on the dataset.

Feasibility Calibration

We use GloVe embeddings to compute the similarities between objects and attributes. Download the GloVe embeddings in the data directory:

cd data
wget https://nlp.stanford.edu/data/glove.6B.zip

Move glove.6B.300d.txt into data/glove.6B.300d.txt.

To compute feasibility calibration for each dataset, run the following command:

python -u datasets/feasibility.py --dataset mit-states

The feasibility similarities are saved at data/feasibility_<dataset>.pt.

Evaluation

The open-world evaluation with the thresholds (feasibility calibration).

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_5.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name czsl \
  --threshold <threshold> \
  --open_world

If <threshold> is None, then the model picks the best threshold on the validation set. We use the following thresholds:

Dataset Threshold
mit-states 0.4069159426
ut-zappos 0.5299109123
cgqa 0.49937106273612186

Note: We use 256GB of cpu memory to evaluate cgqa.

Generalization to Higher-Order Compositions

Evaluate the trained CSP vocabulary on the new AAO-MIT-States dataset.

python aao/evaluate_att_att_obj.py \
  --experiment_name csp \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt

We thank Andrew Delworth and Elise Carman for helping us annotate this dataset.

Generalization to Mixed Pretrained and Fine-Tuned Vocabulary

Ablation experiment to train and evaluate CSP with reduced fine-tuned vocabulary. We run experiment on the ut-zappos dataset.

Training

python -u mix/mix_train.py \
  --dataset ut-zappos \
  --model ViT-L/14 \
  --experiment_name mix_csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-04 \
  --attr_dropout 0.2 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --context_length 8 \
  --save_path data/model/ut-zappos/mix_train_model_0.25 \
  --save_every_n 5 \
  --attr_keep_ratio 0.25 \
  --gradient_accumulation_steps 2

We change the --attr_keep_ratio to {0.25, 0.50, 0.75}.

Evaluation

python -u mix/evaluate_mix_train.py \
  --dataset ut-zappos \
  --soft_embeddings data/model/ut-zappos/mix_train_model_0.25/soft_embeddings.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name csp

Credits

The project uses openly available model, code, and datasets. Please see the credits.

Owner
Bats Research
Bats Research
A local Socks5 server written in python, used for integrating Multi-hop

proxy-Zata proxy-Zata v1.0 This is a local Socks5 server written in python, used for integrating Multi-hop (Socks4/Socks5/HTTP) forward proxy then pro

4 Feb 24, 2022
WebScan is a web vulnerability Scanning tool, which scans sites for SQL injection and XSS vulnerabilities

WebScan is a web vulnerability Scanning tool, which scans sites for SQL injection and XSS vulnerabilities Which is a great tool for web pentesters. Coded in python3, CLI. WebScan is capable of scanni

AnonyminHack5 12 Dec 02, 2022
On the 11/11/21 the apache 2.4.49-2.4.50 remote command execution POC has been published online and this is a loader so that you can mass exploit servers using this.

ApacheRCE ApacheRCE is a small little python script that will allow you to input the apache version 2.4.49-2.4.50 and then input a list of ip addresse

3 Dec 04, 2022
Script Crack Facebook Yang Kaya Akan Teh Hijau 🚶‍♂

r-mbf Script Crack Facebook 🚶‍♂ Bukti Recode [•] Install Script $ pkg update && pkg upgrade $ pkg install python $ pkg install git $ pip install requ

O'Hayo Smrn 3 Apr 02, 2022
⛤Keylogger Generator for Windows written in Python⛤

⛤Keylogger Generator for Windows written in Python⛤

FZGbzuw412 33 Nov 24, 2022
Shell hunter for AF

AF-ShellHunter AF-ShellHunter: Auto shell lookup AF-ShellHunter its a script designed to automate the search of WebShell's in AF Team How to pip3 ins

Eduardo 34 May 13, 2022
Encrypted Python Password Manager

PyPassKeep Encrypted Python Password Manager About PyPassKeep (PPK for short) is an encrypted python password manager used to secure your passwords fr

KrisIsHere 1 Nov 17, 2021
This program will brute force any Instagram account you send it its way given a list of proxies.

Instagram Bruter This program will brute force any Instagram account you send it its way given a list of proxies. NOTICE I'm no longer maintaining thi

1 Nov 15, 2021
com_media allowed paths that are not intended for image uploads to RCE

CVE-2021-23132 com_media allowed paths that are not intended for image uploads to RCE. CVE-2020-24597 Directory traversal in com_media to RCE Two CVEs

KIEN HOANG 67 Nov 09, 2022
Use scrapli to retrieve security zone information from a Juniper SRX firewall

Get Security Zones with Scrapli Overview This example will show how to retrieve security zone information on Juniper's SRX firewalls. In addition to t

Calvin Remsburg 2 Jun 19, 2022
PassLock is a medium-security password manager that encrypts passwords using Advanced Encryption Standards (AES)

A medium security python password manager that encrypt passwords using Advanced Encryption Standard (AES) PassLock is a password manager and password

Akshay Vs 44 Nov 18, 2022
vulnerable APIs

vulnerable-apis vulnerable APIs inspired by https://github.com/mattvaldes/vulnerable-api Setup Docker If, Out of the box docker pull kmmanoj/vulnerabl

9 Jun 01, 2022
An intranet tool for easily intranet pentesting

IntarKnife v1.0 a tool can be used in intarnet for easily pentesting moudle hash spray U can use this tool to spray hash on a webshell IntraKnife.exe

4 Nov 24, 2021
This repository consists of the python scripts for execution and automation of vivid tasks.

Scripting.py is a repository being maintained to keep log of the python scripts that I create for automating and executing some of my boring manual task.

Prakriti Regmi 1 Feb 07, 2022
Tinyman exploit finder - Tinyman exploit finder for python

tinyman_exploit_finder There was a big tinyman exploit. You can read about it he

fish.exe 9 Dec 27, 2022
BETA: Layla - recon tool for bug bounty

WELCOME TO LAYLA Layla is a python script that automatically performs recon on a

Matheus Faria 68 Jan 04, 2023
Implementation of an attack on a tropical algebra discrete logarithm based protocol

Implementation of an attack on a tropical algebra discrete logarithm based protocol This code implements the attack detailed in the paper: On the trop

3 Dec 30, 2021
A script based on sqlmap that uses sql injection vulnerabilities to traverse the existence of a file

A script based on sqlmap that uses sql injection vulnerabilities to traverse the existence o

2 Nov 09, 2022
Discord Region Swapping Exploit (VC Overload)

Discord-VC-Exploit Discord Region Swapping Exploit (VC Overload) aka VC Crasher How does this work? Discord has multiple servers that lets people arou

Rainn 11 Sep 10, 2022
Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.

Dlint Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure. The most important thing I have done as a progra

Dlint 127 Dec 27, 2022