CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

Last update: Mar 10, 2022

CLIP-Indonesian

CLIP (Radford et al., 2021) is a multimodal model that connects images and text by jointly training a vision encoder and a text encoder to project images and their corresponding text into the same embedding space. The expected outcome is that the embedding of an image and the embedding of its matching text lie close to each other.
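To make the idea of a shared embedding space concrete, here is a minimal sketch of the symmetric contrastive loss described in the CLIP paper, written in plain Python for readability. It is not this repository's training code: the function names, the toy 2-D embeddings, and the fixed temperature of 0.07 (CLIP learns this value during training) are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def similarity_logits(image_embeds, text_embeds, temperature=0.07):
    """Return logits[i][j]: cosine similarity of image i and text j, scaled."""
    images = [l2_normalize(v) for v in image_embeds]
    texts = [l2_normalize(v) for v in text_embeds]
    return [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]


def clip_contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    """Symmetric cross-entropy where (image i, text i) is the positive pair."""
    logits = similarity_logits(image_embeds, text_embeds, temperature)
    n = len(logits)
    # Image-to-text: each row should put its probability mass on the diagonal.
    image_to_text = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image: the same computation on the transposed matrix.
    columns = [list(col) for col in zip(*logits)]
    text_to_image = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (image_to_text + text_to_image) / 2.0


# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
image_embeds = [[1.0, 0.1], [0.1, 1.0]]
aligned_texts = [[0.9, 0.2], [0.2, 0.9]]   # text i points the same way as image i
shuffled_texts = [[0.2, 0.9], [0.9, 0.2]]  # captions swapped between images

aligned_loss = clip_contrastive_loss(image_embeds, aligned_texts)
shuffled_loss = clip_contrastive_loss(image_embeds, shuffled_texts)
print(aligned_loss < shuffled_loss)
```

The loss is low when each image is most similar to its own caption and high when the captions are swapped, which is the pressure that pulls matching image and text embeddings together during training.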

This repository hosts the code for CLIP-Indonesian, which is a CLIP multimodal model trained on Indonesian data.

For the image encoder, we use ViT, specifically openai/clip-vit-base-patch32. For the text encoder, we experimented with two models: IndoBERT Base (indobenchmark/indobert-base-p2) and Indonesian RoBERTa Base (flax-community/indonesian-roberta-base).

Most of the CLIP script is based on HybridCLIP and clip-italian.

This is still a work in progress, so it may not give the best results (yet) :)

clip-indonesian was presented at PyCon ID 2021. You can view the slide deck here.

Dataset

More details about the dataset used can be found here.

Results

The results of the training can be accessed here.

Demo

References

Bianchi, F., Attanasio, G., Pisoni, R., Terragni, S., Sarti, G., & Lakshmi, S. (2021). Contrastive Language-Image Pre-training for the Italian Language. arXiv preprint arXiv:2108.08688.

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. ICML.

Wilie, B., Vincentio, K., Winata, G. I., Cahyawijaya, S., Li, X., Lim, Z. Y., ... & Purwarianti, A. (2020). IndoNLU: Benchmark and resources for evaluating Indonesian natural language understanding. arXiv preprint arXiv:2009.05387.

Hybrid CLIP by the HuggingFace team

Indonesian Roberta Base by Wilson Wongso, Steven Limcorn, Samsul Rahmadani, and Chew Kok Wah

Indonesian Translated Datasets by Samsul Rahmadani

Acknowledgment

All training was done on a TPUv3-8 VM sponsored by TPU Research Cloud.

Owner

Galuh
