Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding

Overview

Boostcamp AI Tech 3rd : Basic Paper Reading w.r.t Embedding

TL;DR

We plan to run a study group on the foundational papers that form the main line of word/sentence embedding research from 1992 to 2018.

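What to cover in a paper presentation

  • What problem are the authors trying to solve?
  • How did they try to solve it, and what are the advantages? (If time allows: what methods existed before, and what were their shortcomings?)
  • The intuition behind the method (without math)
  • A mathematical understanding of the method
  • The data, metrics, and performance comparisons used to show that the method works
  • Discussion points: anything that seems lacking or unclear, or anything you liked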

Reading list

Dates | Paper (author) | Year | Presenter | File upload | Code explained
- | Class-Based n-gram Models of Natural Language (Peter F. Brown et al.) | 1992 | 소연 | explanation | -
- | Efficient Estimation of Word Representations in Vector Space (Tomas Mikolov et al.) | 2013 | 동진 | presentation | -
- | Distributed Representations of Words and Phrases and their Compositionality (Tomas Mikolov et al.) | 2013 | 나연 | explanation | skip-gram, CBOW
- | Distributed Representations of Sentences and Documents (Quoc V. Le and Tomas Mikolov) | 2014 | 기원 | explanation | Doc2Vec
- | GloVe: Global Vectors for Word Representation (Jeffrey Pennington et al.) | 2014 | 수정 | explanation | -
- | Skip-Thought Vectors (Ryan Kiros et al.) | 2015 | 기범 | explanation | -
- | Enriching Word Vectors with Subword Information (Piotr Bojanowski et al.) | 2017 | 은기 | explanation | -
- | Universal Sentence Encoder (Daniel Cer et al.) | 2018 | - | - | -

Issues & additional study materials

Dates | Topic | Presenter | File upload
04/14 | Using word2vec with gensim | 현지 | link
04/14 | Negative sampling & subsampling | 나경 | link
04/14 | Hierarchical softmax | 소연 | link
04/14 | Noise contrastive estimation (NCE) | 수정 | link
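
As a companion to the negative sampling & subsampling topic above, here is a minimal sketch of the two formulas as stated in "Distributed Representations of Words and Phrases and their Compositionality": a word with relative frequency f is discarded with probability 1 - sqrt(t / f), and negative samples are drawn from the unigram distribution raised to the 3/4 power. The word counts, the function names, and the toy threshold t = 0.05 are assumptions made for illustration (the paper suggests t around 1e-5 for large corpora), and library implementations may use slightly different variants.

```python
import math
from collections import Counter

# Hypothetical toy word counts, chosen only for illustration.
counts = Counter({"the": 50, "cat": 5, "embedding": 1})
total = sum(counts.values())


def discard_prob(count, total, t):
    """Subsampling: probability of dropping one occurrence of a word.

    Uses P(w) = 1 - sqrt(t / f(w)) from the paper, clipped at 0 so that
    words rarer than the threshold t are never discarded.
    """
    f = count / total
    return max(0.0, 1.0 - math.sqrt(t / f))


def negative_sampling_dist(counts, power=0.75):
    """Noise distribution: unigram counts raised to `power`, then normalized."""
    weights = {w: c ** power for w, c in counts.items()}
    z = sum(weights.values())
    return {w: v / z for w, v in weights.items()}


# Toy threshold (assumption); far larger than the ~1e-5 used on real corpora.
T = 0.05
discard = {w: discard_prob(c, total, T) for w, c in counts.items()}
noise = negative_sampling_dist(counts)

for w in counts:
    print(f"{w:10s} raw={counts[w] / total:.3f} "
          f"discard={discard[w]:.3f} noise={noise[w]:.3f}")
```

Frequent words get a high discard probability while rare words are always kept, and the 3/4 power shifts some sampling mass from frequent words toward rare ones.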

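Study rules

  • Study time: Thursdays at 9:30 PM
  • Study load: one paper per week (let's finish before the Product Serving curriculum starts so we can focus on it)
    • Everyone reads the paper and posts at least one question as a GitHub issue (and if you know the answer to someone else's question, please reply!)
  • Presenter: chosen at random on the day. Presentation materials can be in any format.
    • Paper presentation: after presenting, the presenter creates a folder in this repo and uploads their notes. Anyone else who wants to share material can post it in the issues or add a link under File upload in the same way (optional).
    • Code walkthrough: the presenter of a paper gives a code walkthrough the following week (e.g., which library makes the method easy to use and how to use it, or, for a complex algorithm, how it is implemented in code, whichever the presenter prefers).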

Participants

강나경, 김소연, 김현지, 박기범, 임동진, 임수정, 정기원, 한나연, 김은기

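Reference links

A GitHub repo whose paper-summary template and issue-based discussion we liked (reference)

A GitHub repo of NLP must-read papers, which the reading list draws on (reference)

A Korean NLP review group (reference); some of the papers overlap with its season 1 beginners track!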

Owner
Soyeon Kim