Mastering Transformers, published by Packt

Last update: Jan 01, 2023

Mastering Transformers

This is the code repository for Mastering Transformers, published by Packt.

Build state-of-the-art models from scratch with advanced natural language processing techniques

What is this book about?

Transformer-based language models have dominated natural language processing (NLP) studies and have now become a new paradigm. With this book, you'll learn how to build various transformer-based NLP applications using the Python Transformers library.

This book covers the following exciting features:

Explore state-of-the-art NLP solutions with the Transformers library
Train a language model in any language with any transformer architecture
Fine-tune a pre-trained language model to perform several downstream tasks
Select the right framework for the training, evaluation, and production of an end-to-end solution
Get hands-on experience in using TensorBoard and Weights & Biases
Visualize the internal representation of transformer models for interpretability

If you feel this book is for you, get your copy today!

Instructions and Navigations

All of the code is organized into folders. For example, Chapter03.

The code will look like the following:

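import pandas as pd

# Load the IMDB reviews dataset (expects a "review" column).
imdb_df = pd.read_csv("IMDB Dataset.csv")
# Render all reviews as one string, without the index column.
reviews = imdb_df.review.to_string(index=False)
# Save the text as a plain-text corpus file.
with open("corpus.txt", "w", encoding="utf-8") as f:
    f.write(reviews)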
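
The snippet above builds a text corpus from a CSV file. As a rough alternative sketch, the same step can be done with only the standard library, writing one review per line. The helper name `write_corpus` and its parameters are hypothetical and do not come from the book; the `review` column name is taken from the snippet above.

```python
import csv


def write_corpus(csv_path, corpus_path, column="review"):
    """Write one value of `column` per line to `corpus_path`; return the row count."""
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(corpus_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # Collapse internal whitespace and newlines so each review stays on one line.
            text = " ".join(row[column].split())
            dst.write(text + "\n")
            count += 1
    return count
```

Calling `write_corpus("IMDB Dataset.csv", "corpus.txt")` would then produce a file with one review per line, which is a common input layout for line-oriented text tooling.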

Who this book is for: deep learning researchers, hands-on NLP practitioners, and ML/NLP educators and students who want to start their journey with Transformers. Beginner-level machine learning knowledge and a good command of Python will help you get the most out of this book.

With the software and hardware listed below, you can run all of the code files in the book (Chapters 1-11).

Software and Hardware List

Chapter	Software required	OS required
1-11	Python 3.6x, Transformers, Google Colaboratory, Jupyter Notebook, TensorFlow	Windows, Mac OS X, and Linux (Any)
10	Docker, Locust.io	Windows, Mac OS X, and Linux (Any)
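
As a quick way to check a local environment against the list above, the following sketch reports the running Python version and any Python packages that cannot be found. The helper `missing_modules` is hypothetical, and the import names `transformers` and `tensorflow` are assumed to correspond to the Transformers and TensorFlow entries in the table; Docker and Locust.io are not covered by it.

```python
import sys
from importlib.util import find_spec

# Import names assumed for the Python packages named in the table.
REQUIRED_MODULES = ["transformers", "tensorflow"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

`find_spec` only locates a package without importing it, so the check stays fast even when a heavy library such as TensorFlow is installed.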

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Code in Action

Click on the following link to see the Code in Action:

https://bit.ly/3i4vFzJ

Get to Know the Authors

Savaş Yıldırım graduated from the Istanbul Technical University Department of Computer Engineering and holds a Ph.D. in Natural Language Processing (NLP). He is currently an associate professor at Istanbul Bilgi University, Turkey, and a visiting researcher at Ryerson University, Canada. He is a proactive lecturer and researcher with more than 20 years of experience teaching courses on machine learning, deep learning, and NLP.

Meysam Asgari-Chenaghlu is an AI manager at Carbon Consulting and a Ph.D. candidate at the University of Tabriz. He has been a consultant for Turkey's leading telecommunication and banking companies, and has worked on various projects, including natural language understanding and semantic search.
