This project was built for the FALLABOUT2021 event under SRMMIC. It deals with NLP poetry generation.

Last update: Sep 28, 2021

Overview

FALLABOUT-SRMMIC 21

POETRY-GENERATION

HINGLISH

DESCRIPTION

We have developed an NLP (natural language processing) model that automatically generates a poem from the initial prompt text given as input by the user.

Motivation

Most ML/DL models are judged by their training/validation accuracy and loss. An NLP text generation model is one kind of model that cannot be judged by accuracy or loss alone: irrespective of the accuracy, the generated text may or may not make sense. Sometimes the accuracy is very high, yet the model gives unsatisfactory results or ends up in a loop. Such a model can therefore only be evaluated by looking at the generated output over many trials and training runs.

Uses

Can be used for creative and fun purposes.
Can sometimes be used for reproducing or generating text for larger datasets.
Literary purposes such as understanding and analysing a particular poetic style.

What's unique?

Unlike many poetry generation projects, we also built a Hindi poetry text generation model.
We provide an analysis of LSTM layers and transformers, with an example for better understanding.

Built with

Streamlit for the frontend
TensorFlow Keras for Hindi poetry
aitextgen for English poetry

Deeper into the project

The English poetry generation is developed with the help of an open-source library called aitextgen. This project uses the well-known GPT-2 transformer, fine-tuned solely on Shakespeare's poems and sonnets. The Hindi poetry generation is built with TensorFlow Keras. The front end is handled by Streamlit.
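As a rough illustration of the English pipeline, the sketch below fine-tunes GPT-2 with aitextgen and then generates text from a prompt. It is not the project's actual training script: the corpus file name, the "124M" model size, and every hyperparameter are assumptions chosen for illustration. The aitextgen import is done inside the functions so the file can be loaded even where the library is not installed.

```python
def finetune_gpt2(corpus_path: str, output_dir: str = "trained_model",
                  num_steps: int = 2000):
    """Fine-tune the small GPT-2 checkpoint on a plain-text corpus.

    corpus_path is a hypothetical text file (for example, a file of sonnets);
    the step counts and learning rate below are illustrative, not tuned values.
    """
    from aitextgen import aitextgen  # lazy import: requires `pip install aitextgen`

    ai = aitextgen(tf_gpt2="124M")  # assumed model size; downloads the weights
    ai.train(
        corpus_path,
        output_dir=output_dir,    # the fine-tuned model is saved here
        line_by_line=False,       # treat the file as one continuous text
        num_steps=num_steps,
        generate_every=500,       # print a sample every 500 steps
        save_every=500,
        learning_rate=1e-3,
        batch_size=1,
    )
    return ai


def generate_poem(prompt: str, model_folder: str = "trained_model",
                  max_length: int = 120, temperature: float = 0.8) -> str:
    """Load a fine-tuned model from model_folder and continue the prompt."""
    from aitextgen import aitextgen

    ai = aitextgen(model_folder=model_folder)
    return ai.generate_one(prompt=prompt, max_length=max_length,
                           temperature=temperature)
```

With a corpus saved as, say, shakespeare.txt (a hypothetical name), one would call finetune_gpt2("shakespeare.txt") once and then generate_poem("Shall I compare thee") as often as needed. A lower temperature tends to give more repetitive output, which relates to the looping behaviour mentioned under Motivation.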

Here is an example of how aitextgen is fine-tuned. Here is an example of how to train your own model using TensorFlow Keras.
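The source does not describe how the Hindi model's training data is prepared, so the following is only a sketch of one common approach for LSTM next-word models: each line of a poem is split into growing prefixes, each prefix is paired with the word that follows it, and prefixes are left-padded to a common length. It is written in plain Python to show the idea that tokenizer and padding utilities in Keras automate. The function names and the sample lines are hypothetical.

```python
from typing import Dict, List, Tuple


def build_vocab(lines: List[str]) -> Dict[str, int]:
    """Map each distinct word to an integer id; id 0 is reserved for padding."""
    vocab: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def make_training_pairs(lines: List[str],
                        vocab: Dict[str, int]) -> Tuple[List[List[int]], List[int]]:
    """Turn each line into (prefix, next word) pairs with left-padded prefixes."""
    prefixes: List[List[int]] = []
    targets: List[int] = []
    for line in lines:
        ids = [vocab[word] for word in line.split()]
        # Every proper prefix of the line predicts the word that follows it.
        for end in range(1, len(ids)):
            prefixes.append(ids[:end])
            targets.append(ids[end])
    max_len = max((len(p) for p in prefixes), default=0)
    # Left-pad with 0 so the most recent words sit at the end of each row.
    padded = [[0] * (max_len - len(p)) + p for p in prefixes]
    return padded, targets


if __name__ == "__main__":
    # Illustrative Hinglish lines, not taken from the project's dataset.
    sample = ["dil ki baat suno", "dil se dil mile"]
    vocab = build_vocab(sample)
    inputs, labels = make_training_pairs(sample, vocab)
    for row, label in zip(inputs, labels):
        print(row, "->", label)
```

The padded rows would feed an embedding layer followed by LSTM layers, and the targets would be the classes of a softmax output. At generation time the same vocabulary has to be reused, which is consistent with the tokenizer being shipped alongside the trained Hindi model.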

A peek into our project

Installation

Download the app.py file, then download the model from this link. Set the trained_model folder path to the location of your downloaded model. You also have to download trained_model_hindi from this link and specify its path in the same way. The trained_model_hindi folder contains the trained model, the tokenizer, and related files. Similarly, the trained_model folder for English contains the model and uses the default built-in GPT-2 transformer. Finally, run streamlit run app.py in your terminal and enjoy the app.
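Before launching the app, it can help to confirm that both model folders are in place. The sketch below is a small optional pre-flight check, not part of the project. It assumes the trained_model and trained_model_hindi folders sit in the directory you run it from; adjust the base path if your app.py points elsewhere. It only checks that each folder exists and is not empty, since the exact files inside are not listed here.

```python
from pathlib import Path
from typing import Iterable, List

# Folder names taken from the installation steps above.
MODEL_DIRS = ("trained_model", "trained_model_hindi")


def missing_model_dirs(base: Path, names: Iterable[str] = MODEL_DIRS) -> List[str]:
    """Return the names of model folders that are absent or empty under base."""
    missing: List[str] = []
    for name in names:
        folder = base / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_model_dirs(Path.cwd())
    if problems:
        print("Download these folders first:", ", ".join(problems))
    else:
        print("Models found. Start the app with: streamlit run app.py")
```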

This is how your code should look when running locally.

Future works

Planning to include a translator to switch easily between languages.
Introduce more poet-based models in many languages.

Authors

Paras Rawat
Daketi Yatin

