Searches a document for hash tags. Support multiple natural languages. Works in various contexts.

Overview

ht-getter

Searches a document for hash tags. Supports multiple natural languages. Works in various contexts.

This package uses a non-regex approach and supports both halfwidth and fullwidth alphanumeric characters as well as various writing systems.

Installation

pip install ht-getter

Function

def get_hash_tags(source, mode = "strings")

Arguments:

source -> The source text to be searched. Must be passed as a str type.

mode -> Specifies the mode of the results. The default value is “strings”

mode = “strings” -> The results are returned as a list of strings

mode = “indices” -> The results are returned as a list of lists of the start and end indices of the hash tags.

Code Sample

from ht_getter.getter import get_hash_tags

source_text = '''This simple package helps find you find #hash_tags in various types of #documents#. It also works with other languages like #日本語 or #한국어.
It supports #fullwidth #alpha-numeric characters. You can get a #list of the #hash_tags or a list of their #indices in the #####source_text."
'''

hash_tags = get_hash_tags(source_text)
hash_tag_indices = get_hash_tags(source_text, mode = "indices")

print(hash_tags)
print(hash_tag_indices)

Things to Keep in Mind:

  1. This package can be used in various contexts. (Social media, news articles, etc.)
  2. This package looks for substrings that have the structure of a hash tag but does not check that the substring is a valid hash tag on any platform.
You might also like...
Soccerdata - Efficiently scrape soccer data from various sources
Soccerdata - Efficiently scrape soccer data from various sources

SoccerData is a collection of wrappers over soccer data from Club Elo, ESPN, FBr

Python Tool to Easily Generate Multiple Documents

Python Tool to Easily Generate Multiple Documents Running the script doesn't require internet Max Generation is set to 10k to avoid lagging/crashing R

Build documentation in multiple repos into one site.

mkdocs-multirepo-plugin Build documentation in multiple repos into one site. Setup Install plugin using pip: pip install git+https://github.com/jdoiro

Quick tutorial on orchest.io that shows how to build multiple deep learning models on your data with a single line of code using python
Quick tutorial on orchest.io that shows how to build multiple deep learning models on your data with a single line of code using python

Deep AutoViML Pipeline for orchest.io Quickstart Build Deep Learning models with a single line of code: deep_autoviml Deep AutoViML helps you build te

Type hints support for the Sphinx autodoc extension

sphinx-autodoc-typehints This extension allows you to use Python 3 annotations for documenting acceptable argument types and return value types of fun

AWS Tags As A Database is a Python library using AWS Tags as a Key-Value database.

AWS Tags As A Database is a Python library using AWS Tags as a Key-Value database. This database is completely free* 💸

Craxk is a SINGLE AND NON-REPLICABLE Hash that uses data from the hardware where it is executed to form a hash that can only be reproduced by a single machine.
Craxk is a SINGLE AND NON-REPLICABLE Hash that uses data from the hardware where it is executed to form a hash that can only be reproduced by a single machine.

What is Craxk ? Craxk is a UNIQUE AND NON-REPLICABLE Hash that uses data from the hardware where it is executed to form a hash that can only be reprod

Find target hash collisions for Apple's NeuralHash perceptual hash function.💣
Find target hash collisions for Apple's NeuralHash perceptual hash function.💣

neural-hash-collider Find target hash collisions for Apple's NeuralHash perceptual hash function. For example, starting from a picture of this cat, we

That Hash will name that hash type! Identify MD5, SHA256 and 300+ other hashes Comes with
That Hash will name that hash type! Identify MD5, SHA256 and 300+ other hashes Comes with

Call for translators! We're looking for translators to help translate this spec for everyone! Read this documentation in the following languages 한국어 中

A Pytorch implementation of
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte

A Pytorch implementation of
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte

Implementation of Natural Language Code Search in the project CodeBERT: A Pre-Trained Model for Programming and Natural Languages.

CodeBERT-Implementation In this repo we have replicated the paper CodeBERT: A Pre-Trained Model for Programming and Natural Languages. We are interest

document organizer with tags and full-text-search, in a simple and clean sqlite3 schema
document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.
Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation. Intel iHD GPU (iGPU) support. NVIDIA GPU (dGPU) support.

mtomo Multiple types of NN model optimization environments. It is possible to directly access the host PC GUI and the camera to verify the operation.

Transparent proxy server that works as a poor man's VPN. Forwards over ssh. Doesn't require admin. Works with Linux and MacOS. Supports DNS tunneling.

sshuttle: where transparent proxy meets VPN meets ssh As far as I know, sshuttle is the only program that solves the following common case: Your clien

A Regex based linter tool that works for any language and works exclusively with custom linting rules.

renag Documentation Available Here Short for Regex (re) Nag (like "one who complains"). Now also PEGs (Parsing Expression Grammars) compatible with py

Code for CVPR 2021 oral paper
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

ToxiChat Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Install depen

Releases(v0.0.1)
Owner
Rairye
NLP (mostly tokenization, normalization, and topics concerning CJK languages) Most projects currently available in JS and Python.
Rairye
Grokking the Object Oriented Design Interview

Grokking the Object Oriented Design Interview

Tusamma Sal Sabil 2.6k Jan 08, 2023
Docov - Light-weight, recursive docstring coverage analysis for python modules

docov Light-weight, recursive docstring coverage analysis for python modules. Ov

Richard D. Paul 3 Feb 04, 2022
Explorative Data Analysis Guidelines

Explorative Data Analysis Get data into a usable format! Find out if the following predictive modeling phase will be successful! Combine everything in

Florian Rohrer 18 Dec 26, 2022
EasyMultiClipboard - Python script written to handle more than 1 string in clipboard

EasyMultiClipboard - Python script written to handle more than 1 string in clipboard

WVlab 1 Jun 18, 2022
Crystal Smp plugin for show scoreboards

MCDR-CrystalScoreboards Crystal plugin for show scoreboards | Only 1.12 Usage !!s : Plugin help message !!s hide : Hide scoreboard !!s show : Show Sco

CristhianCd 3 Oct 12, 2021
Plotting and analysis tools for ARTIS simulations

Artistools Artistools is collection of plotting, analysis, and file format conversion tools for the ARTIS radiative transfer code. Installation First

ARTIS Monte Carlo Radiative Transfer 8 Nov 07, 2022
epub2sphinx is a tool to convert epub files to ReST for Sphinx

epub2sphinx epub2sphinx is a tool to convert epub files to ReST for Sphinx. It uses Pandoc for converting HTML data inside epub files into ReST. It cr

Nihaal 8 Dec 15, 2022
A curated list of python programming language blogs

Python Blogs A curated list of python programming language blogs Contribute Companies/Organization # A B C D E F G H I J K L M N O P Q R S T U V W X Y

Rizky D. Onto 48 Nov 15, 2022
MonsterManualPlus - An advanced monster manual for Tower of the Sorcerer.

Monster Manual + This is an advanced monster manual for Tower of the Sorcerer mods. Users can get a plenty of extra imformation for decision making wh

Yifan Zhou 1 Jan 01, 2022
Dynamic Resume Generator

Dynamic Resume Generator

Quinten Lisowe 15 May 19, 2022
Test utility for validating OpenAPI documentation

DRF OpenAPI Tester This is a test utility to validate DRF Test Responses against OpenAPI 2 and 3 schema. It has built-in support for: OpenAPI 2/3 yaml

snok 106 Jan 05, 2023
Exercism exercises in Python.

Exercism exercises in Python.

Exercism 1.3k Jan 04, 2023
Bring RGB to life in Neovim

Bring RGB to life in Neovim Change your RGB devices' color depending on Neovim's mode. Fast and asynchronous plugin to live your vim-life to the fulle

Antoine 40 Oct 27, 2022
Resource hub for Obsidian resources.

Obsidian Community Vault Welcome! This is an experimental vault that is maintained by the Obsidian community. For best results we recommend downloadin

Obsidian Community 320 Jan 02, 2023
Generates, filters, parses, and cleans data regarding the financial disclosures of judges in the American Judicial System

This repository contains code that gets data regarding financial disclosures from the Court Listener API main.py: contains driver code that interacts

Ali Rastegar 2 Aug 06, 2022
Lightweight, configurable Sphinx theme. Now the Sphinx default!

What is Alabaster? Alabaster is a visually (c)lean, responsive, configurable theme for the Sphinx documentation system. It is Python 2+3 compatible. I

Jeff Forcier 670 Dec 19, 2022
Spin-off Notice: the modules and functions used by our research notebooks have been refactored into another repository

Fecon235 - Notebooks for financial economics. Keywords: Jupyter notebook pandas Federal Reserve FRED Ferbus GDP CPI PCE inflation unemployment wage income debt Case-Shiller housing asset portfolio eq

Adriano 825 Dec 27, 2022
SqlAlchemy Flask-Restful Swagger Json:API OpenAPI

SAFRS: Python OpenAPI & JSON:API Framework Overview Installation JSON:API Interface Resource Objects Relationships Methods Custom Methods Class Method

Thomas Pollet 361 Nov 16, 2022
Pydantic model generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.

datamodel-code-generator This code generator creates pydantic model from an openapi file and others. Help See documentation for more details. Supporte

Koudai Aono 1.3k Dec 29, 2022
Literate-style documentation generator.

888888b. 888 Y88b 888 888 888 d88P 888 888 .d8888b .d8888b .d88b. 8888888P" 888 888 d88P" d88P" d88""88b 888 888 888

Pycco 808 Dec 27, 2022