A collection of common regular expressions bundled with an easy to use interface.

Overview

CommonRegex

Find all times, dates, links, phone numbers, emails, ip addresses, prices, hex colors, and credit card numbers in a string. We did the hard work so you don't have to.

Pull requests welcome!

Installation

Install via pip

sudo pip install commonregex

or via setup.py

python setup.py install

Usage

>>> from commonregex import CommonRegex
>>> parsed_text = CommonRegex("""John, please get that article on www.linkedin.com to me by 5:00PM 
                               on Jan 9th 2012. 4:00 would be ideal, actually. If you have any 
                               questions, You can reach me at (519)-236-2723x341 or get in touch with
                               my associate at [email protected]""")
>>> parsed_text.times
['5:00PM', '4:00']
>>> parsed_text.dates
['Jan 9th 2012']
>>> parsed_text.links
['www.linkedin.com']
>>> parsed_text.phones
['(519)-236-2727']
>>> parsed_text.phones_with_exts
['(519)-236-2723x341']
>>> parsed_text.emails
['[email protected]']

Alternatively, you can generate a single CommonRegex instance and use it to parse multiple segments of text.

>>> parser = CommonRegex()
>>> parser.times("When are you free?  Do you want to meet up for coffee at 4:00?")
['4:00']

Finally, all regular expressions used are publicly exposed.

>>> from commonregex import email
>>> import re
>>> text = "...get in touch with my associate at [email protected]"
>>> re.sub(email, "[email protected]", text)
'...get in touch with my associate at [email protected]'
>>> from commonregex import time
>>> for m in time.finditer("Does 6:00 or 7:00 work better?"):
>>>     print m.start(), m.group()     
5 6:00 
13 7:00 

Please note that this module is currently English/US specific.

Supported Methods/Attributes

  • obj.dates, obj.dates()
  • obj.times, obj.times()
  • obj.phones, obj.phones()
  • obj.phones_with_exts, obj.phones_with_exts()
  • obj.links, obj.links()
  • obj.emails, obj.emails()
  • obj.ips, obj.ips()
  • obj.ipv6s, obj.ipv6s()
  • obj.prices, obj.prices()
  • obj.hex_colors, obj.hex_colors()
  • obj.credit_cards, obj.credit_cards()
  • obj.btc_addresses, obj.btc_addresses()
  • obj.street_addresses, obj.street_addresses()
  • obj.zip_codes, obj.zip_codes()
  • obj.po_boxes, obj.po_boxes()
  • obj.ssn_number, obj.ssn_number()

CommonRegex Ports:

CommonRegexRust

[CommonRegexJS] (https://github.com/talyssonoc/CommonRegexJS)

[CommonRegexScala] (https://github.com/everpeace/CommonRegexScala)

[CommonRegexJava] (https://github.com/talyssonoc/CommonRegexJava)

[CommonRegexCobra] (https://github.com/PurityLake/CommonRegex-Cobra)

[CommonRegexDart] (https://github.com/aufdemrand/CommonRegexDart)

[CommonRegexRuby] (https://github.com/talyssonoc/CommonRegexRuby)

[CommonRegexPHP] (https://github.com/james2doyle/CommonRegexPHP)

Analytics

Owner
Madison May
Machine Learning Architect at @IndicoDataSolutions
Madison May
PyCASCLib: CASC interface for Warcraft III

PyCASCLib CASC interface for Warcraft III. This repo provides bindings for JCASC: https://github.com/DrSuperGood/JCASC Installation Jdk is required fo

2 Jun 04, 2022
OWASP Foundation Web Respository

WWWGrep OWASP Foundation Web Respository Author: Mark Deen & Aditi Mohan Introduction WWWGrep is a rapid search “grepping” mechanism that examines HTM

OWASP 34 Jun 15, 2022
Painel de consulta

⚙ FullP 1.1 Instalação 💻 git clone https://github.com/gav1x/FullP.git cd FullP pip3 install -r requirements.txt python3 main.py Um pequeno

gav1x 26 Oct 11, 2022
Python plugin for Krita that assists with making python plugins for Krita

Krita-PythonPluginDeveloperTools Python plugin for Krita that assists with making python plugins for Krita Introducing Python Plugin developer Tools!

18 Dec 01, 2022
End-to-End text sumarization, QAs generation using flask.

Help-Me-Read A web application created with Flask + BootStrap + HuggingFace 🤗 to generate summary and question-answer from given input text. It uses

Ankush Kuwar 12 Nov 13, 2022
Programming in Bioinformatics, Block 3

Programming in Bioinformatics - Block 3 I. Setting up Environment and Running the Code Create the environment using the pibi_block3.yml file with the

2 Dec 10, 2021
rebalance is a simple Python 3.9+ library for rebalancing investment portfolios

rebalance rebalance is a simple Python 3.9+ library for rebalancing investment portfolios. It supports cash flow rebalancing with contributions and wi

Darik Harter 5 Feb 26, 2022
A feed generator. Currently supports generating RSS feeds from Google, Bing, and Yahoo news.

A feed generator. Currently supports generating RSS feeds from Google, Bing, and Yahoo news.

Josh Cardenzana 0 Dec 13, 2021
Datargsing is a data management and manipulation Python library

Datargsing What is It? Datargsing is a data management and manipulation Python library which is currently in deving Why this library is good? This Pyt

CHOSSY Lucas 10 Oct 24, 2022
Artificial intelligence based on 5-dimensional quantum selection

Deep Thought An artificial intelligence based on 5-dimensional quantum selection. Algorithm The payload Make an random bit array (e.g. 1101...) Conver

Larry Holst 3 Dec 14, 2022
Unofficial Valorant documentation and tools for third party developers

Valorant Third Party Toolkit This repository contains unofficial Valorant documentation and tools for third party developers. Our goal is to centraliz

Noah Kim 20 Dec 21, 2022
A Python wrapper around Bacting

pybacting Python wrapper around bacting. Usage Based on the example from the bacting page, you can do: from pybacting import cdk print(cdk.fromSMILES

Charles Tapley Hoyt 5 Jan 03, 2022
A pypi package details search python module

A pypi package details search python module

Fayas Noushad 5 Nov 30, 2021
Watcher for systemdrun user scopes

Systemctl Memory Watcher Animated watcher for systemdrun user scopes. Usage Launch some process in your GNU-Linux or compatible OS with systemd-run co

Antonio Vanegas 2 Jan 20, 2022
Transform your boring distro into a hacking powerhouse.

Pentizer Transform your boring distro into a hacking powerhouse. Pentizer is a personal project that imports Kali and Parrot repositories in any Debia

Michail Tsimpliarakis 2 Nov 05, 2021
Automated Content Feed Curator

Gathers posts from content feeds, filters, formats, delivers to you.

Alper S. Soylu 2 Jan 22, 2022
Bookmarkarchiver - Python script that archives all of your bookmarks on the Internet Archive

bookmarkarchiver Python script that archives all of your bookmarks on the Internet Archive. Supports all major browsers. bookmarkarchiver uses the off

Anthony Chen 3 Oct 09, 2022
Tools, guides, and resources for blockchain analysts to interface with data on the Ergo platform.

Ergo Intelligence Objective Provide a suite of easy-to-use toolkits, guides, and resources for blockchain analysts and data scientists to quickly unde

Chris 5 Mar 15, 2022
Lagrange Interpolation Method-Python

Lagrange Interpolation Method-Python The Lagrange interpolation formula is a way to find a polynomial, called Lagrange polynomial, that takes on certa

Motahare Soltani 2 Jul 05, 2022
Convert long numbers into a human-readable format in Python

Convert long numbers into a human-readable format in Python

Alex Zaitsev 73 Dec 28, 2022