Minimal, super readable string pattern matching for python.

Overview

logo

simplematch

Minimal, super readable string pattern matching for python.

PyPI Version PyPI - License tests

import simplematch

simplematch.match("He* {planet}!", "Hello World!")
>>> {"planet": "World"}

simplematch.match("It* {temp:float}°C *", "It's -10.2°C outside!")
>>> {"temp": -10.2}

Installation

pip install simplematch

(Or just drop the simplematch.py file in your project.)

Syntax

simplematch has only two syntax elements:

  • wildcard *
  • capture group {name}

Capture groups can be named ({name}), unnamed ({}) and typed ({name:float}).

The following types are available:

  • int
  • float
  • email
  • url
  • ipv4
  • ipv6
  • bitcoin
  • ssn (social security number)
  • ccard (matches Visa, MasterCard, American Express, Diners Club, Discover, JCB)

For now, only named capture groups can be typed.

Then use one of these functions:

import simplematch

simplematch.match(pattern, string) # -> returns a `dict` on match, `None` otherwise.
simplematch.test(pattern, string)  # -> returns `True` on match, `False` otherwise.

Or use a Matcher object:

import simplematch as sm

matcher = sm.Matcher(pattern)

matcher.match(string) # -> returns a dict or None
matcher.test(string)  # -> returns True / False
matcher.regex         # -> shows the generated regex

Basic usage

import simplematch as sm

# extracting data
sm.match(
    pattern="Invoice_*_{year}_{month}_{day}.pdf",
    string="Invoice_RE2321_2021_01_15.pdf")
>>> {"year": "2021", "month": "01", "day": "15"}

# test match only
sm.test("ABC-{value:int}", "ABC-13")
>>> True

Typed matches

import simplematch as sm

matcher = sm.Matcher("{year:int}-{month:int}: {value:float}")

# extracting data
matcher.match("2021-01: -12.786")
>>> {"year": 2021, "month": 1, "value": -12.786}

# month is no integer -> no match and return `None`.
matcher.match("2021-AB: Hello")
>>> None

# no extraction, only test for match
matcher.test("1234-01: 123.123")
>>> True

# show generated regular expression
matcher.regex
>>> '^(?P<year>[+-]?[0-9]+)\\-(?P<month>[+-]?[0-9]+):\\ (?P<value>[+-]?(?:[0-9]*[.])?[0-9]+)$'

# show registered converters
matcher.converters
>>> {'year': <class 'int'>, 'month': <class 'int'>, 'value': <class 'float'>}

Register your own types

You can register your own types to be available for the {name:type} matching syntax with the register_type function.

simplematch.register_type(name, regex, converter=str)

  • name is the name to use in the matching syntax
  • regex is a regular expression to match your type
  • converter is a callable to convert a match (str by default)

Example

Register a smiley type to detect smileys (:), :(, :/) and getting their moods:

import simplematch as sm

def mood_convert(smiley):
    moods = {
        ":)": "good",
        ":(": "bad",
        ":/": "sceptic",
    }
    return moods.get(smiley, "unknown")

sm.register_type("smiley", r":[\)\(\/]", mood_convert)

sm.match("I'm feeling {mood:smiley} *", "I'm feeling :) today!")
>>> {"mood": "good"}

Background

simplematch aims to fill a gap between parsing with str.split() and regular expressions. It should be as simple as possible, fast and stable.

The simplematch syntax is transpiled to regular expressions under the hood, so matching performance should be just as good.

I hope you get some good use out of this!

Contributions

Contributions are welcome! Just submit a PR and maybe get in touch with me via email before big changes.

License

MIT

Owner
Thomas Feldmann
Thomas Feldmann
Imports an object based on a string import_string('package.module:function_name')() - Based on werkzeug.utils

DEPRECATED don't use it. Please do: import importlib foopath = 'src.apis.foo.Foo' module_name = '.'.join(foopath.split('.')[:-1]) # to get src.apis.f

Bruno Rocha Archived Projects 11 Nov 12, 2022
An async API wrapper for Dress To Impress written in Python.

dti.py An async API wrapper for Dress To Impress written in Python. Some notes: For the time being, there are no front-facing docs for this beyond doc

Steve C 1 Dec 14, 2022
Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.

🌍 Take back your privacy with Dot Browser, the privacy-conscious web browser that protects you from being tracked and monitored online.

Dot HQ 1k Jan 07, 2023
Web service which feeds Navitia with real-time disruptions

Chaos Chaos is the web service which can feed Navitia with real-time disruptions. It can work together with Kirin which can feed Navitia with real-tim

KISIO Digital 7 Jan 07, 2022
SuperMario - Python programming class ending assignment SuperMario, using pygame

SuperMario - Python programming class ending assignment SuperMario, using pygame

mars 2 Jan 04, 2022
Web UI for your scripts with execution management

Script-server is a Web UI for scripts. As an administrator, you add your existing scripts into Script server and other users would be ab

Iaroslav Shepilov 1.1k Jan 09, 2023
Replay Felica Exchange For Python

FelicaReplay Replay Felica Exchange Description Standalone Replay Module Usage Save FelicaRelay (=2.0) output to file, then python replay.py [FILE].

3 Jul 14, 2022
Generating rent availability info from Effort rent

Rent-info Generating rent availability info from Effort rent Pre-Installation Latest version of python Pip module json, os, requests, datetime, time i

Laixuan 1 Oct 20, 2021
Aggressor script that gets the latest commands from CobaltStrikes web site and creates an aggressor script based on tool options.

opsec-aggressor Aggressor script that gets the latest commands from CobaltStrikes opsec page and creates an aggressor script based on tool options. Gr

JP 10 Nov 26, 2022
Ked interpreter built with Lex, Yacc and Python

Ked Ked is the first programming language known to hail from The People's Republic of Cork. It was first discovered and partially described by Adam Ly

Eoin O'Brien 1 Feb 08, 2022
ARA Records Ansible and makes it easier to understand and troubleshoot.

ARA Records Ansible ARA Records Ansible and makes it easier to understand and troubleshoot. It's another recursive acronym. What it does Simple to ins

Community managed Ansible repositories 1.6k Dec 25, 2022
Minecraft Multi-Server Pinger Discord Embed

Minecraft Network Pinger Minecraft Multi-Server Pinger Discord Embed What does this bot do? It sends an embed and uses mcsrvstat API and checks if the

YungHub 2 Jan 05, 2022
Python library and cli util for https://www.zerochan.net/

Zerochan Library for Zerochan.net with pics parsing and downloader included! Features CLI utility for pics downloading from zerochan.net Library for c

kiriharu 10 Oct 11, 2022
Arabic to Roman Converter in Python

Arabic-to-Roman-Converter Made together with https://github.com/goltaraya . Arabic to Roman Converter in Python. -Instructions: 1 - Make sure you have

Pedro Lucas Tomazeti Fernandes 6 Oct 28, 2021
An universal linux port of deezer, supporting both Flatpak and AppImage

Deezer for linux This repo is an UNOFFICIAL linux port of the official windows-only Deezer app. Being based on the windows app, it allows downloading

Aurélien Hamy 154 Jan 06, 2023
The ROS package for Airbotics.

airbotics The ROS package for Airbotics: Developed for ROS 1 and Python 3.8. The package has not been officially released on ROS yet so manual install

Airbotics 19 Dec 25, 2022
Enjoyable scripting experience with Python

Enjoyable scripting experience with Python

8 Jun 08, 2022
The calculator on Python.

Calculator Contributors: Delitanast An official website. Information Hello! I am Damir. It`s my first Python project. I think you want see this. I imp

3 Mar 13, 2022
Gunakan Dengan Bijak!!

YMBF Made with ❤️ by ikiwzXD_ menu Results notice me: if you get cp results, save 3/7 days then log in. Install script on Termux $ pkg update && pkg u

Ikiwz 0 Jul 11, 2022
[x]it! support for working with todo and check list files in Sublime Text

[x]it! for Sublime Text This Sublime Package provides syntax-highlighting, shortcuts, and auto-completions for [x]it! files. Features Syntax highlight

Jan Heuermann 18 Sep 19, 2022