PostQF is a user-friendly Postfix queue data filter which operates on data produced by postqueue -j.

Related tags

Data Analysispostqf
Overview

PostQF

Copyright © 2022 Ralph Seichter

PostQF is a user-friendly Postfix queue data filter which operates on data produced by postqueue -j. See the manual page's subsection titled "JSON object format" for details. PostQF offers convenient features for analysis and and cleanup of Postfix mail queues.

I have used the all-purpose JSON manipulation utility "jq" before, but found it inconvenient for everyday Postfix administration tasks. "jq" offers great flexibility and handles all sorts of JSON input, but it comes at the cost of complexity. PostQF is an alternative specifically tailored for easier access to Postfix queues.

To facilitate the use of Unix-like pipelines, PostQF usually reads from stdin and writes to stdout. Using command line arguments, you can override this behaviour and define one or more input files and/or an output file. Depending on the context, a horizontal dash - represents either stdin or stdout. See the command line usage description below.

Example usage

Find all messages in the deferred queue where the delay reason contains the string connection timed out.

postqueue -j | postqf -q deferred -d 'connection timed out'

Find all messages in the active or hold queues which have at least one recipient in the example.com or example.org domains, and write the matching JSON records into the file /tmp/output.

postqueue -j | postqf -q 'active|hold' -r '@example\.(com|org)' -o /tmp/output

Find all messages all queues for which the sender address is [email protected] or [email protected], and pipe the queue IDs to postsuper in order to place the matching messages on hold.

postqueue -j | postqf -s '^(alice|bob)@gmail\.com$' -i | postsuper -h -

Print the number of messages which arrived during the last 30 minutes.

postqueue -j | postqf -a 30m | wc -l

The final example assumes a directory /tmp/data with several files, each containing JSON output produced at some previous time. The command pipes all queue IDs which have ever been in the hold queue into the file idlist, relying on BASH wildcard expansion to generate a list of input files.

postqf -i -q hold /tmp/data/*.json > idlist

Filters

Queue entries can be easily filtered by

  • Arrival time
  • Delay reason
  • Queue name
  • Recipient address
  • Sender address

and combinations thereof, using regular expressions. Anchoring is optional, meaning that plain text is treated as a substring pattern.

The arrival time filters do not use regular expressions, but instead a human-readable representation of a time difference. The format is W unit, without spaces. W is a "whole number" (i.e. a number ≥ 0). The unit is a single letter among s, m, h, d (seconds, minutes, hours, days).

-b 3d and -a 90m are both examples of valid command line arguments. Note that arrival filters are interpreted relative to the time PostQF is run. The two examples signify "message arrived more than 3 days ago" (before timestamp) and "message arrived less than 90 minutes ago" (after timestamp), respectively.

Command line usage

postqf [-h] [-i] [-d [REGEX]] [-q [REGEX]] [-r [REGEX]] [-s [REGEX]]
       [-a [TS] | -b [TS]] [-o [PATH]] [PATH [PATH ...]]

positional arguments:
  PATH        Input file. Use a dash "-" for standard input.

optional arguments:
  -h, --help  show this help message and exit
  -i          ID output only
  -o [PATH]   Output file. Use a dash "-" for standard output.

Regular expression filters:
  -d [REGEX]  Delay reason filter
  -q [REGEX]  Queue name filter
  -r [REGEX]  Recipient address filter
  -s [REGEX]  Sender address filter

Arrival time filters (mutually exclusive):
  -a [TS]     Message arrived after TS
  -b [TS]     Message arrived before TS

Installation

The only installation requirement is Python 3.7 or newer. PostQF is distributed via PyPI.org and can usually be installed using pip. If this fails, or if both Python 2.x and 3.x are installed on your machine, use pip3 instead.

If possible, use the recommended installation with a Python virtual environment. Site-wide installation usually requires root privileges.

# Recommended: Installation using a Python virtual environment.
mkdir ~/postqf
cd ~/postqf
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip postqf
# Alternative: Site-wide installation, requires root access.
sudo pip install postqf

The pip installation process adds a launcher executable postqf, either site-wide or in the Python virtual environment. In the latter case, the launcher will be placed into the directory .venv/bin which is automatically added to your PATH variable when you activate the venv environment as shown above.

Contact

The project is hosted on GitHub in the rseichter/postqf repository. If you have suggestions or run into any problems, please check the discussions section first. There is also an issue tracker available, and the build configuration file contains a contact email address.

You might also like...
Functional Data Analysis, or FDA, is the field of Statistics that analyses data that depend on a continuous parameter. Fancy data functions that will make your life as a data scientist easier.
Fancy data functions that will make your life as a data scientist easier.

WhiteBox Utilities Toolkit: Tools to make your life easier Fancy data functions that will make your life as a data scientist easier. Installing To ins

A Big Data ETL project in PySpark on the historical NYC Taxi Rides data

Processing NYC Taxi Data using PySpark ETL pipeline Description This is an project to extract, transform, and load large amount of data from NYC Taxi

Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

Utilize data analytics skills to solve real-world business problems using Humana’s big data

Humana-Mays-2021-HealthCare-Analytics-Case-Competition- The goal of the project is to utilize data analytics skills to solve real-world business probl

Python data processing, analysis, visualization, and data operations

Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio

PrimaryBid - Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift
PrimaryBid - Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift

Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift This project is composed of two parts: Part1 and Part2

Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials

Data Scientist Learning Plan Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials

Catalogue data - A Python Scripts to prepare catalogue data

catalogue_data Scripts to prepare catalogue data. Setup Clone this repo. Install

Comments
  • Permit using

    Permit using "before" and "after" time filters at the same time

    The command line arguments -a and -b are mutually exclusive as of release 0.1. If using both at the same time was permitted, users could express an interval, allowing searches for "message arrived between timestamps X and Y".

    enhancement 
    opened by rseichter 1
  • Support absolute time for before/after filter arguments

    Support absolute time for before/after filter arguments

    Command line arguments -a and -b currently allow only passing a time difference like 45m or 3d. It would be helpful to also support strings representing absolute points in time. Here is an example for how it might look when using the ISO 8601 format:

    $ date --iso-8601=s
    2022-01-23T22:10:56+01:00
    
    $ postqueue -b '2022-01-23T22:10:56+01:00'
    

    It would also be useful to allow passing epoch time arguments, because postqueue -j returns message arrival times as seconds since the Epoch.

    enhancement 
    opened by rseichter 1
Releases(0.5)
  • 0.5(Feb 6, 2022)

    In addition to filtering JSON input and producing JSON output in the process, PostQF can now also generate a number of simple reports to answer some frequently asked questions about message queue content. The following data can be shown in reports:

    • Delay reason
    • Recipient address
    • Recipient domain
    • Sender address
    • Sender domain
    Source code(tar.gz)
    Source code(zip)
  • 0.4(Feb 2, 2022)

  • 0.3(Jan 28, 2022)

    • Output is now correctly rendered as JSON instead of a Python dict.
    • Simplified installation process. In addition to pip based setup, an installation BASH script is now provided.
    Source code(tar.gz)
    Source code(zip)
  • 0.2(Jan 24, 2022)

    • Release 0.2 introduces the ability to use both -a and -b time filters simultaneously, in order to specify time intervals.
    • Time filter strings can now use ISO 8601 strings and Unix time in addition to relative time differences expressed in the form 42m or 2d. This allows users to also specify absolute points in time as arrival thresholds.
    Source code(tar.gz)
    Source code(zip)
  • 0.1(Jan 23, 2022)

Owner
Ralph Seichter
Ralph Seichter
Hangar is version control for tensor data. Commit, branch, merge, revert, and collaborate in the data-defined software era.

Overview docs tests package Hangar is version control for tensor data. Commit, branch, merge, revert, and collaborate in the data-defined software era

Tensorwerk 193 Nov 29, 2022
Vaex library for Big Data Analytics of an Airline dataset

Vaex-Big-Data-Analytics-for-Airline-data A Python notebook (ipynb) created in Jupyter Notebook, which utilizes the Vaex library for Big Data Analytics

Nikolas Petrou 1 Feb 13, 2022
wikirepo is a Python package that provides a framework to easily source and leverage standardized Wikidata information

Python based Wikidata framework for easy dataframe extraction wikirepo is a Python package that provides a framework to easily source and leverage sta

Andrew Tavis McAllister 35 Jan 04, 2023
Detecting Underwater Objects (DUO)

Underwater object detection for robot picking has attracted a lot of interest. However, it is still an unsolved problem due to several challenges. We take steps towards making it more realistic by ad

27 Dec 12, 2022
MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

SeungHeonDoh 3 Jul 02, 2022
A utility for functional piping in Python that allows you to access any function in any scope as a partial.

WithPartial Introduction WithPartial is a simple utility for functional piping in Python. The package exposes a context manager (used with with) calle

Michael Milton 1 Oct 26, 2021
BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings.

BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings. it also can assist the binary code analysis rese

BinTuner 42 Dec 16, 2022
Pipeline to convert a haploid assembly into diploid

HapDup (haplotype duplicator) is a pipeline to convert a haploid long read assembly into a dual diploid assembly. The reconstructed haplotypes

Mikhail Kolmogorov 50 Jan 05, 2023
Containerized Demo of Apache Spark MLlib on a Data Lakehouse (2022)

Spark-DeltaLake-Demo Reliable, Scalable Machine Learning (2022) This project was completed in an attempt to become better acquainted with the latest b

8 Mar 21, 2022
Anomaly Detection with R

AnomalyDetection R package AnomalyDetection is an open-source R package to detect anomalies which is robust, from a statistical standpoint, in the pre

Twitter 3.5k Dec 27, 2022
Churn prediction with PySpark

It is expected to develop a machine learning model that can predict customers who will leave the company.

3 Aug 13, 2021
Python Implementation of Scalable In-Memory Updatable Bitmap Indexing

PyUpBit CS490 Large Scale Data Analytics — Implementation of Updatable Compressed Bitmap Indexing Paper Table of Contents About The Project Usage Cont

Hyeong Kyun (Daniel) Park 1 Jun 28, 2022
Analyzing Earth Observation (EO) data is complex and solutions often require custom tailored algorithms.

eo-grow Earth observation framework for scaled-up processing in Python. Analyzing Earth Observation (EO) data is complex and solutions often require c

Sentinel Hub 18 Dec 23, 2022
Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which he recommends to buy. We will use this data to build a portfolio

Backtesting the "Cramer Effect" & Recommendations from Cramer Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which

Gábor Vecsei 12 Aug 30, 2022
Titanic data analysis for python

Titanic-data-analysis This Repo is an analysis on Titanic_mod.csv This csv file contains some assumed data of the Titanic ship after sinking This full

Hardik Bhanot 1 Dec 26, 2021
yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data.

The yt Project yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data. yt supports structured, varia

The yt project 367 Dec 25, 2022
Generate lookml for views from dbt models

dbt2looker Use dbt2looker to generate Looker view files automatically from dbt models. Features Column descriptions synced to looker Dimension for eac

lightdash 126 Dec 28, 2022
Programmatically access the physical and chemical properties of elements in modern periodic table.

API to fetch elements of the periodic table in JSON format. Uses Pandas for dumping .csv data to .json and Flask for API Integration. Deployed on "pyt

the techno hack 3 Oct 23, 2022
A pipeline that creates consensus sequences from a Nanopore reads. I

A pipeline that creates consensus sequences from a Nanopore reads. It clusters reads that are similar to each other and creates a consensus that is then identified using BLAST.

Ada Madejska 2 May 15, 2022
💬 Python scripts to parse Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames.

Chatistics Python 3 scripts to convert chat logs from various messaging platforms into Pandas DataFrames. Can also generate histograms and word clouds

Florian 893 Jan 02, 2023