Screen scraping and web crawling framework

Last update: Jun 21, 2021

Overview

Pomp

Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the hard Twisted dependency.

Features:

Pure Python
Only one dependency, and only on Python 2.x: concurrent.futures (a backport of the Python 3 standard-library package)
Supports single-file applications; Pomp doesn't force a specific project layout or other restrictions.
Pomp is a meta framework like Paste: you may use it to create your own scraping framework.
Extensible networking: you may use any sync or async method.
No parsing libraries in the core; use your preferred approach.
Pomp instances may be distributed and are designed to work with an external queue.
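
The ideas in the feature list above (pluggable networking, no built-in parser, and a queue between crawl steps) can be illustrated with a small sketch. This is not Pomp's API: the `crawl` function and the `fetch` and `parse` callables are hypothetical names chosen for illustration, and the in-process `queue.Queue` only stands in for whatever external queue a distributed setup would use.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

# Hypothetical callable types, not part of Pomp:
#   fetch(url) -> body             any sync transport can be plugged in
#   parse(url, body) -> (items, links)   any parsing library can be used
Fetch = Callable[[str], str]
Parse = Callable[[str, str], Tuple[List[dict], Iterable[str]]]


def crawl(start_urls: Iterable[str], fetch: Fetch, parse: Parse,
          max_workers: int = 4, batch_size: int = 8) -> List[dict]:
    """Fetch URLs in batches on a thread pool and follow the links parse() returns."""
    pending: "queue.Queue[str]" = queue.Queue()  # stand-in for an external queue
    seen = set()
    items: List[dict] = []

    for url in start_urls:
        if url not in seen:
            seen.add(url)
            pending.put(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Only this loop touches the queue, so empty() is reliable here.
        while not pending.empty():
            batch = []
            while not pending.empty() and len(batch) < batch_size:
                batch.append(pending.get())

            # pool.map returns results in the same order as the input batch.
            for url, body in zip(batch, pool.map(fetch, batch)):
                found, links = parse(url, body)
                items.extend(found)
                for link in links:
                    if link not in seen:  # skip URLs that were already scheduled
                        seen.add(link)
                        pending.put(link)
    return items
```

Because the transport and the parser are plain callables, swapping in another HTTP client or another parsing library does not change the loop itself.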

Pomp makes no attempt to accommodate:

redirects
proxies
caching
database integration
cookies
authentication
etc.

If you want proxies, redirects, or similar, you may use the excellent requests library as the Pomp downloader.
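
As a rough sketch of that approach, the wrapper below delegates redirects and proxies to a requests session. The class and function names (`RequestsDownloader`, `FetchResult`, `build_downloader`) are assumptions made for illustration and do not reproduce Pomp's actual downloader interface; only the `requests` calls (`Session.get` with `proxies`, `allow_redirects`, and `timeout`) are real library API.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class FetchResult:
    url: str     # final URL after any redirects were followed
    status: int  # HTTP status code of the final response
    body: str    # decoded response text


class RequestsDownloader:
    """Hypothetical wrapper; adapt it to the downloader interface you actually use."""

    def __init__(self, session: Any,
                 proxies: Optional[Mapping[str, str]] = None,
                 timeout: float = 10.0) -> None:
        self.session = session              # e.g. a requests.Session instance
        self.proxies = dict(proxies or {})  # e.g. {"http": "http://host:port"}
        self.timeout = timeout

    def fetch(self, url: str) -> FetchResult:
        # requests follows redirects and routes through proxies itself,
        # so the crawling framework does not need to handle either.
        response = self.session.get(
            url,
            proxies=self.proxies,
            allow_redirects=True,
            timeout=self.timeout,
        )
        return FetchResult(url=response.url,
                           status=response.status_code,
                           body=response.text)


def build_downloader(proxies: Optional[Mapping[str, str]] = None) -> RequestsDownloader:
    # Imported lazily so the sketch can be loaded without requests installed.
    import requests
    return RequestsDownloader(requests.Session(), proxies=proxies)
```

A session object also keeps cookies between calls, which covers another item on the list above without any support from the framework.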

Pomp examples

Pomp docs

Pomp is written and maintained by Evgeniy Tatarkin and is licensed under the BSD license.

Anonymously scrapes onlinesim.ru for new usable phone numbers.

This script is intended to crawl license information of repositories through the GitHub API.

This program scrapes information and images for movies and TV shows.

Visual scraping for Scrapy

Open Crawl Vietnamese Text

VG-Scraper is a Python program that uses the BeautifulSoup module to let anyone scrape content from a website. You enter a number as input, and each number corresponds to one news article.

Scraping followers of an Instagram account

JD.com Moutai purchase-sniping script, latest version as of April 2021

CreamySoup - a helper script for automated SourceMod plugin updates management.

Automated check-ins for iQIYI membership, Tencent Video, Bilibili, Baidu, and other services

UdemyBot - A Simple Udemy Free Courses Scraper

LSpider: a front-end crawler customized for passive scanners

A simple Python script to fetch the latest COVID info

A simple Discord scraper for discord bots

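Crawls same-day announcements from major SRCs (security response centers) | a small tool that sends notifications via WeChat | bug-bounty tool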

Source code for web scraping with the Python library Selenium.

Python Web Scraper Project

A pure-python HTML screen-scraping library

jd_maotai rpa: a Selenium-driven RPA bot for purchase sniping on JD.com

Scraping connections' info on LinkedIn