This program scrapes information and images for movies and TV shows.

Last update: Dec 05, 2021

Media-WebScraper

Summary

For more information on the program, read the WebScrape_help text file (this can also be accessed while running the program).

For a given list of media titles, the program scrapes and saves general information, a poster image and (if specified) episode information for each title.

General Information (default):

Saved as a .txt file

This will scrape general information:

Title
Release date
Runtime
Genre
Director
Cast
Plot description

Additional information saved:

Source database used for scrape
ID for media in source database
Poster image link
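
The layout of the saved .txt file is not described here, so the following is only a minimal sketch of how such a file could be written. The field labels, the "Field: value" line format, the file name info.txt and the functions format_info and save_info are all assumptions made for illustration, not the program's actual output format.

```python
import os

# Hypothetical field labels mirroring the lists above; the real file may differ.
FIELDS = ["Title", "Release date", "Runtime", "Genre", "Director", "Cast",
          "Plot description", "Source", "Source ID", "Poster link"]


def format_info(info):
    """Render one 'Field: value' line per known field, skipping missing ones."""
    lines = []
    for field in FIELDS:
        value = info.get(field)
        if value is None:
            continue
        # Multi-valued fields such as Genre or Cast are joined with commas.
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        lines.append("{}: {}".format(field, value))
    return "\n".join(lines) + "\n"


def save_info(info, folder, filename="info.txt"):
    """Write the formatted information into folder and return the file path."""
    if not os.path.isdir(folder):
        os.makedirs(folder)
    path = os.path.join(folder, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_info(info))
    return path
```

Keeping the formatting separate from the file writing makes it easy to check the text that would be saved without touching the disk.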

Images (default):

Saved as a .jpg file

This will scrape the poster.

Episode Information (if specified):

Saved as a .csv file

This will scrape information for each episode for a TV show:

Season number
Episode number
Episode title
Episode air date
Episode description
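
As a rough illustration of the episode file, the sketch below writes one row per episode using Python's standard csv module. The column names and the save_episodes function are assumptions chosen to match the list above; the program's real header row and column order may differ.

```python
import csv

# Hypothetical column names based on the episode fields listed above.
COLUMNS = ["season", "episode", "title", "air_date", "description"]


def save_episodes(episodes, path):
    """Write a list of episode dicts to a CSV file and return the row count."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(episodes)
    return len(episodes)
```

Descriptions containing commas or line breaks are quoted automatically by the csv writer, which is the main reason to use it instead of joining values by hand.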

Features:

Multithreaded scraping of the media list, which greatly reduces the time taken for large lists.
Can generate a media list from folders and files in a specified directory or from user input.
Can specify save location for scraped data.
Can specify search tags for media list for a more accurate scrape.
Can choose to scrape all episode information for a TV show.
Can detect data that has already been scraped, which makes adding new media to an already scraped list very efficient.
Can recover one or more missing scraped files without rescraping all data.
Can retry incomplete scrapes before exiting the program (successfully scraped files will not be altered or rescraped).
Currently only supports scraping data from IMDb.
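
The multithreading, skip-if-scraped and retry features above could be combined roughly as follows. This is a sketch under stated assumptions, not the program's implementation: it uses a thread pool from the standard concurrent.futures module (the README does not say which threading mechanism is used), and the file names in EXPECTED_FILES, the one-folder-per-title layout, and the functions is_scraped and scrape_all are all hypothetical. The scrape_one argument stands in for whatever function performs the actual scrape.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical file names; the real program may name its output differently.
EXPECTED_FILES = ("info.txt", "poster.jpg")


def is_scraped(title, save_dir):
    """Return True if every expected file already exists for this title."""
    folder = os.path.join(save_dir, title)
    return all(os.path.isfile(os.path.join(folder, name))
               for name in EXPECTED_FILES)


def scrape_all(titles, save_dir, scrape_one, max_workers=8):
    """Scrape titles in parallel, skipping those that are already complete.

    Returns three lists: (done, skipped, failed).
    """
    skipped, pending = [], []
    for title in titles:
        (skipped if is_scraped(title, save_dir) else pending).append(title)

    done, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its title so results can be attributed.
        futures = {pool.submit(scrape_one, title, save_dir): title
                   for title in pending}
        for future in as_completed(futures):
            title = futures[future]
            try:
                future.result()
                done.append(title)
            except Exception:
                # Any error marks the title as incomplete rather than aborting.
                failed.append(title)
    return done, skipped, failed
```

Threads suit this workload because scraping is dominated by waiting on network responses. The failed list is what a retry pass would be given, and because is_scraped checks each file, a title with a missing file is treated as pending again.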

Usage:

The program is currently terminal-based.

Running the program using python:

Requirements: Python 3.2+ (additional libraries: requests, beautifulsoup4)

Running the program from bundled executable file (created using pyinstaller):

Requirements: Windows 10
While running, the executable creates a 'temp' folder in the same location as the program, containing extracted libraries and support files.
- The temporary files are deleted automatically on a normal exit, but they remain if the program is closed abruptly.
- The 'temp' folder can be deleted manually after closing the program.
- (As of pyinstaller v4.7, a one-file bundled executable leaves behind its temp '_MEIxxxxxx' folders if the program is force closed.)

Updates:

For information on version history, read the HISTORY markdown file.

Releases (v1.3.0)

v1.3.0 (Dec 5, 2021)
WebScrape v1.3.0

See version history document for all changes.

Running the program using python:

Download the source code.

Requirements:

Python 3.2+ (additional libraries: requests, beautifulsoup4)

Running the program from bundled executable:

Download the WebScrape-1.3.0 zip file containing the bundled executable (created using pyinstaller).

Requirements:

Windows 10

WebScrape-1.3.0.zip(8.71 MB)
