0.-Webscrapping-using-python

Scraping Top Repositories for Topics on GitHub
Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It's a useful technique for creating datasets for research and learning. Follow these steps to build a web scraping project from scratch using Python and its ecosystem of libraries:
Pick a website and describe your objective
Browse through different sites and pick one to scrape. Check the "Project Ideas" section for inspiration.
Identify the information you'd like to scrape from the site. Decide the format of the output CSV file.
Summarize your project idea and outline your strategy in a Jupyter notebook.
Use the requests library to download web pages.
Inspect the website's HTML source and identify the right URLs to download.
Download and save web pages locally using the requests library.
Create a function to automate downloading for different topics/search queries.
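As a rough illustration of the download steps above, the sketch below builds a page URL from a topic name and saves the HTML to disk. The names `BASE_URL`, `topic_slug`, `topic_url`, and `download_topic_page`, the `pages` output folder, and the assumption that topic pages live under `https://github.com/topics/<topic>` are illustrative choices, not taken from the original project.

```python
from pathlib import Path

# Assumption: each topic has a page under this base URL.
BASE_URL = "https://github.com/topics"


def topic_slug(topic: str) -> str:
    """Normalize a topic name, e.g. 'Machine Learning' -> 'machine-learning'."""
    return topic.strip().lower().replace(" ", "-")


def topic_url(topic: str) -> str:
    """Build the page URL for a topic name."""
    return f"{BASE_URL}/{topic_slug(topic)}"


def download_topic_page(topic: str, out_dir: str = "pages") -> Path:
    """Download one topic page and save the raw HTML locally."""
    # Imported here so the URL helpers work without the third-party package.
    import requests  # pip install requests

    response = requests.get(topic_url(topic), timeout=30)
    # Fail loudly on 4xx/5xx instead of silently saving an error page.
    response.raise_for_status()

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{topic_slug(topic)}.html"
    path.write_text(response.text, encoding="utf-8")
    return path
```

Calling `download_topic_page("machine learning")` would save the page as `pages/machine-learning.html`. Keeping the saved copy lets you experiment with parsing without requesting the page again.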
Use Beautiful Soup to parse and extract information
Parse and explore the structure of downloaded web pages using Beautiful Soup.
Use the right properties and methods to extract the required information.
Create functions to extract from the page into lists and dictionaries.
Use a REST API to acquire additional information if required.
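The parsing steps above might look something like the following sketch, which turns page markup into a list of dictionaries. The HTML in `SAMPLE_HTML` is a made-up stand-in, and the `repo` and `stars` class names are hypothetical. A real page will use different tags and classes, which you would find by inspecting its source. The helper names `parse_star_count` and `extract_repos` are likewise illustrative.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Hypothetical markup standing in for a downloaded page; real pages differ.
SAMPLE_HTML = """
<article class="repo">
  <h3><a href="/alice">alice</a> / <a href="/alice/scraper">scraper</a></h3>
  <span class="stars">1.2k</span>
</article>
<article class="repo">
  <h3><a href="/bob">bob</a> / <a href="/bob/crawler">crawler</a></h3>
  <span class="stars">87</span>
</article>
"""


def parse_star_count(text: str) -> int:
    """Convert a display value such as '1.2k' or '87' to an integer."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def extract_repos(html: str, base_url: str = "https://github.com") -> list:
    """Return one dictionary per repository card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    repos = []
    for card in soup.find_all("article", class_="repo"):
        # In this sample, the first link is the owner and the second the repo.
        owner_link, repo_link = card.find("h3").find_all("a")[:2]
        stars_tag = card.find("span", class_="stars")
        repos.append({
            "owner": owner_link.get_text(strip=True),
            "name": repo_link.get_text(strip=True),
            "url": base_url + repo_link["href"],
            "stars": parse_star_count(stars_tag.get_text()),
        })
    return repos
```

Returning plain dictionaries keeps the parsing step independent of how the data is stored later, so the same rows can be written to CSV or loaded into a DataFrame.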
Create CSV file(s) with the extracted information.
Create functions for the end-to-end process of downloading, parsing, and saving CSVs.
Execute the function with different inputs to create a dataset of CSV files.
Verify the information in the CSV files by reading them back using Pandas.
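One possible way to handle the CSV steps above is sketched below: rows are written with the standard library's `csv` module and read back with Pandas as a check. The column list `FIELDS` and the function names `write_repos_csv` and `read_back` are assumptions made for illustration, since the original does not specify an output layout.

```python
import csv
from pathlib import Path

# Assumed column layout; adjust to match the fields you actually extract.
FIELDS = ["owner", "name", "url", "stars"]


def write_repos_csv(rows, path):
    """Write a list of dictionaries to a CSV file with a header row."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" stops the csv module from emitting blank lines on Windows.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_back(path):
    """Load the CSV with Pandas to check row counts and column names."""
    # Imported here so the writer works without the third-party package.
    import pandas as pd  # pip install pandas

    return pd.read_csv(path)
```

After writing a file, `read_back(path).head()` in a notebook gives a quick visual check that the columns and values came through as expected.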
Document and share your work
Add proper headings and documentation in your Jupyter notebook.
Write a blog post about your project and share it online.

Owner

Dev Aravind D Satprem

A web scraper for CSrankings.com that scrapes the university and faculty lists for a particular country

A web scraper for nomadlist.com, made to avoid website restrictions.

Web scraping

A python script to extract answers to any question on Quora (Quora+ included)

Web Scraping COVID 19 Meta Portal with Python

Python script to check whether an application's responses differ when the request comes from a search engine's crawler.

DaProfiler lets you find emails, social media accounts, addresses, workplaces, and more on your target using web scraping and Google dorking techniques

A Python tool to scrape NFTs from OpenSea

A package that provides the latest cyber/hacker news from a website using web scraping.

WebScraper - A script that prints out a list of all EXTERNAL references in the HTML response to an HTTP/S request

Unja is a fast & light tool for fetching known URLs from Wayback Machine

Push notifications for JD Cloud Wireless Bao (京东云无线宝) points, with support for viewing points usage across multiple devices

A dead simple crawler to get books information from Douban.

A scheduled HITsz epidemic reporting script based on GitHub Actions, ready to use out of the box

This is a script that scrapes the longitude and latitude on food.grab.com

Moutai flash-sale purchasing on JD.com

A web crawler for recording posts in "sina weibo"

This is a web scraper, built with the Python framework Scrapy, that extracts data from the Deals of the Day section of the Mercado Livre website.

A script to scrape direct download links (DDLs) from a Google Drive index.

An automated Udemy coupon scraper that scrapes coupons and automatically posts the results to a Blogspot post