This scraper extracts the email IDs of faculty members from a given link/page and stores them in a CSV file.

Last update: Feb 10, 2022

Related tags

Web Crawling scrapper-for-faculty

Overview

EXTRACTING EMAIL IDS FROM AN HTML PAGE

CONTENTS OF THIS FILE

Introduction
Requirements
Installation
Maintainers

INTRODUCTION

This project collects the email IDs found on a web page and stores them in a CSV file. While going through the faculty pages of various IITs, I found that the email IDs on these pages are written in different formats and cannot be detected by the usual email regex. I therefore extended my regex to detect and store email IDs written in formats such as:

name AT domain DOT com
name(AT)domain(DOT)com
name[AT]domain[DOT]com
name{AT}domain{DOT}com
name[AT*]domain[DOT*]com
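The formats listed above can be matched with a single pattern. The following is a minimal sketch, not the project's actual regex: the names `EMAIL_RE` and `extract_emails` are hypothetical, the pattern also accepts a plain `name@domain.com`, and it assumes that `AT` and `DOT` are written in upper case as in the list.

```python
import re

# Assumed separator tokens: a literal "@" or ".", or the words AT / DOT,
# either surrounded by whitespace or wrapped in (), [] or {} with an
# optional trailing "*". Mismatched brackets such as "(AT]" are also accepted.
_AT = r"(?:\s*[\(\[\{]\s*AT\*?\s*[\)\]\}]\s*|\s+AT\s+|@)"
_DOT = r"(?:\s*[\(\[\{]\s*DOT\*?\s*[\)\]\}]\s*|\s+DOT\s+|\.)"
_LABEL = r"[A-Za-z0-9-]+"

# Group 1 is the local part, group 2 the (possibly obfuscated) domain.
EMAIL_RE = re.compile(
    rf"([A-Za-z0-9._%+-]+){_AT}({_LABEL}(?:{_DOT}{_LABEL})+)"
)


def extract_emails(text):
    """Return the normalized email IDs found in text, in order, without duplicates."""
    found = []
    for match in EMAIL_RE.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Rewrite every DOT token inside the domain as a literal ".".
        email = f"{local}@{re.sub(_DOT, '.', domain)}"
        if email not in found:
            found.append(email)
    return found


if __name__ == "__main__":
    sample = "Contact: name[AT*]domain[DOT*]com or name AT domain DOT com"
    print(extract_emails(sample))
```

Keeping the bare-word forms case-sensitive is a deliberate choice in this sketch: matching a lower-case " at " would also pick up ordinary prose.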

REQUIREMENTS

This module requires Python 3 to be installed on your system. The project uses the Beautiful Soup library and the urllib module from Python's standard library.
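As a rough illustration of how these libraries could fit together, the sketch below downloads a page with urllib, reduces it to plain text with Beautiful Soup, and writes a list of email IDs to CSV with the standard csv module. The function names `fetch_page_text` and `write_emails_csv`, the single `email` column, and the sorted, de-duplicated output are assumptions made for this example and are not taken from the project.

```python
import csv
from urllib.request import urlopen


def fetch_page_text(url):
    """Download a page and return its visible text (hypothetical helper)."""
    # Imported here so the CSV helper below still works on a system
    # where Beautiful Soup (package "beautifulsoup4") is not installed.
    from bs4 import BeautifulSoup

    with urlopen(url) as response:
        html = response.read()
    # Join text nodes with spaces so neighbouring words do not run together.
    return BeautifulSoup(html, "html.parser").get_text(" ")


def write_emails_csv(emails, out):
    """Write unique email IDs, sorted, to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["email"])  # assumed header row
    for email in sorted(set(emails)):
        writer.writerow([email])


if __name__ == "__main__":
    # Example usage with a hard-coded list; in practice the list would
    # come from running an email regex over fetch_page_text(some_url).
    with open("emails.csv", "w", newline="") as handle:
        write_emails_csv(["name@domain.com"], handle)
```

Opening the output file with `newline=""` follows the csv module's recommendation and avoids blank lines between rows on Windows.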

INSTALLATION

Install the Extracting Email Ids module by forking or cloning the project onto your system.

MAINTAINERS

Devansh Singh - [email protected]

Owner

Devansh Singh
