Twitter Scraper

Last update: Dec 30, 2022

Overview

tweety

Twitter's API is annoying to work with and has lots of limitations. Luckily, their frontend (JavaScript) has its own API, which I reverse-engineered. No API rate limits. No restrictions. Extremely fast.

Prerequisites

Before you begin, ensure you have met the following requirements:

Internet Connection
Python 3.6+
BeautifulSoup (Python Module)
Requests (Python Module)
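
The requirements above can be checked from Python before using the library. This is a minimal sketch, not part of tweety: the helper names are made up for illustration, and it assumes the usual import names `bs4` (BeautifulSoup) and `requests`.

```python
import importlib.util
import sys

# Assumed import names for the listed modules (BeautifulSoup is imported as "bs4").
REQUIRED_MODULES = ("bs4", "requests")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.6+:", python_is_supported())
    print("Missing modules:", missing_modules())
```

An empty "Missing modules" list means both modules are importable in the current environment.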

All Functions

get_tweets()
get_user_info()
get_trends() (can be used without username)
search() (can be used without username)
tweet_detail() (can be used without username)

Using tweety

Getting Tweets:

Description:

Get 20 Tweets of a Twitter User

Required Parameter:

Username or User profile URL while initiating the Twitter Object
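
Since either a username or a profile URL is accepted, it can help to see how the two forms relate. The sketch below is an independent illustration with a hypothetical helper name (`username_from`); how tweety itself handles the argument is not shown in this document.

```python
from urllib.parse import urlparse


def username_from(value):
    """Return the bare username from a username, @handle, or profile URL."""
    value = value.strip()
    if "/" in value:
        # Treat the value as a URL; add a scheme so urlparse finds the path.
        if "://" not in value:
            value = "https://" + value
        parts = [part for part in urlparse(value).path.split("/") if part]
        if not parts:
            raise ValueError("no username found in URL")
        value = parts[0]
    return value.lstrip("@")


print(username_from("https://twitter.com/Microsoft"))
print(username_from("@Microsoft"))
```

Both calls print the same bare username, which is the form a profile URL and a handle have in common.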

Optional Parameter:

pages : int (default is 1; use 2 or more to fetch additional pages) -> Get the given number of pages of tweets
include_extras : boolean (default is False) -> Also get extras found on the page, such as Topics

Output:

Type -> dictionary

Structure

    {
      "p-1" : {
        "result": {
            "tweets": []
        }
      },
      "p-2":{
        "result": {
            "tweets": []
        }
      }
    }
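
Because the tweets are grouped by page, a small helper can merge them into one list. This is a sketch that relies only on the structure shown above; `flatten_tweets` is a hypothetical name, and the sample data is made up (real tweet entries would be whatever the library returns, not plain strings).

```python
def flatten_tweets(pages):
    """Collect the tweets of every page into one list, in page order."""
    # Keys look like "p-1", "p-2", ...; sort by the number after the dash.
    ordered_keys = sorted(pages, key=lambda key: int(key.split("-")[1]))
    tweets = []
    for key in ordered_keys:
        tweets.extend(pages[key]["result"]["tweets"])
    return tweets


# Hypothetical sample shaped like the documented output.
sample = {
    "p-2": {"result": {"tweets": ["third"]}},
    "p-1": {"result": {"tweets": ["first", "second"]}},
}
print(flatten_tweets(sample))
```

Sorting on the numeric part of the key keeps "p-10" after "p-9", which a plain string sort would not.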

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> all_tweet = Twitter("Username or URL").get_tweets(pages=2)
>>> for i in all_tweet:
...   print(all_tweet[i])

Getting Trends:

Description:

Get 20 local trends

Output:

Type -> dictionary

Structure

  {
    "trends":[
      {
        "name":"",
        "url":""
      },
      {
        "name":"",
        "url":""
      }
    ]
  }
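
The trends structure lends itself to a simple name-to-URL lookup. The following is a sketch built only on the `trends`, `name`, and `url` keys shown above; `trend_urls` is a hypothetical helper, and the sample values are placeholders, not real trends.

```python
def trend_urls(trends_response):
    """Map each trend name to its URL."""
    return {trend["name"]: trend["url"] for trend in trends_response["trends"]}


# Hypothetical sample shaped like the documented output.
sample_trends = {
    "trends": [
        {"name": "#ExampleOne", "url": "https://example.com/one"},
        {"name": "#ExampleTwo", "url": "https://example.com/two"},
    ]
}

for name, url in trend_urls(sample_trends).items():
    print(name, "->", url)
```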

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> trends = Twitter().get_trends()
>>> for i in trends['trends']:
...   print(i['name'])

Searching a keyword:

Description:

Get 20 Tweets for a specific Keyword or Hashtag

Required Parameter:

keyword : str -> Keyword or hashtag to search for

Optional Parameter:

latest : boolean (Default is False) -> Get the latest tweets

Output:

Type -> list

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> results = Twitter().search("Pakistan")

Getting User Info:

Description:

Get the information about the user

Required Parameter:

Username or User profile URL while initiating the Twitter Object

Optional Parameter:

banner_extensions : boolean (Default is False) -> get more information about user banner image
image_extensions : boolean (Default is False) -> get more information about user profile image

Output:

Type -> dict

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> user_info = Twitter("Username or URL").get_user_info()

Getting a Tweet Detail:

Description:

Get the details of a tweet, including its replies

Required Parameter:

Identifier of the Tweet -> Either Tweet URL OR Tweet ID
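
A tweet URL and a tweet ID carry the same information, since the ID is the number after `/status/` in the URL. The sketch below shows that relationship; `tweet_id_from` is a hypothetical helper, and how tweety parses the identifier internally is not shown in this document.

```python
import re

# The numeric ID follows "/status/" in a tweet URL.
_STATUS_ID = re.compile(r"/status/(\d+)")


def tweet_id_from(identifier):
    """Return the tweet ID as a string, given a tweet URL or a bare ID."""
    identifier = str(identifier).strip()
    if identifier.isdigit():
        return identifier
    match = _STATUS_ID.search(identifier)
    if match is None:
        raise ValueError("not a tweet URL or ID: %r" % identifier)
    return match.group(1)


print(tweet_id_from("https://twitter.com/Microsoft/status/1442542812197801985"))
```

The ID is kept as a string so that it passes through unchanged whether it arrived as text or as an integer.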

Output:

Type -> dict
Structure

  {
    "conversation_threads":[],
    "tweet": {}
  }

Example:

python
Python 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tweet import Twitter
>>> detail = Twitter().tweet_detail("https://twitter.com/Microsoft/status/1442542812197801985")

Updates:

Update 0.1:

Get multiple pages of tweets using the pages parameter of the get_tweets() function
Output of get_tweets() has been reworked

Update 0.2:

Again reworked and simplified tweets in get_tweets function 😜
Added tweet_detail function for getting details about a tweet including replies to it

Update 0.2.1:

Fixed Hashtag Search

Owner

Tayyab Kharl
