This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

Overview

Robots.txt tester

With this script, you can enumerate all URLs present in robots.txt files, and test whether you can access them or not.
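
The enumeration step can be illustrated with a minimal sketch. This is not the tool's actual implementation: the function name `extract_robots_urls` and the sample content are hypothetical, and the sketch assumes only `Allow` and `Disallow` entries are of interest. Wildcard patterns such as `/*.php$` are joined literally rather than expanded.

```python
from urllib.parse import urljoin


def extract_robots_urls(base_url, robots_txt):
    """Return absolute URLs for every non-empty Allow/Disallow entry.

    Hypothetical helper for illustration; not part of robotstester.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep:
            continue  # not a "field: value" line
        field, value = field.strip().lower(), value.strip()
        if field in ("allow", "disallow") and value:
            url = urljoin(base_url, value)
            if url not in urls:  # keep first occurrence, preserve order
                urls.append(url)
    return urls


sample = """User-agent: *
Disallow: /admin/
Disallow: /private/data.html  # internal
Allow: /public/
Disallow:
"""

for url in extract_robots_urls("http://www.example.com/", sample):
    print(url)
```

Each resulting URL can then be requested to see whether it is actually reachable.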

Setup

Clone the repository and install the dependencies:

git clone https://github.com/p0dalirius/robotstester
cd robotstester
python3 setup.py install

Usage

robotstester -u http://www.example.com/

The complete list of options is shown in the help output:

[~] Robots.txt tester, v1.2.0

usage: robotstester.py [-h] (-u URL | -f URLSFILE) [-v] [-q] [-k] [-L] [-t THREADS] [-p] [-j JSONFILE] [-x PROXY] [-b COOKIES]

This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
  -f URLSFILE, --urlsfile URLSFILE
                        List of robots.txt urls to test
  -v, --verbose         verbosity level (-v for verbose, -vv for debug)
  -q, --quiet           Show no information at all
  -k, --insecure        Allow insecure server connections when using SSL (default: False)
  -L, --location        Follow redirects (default: False)
  -t THREADS, --threads THREADS
                        Number of threads (default: 5)
  -p, --parsable        Parsable output
  -j JSONFILE, --jsonfile JSONFILE
                        Save results to specified JSON file.
  -x PROXY, --proxy PROXY
                        Specify a proxy to use for requests (e.g., http://localhost:8080)
  -b COOKIES, --cookies COOKIES
                        Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
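
To give a rough idea of what testing URLs with several threads (the `-t` option) might look like, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not the tool's code: `fetch_status` and `check_urls` are hypothetical names, and unlike the tool's default, `urllib.request.urlopen` follows redirects. The demo at the end uses a fake fetcher so that it runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 or 404 still tells us something
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...


def check_urls(urls, threads=5, fetch=fetch_status):
    """Check every URL with a pool of worker threads; return {url: status}."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        statuses = list(pool.map(fetch, urls))  # map preserves input order
    return dict(zip(urls, statuses))


# Demo with a fake fetcher (hypothetical data, no network traffic).
fake_responses = {
    "http://www.example.com/admin/": 403,
    "http://www.example.com/public/": 200,
}
results = check_urls(list(fake_responses), threads=2, fetch=fake_responses.get)
for url, status in results.items():
    print(status, url)
```

Passing the fetcher as a parameter keeps the threading logic testable; replacing `fake_responses.get` with the default `fetch_status` would issue real requests.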

Contributing

Pull requests are welcome. Feel free to open an issue if you want to add other features.

Comments
  • [Feature] Add waybackmachine capability

    In the past few days I've been experimenting with using the Wayback Machine to enumerate robots.txt endpoints.

    Sometimes robots.txt gets removed, and sometimes the removed content can be juicy. Thus the idea of searching every Wayback Machine snapshot to look for old robots.txt entries.

    I've implemented a quick and basic script as a PoC, but I feel like this repo has the power to bring it to the next level, since a lot of good features are already done.

    https://gist.github.com/felipecaon/035ad1718c3cae681d2afb03c699795f

    The gist works by getting all the robots.txt entries from the Wayback Machine, parsing them, and sending them to stdout. The script does not remove duplicates; it just does a basic word removal.

    If I have the time, I may be able to open a PR. But if someone wants to take it further, I would love to see that. The core Wayback Machine endpoints to be used are in my gist file.

    opened by felipecaon 0
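
The idea in the feature request above can be sketched as follows. This does not reproduce the linked gist, whose endpoints are not shown here. It assumes the Internet Archive's public CDX search endpoint and its `id_` raw-snapshot URL form, and the helper names (`build_cdx_query`, `snapshot_url`, `merge_paths`) are hypothetical. Only the URL building and the de-duplicating merge are shown, so the sketch runs without network access.

```python
from urllib.parse import urlencode

# Assumption: the public Wayback Machine CDX search endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query listing archived robots.txt captures for a domain."""
    params = {
        "url": f"{domain}/robots.txt",
        "output": "json",
        "fl": "timestamp,original",   # fields to return
        "filter": "statuscode:200",   # only successful captures
        "collapse": "digest",         # skip consecutive identical captures
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_url(timestamp, original):
    """URL of the raw archived file for one capture (assumed 'id_' form)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def merge_paths(robots_texts):
    """Collect unique Allow/Disallow paths across several robots.txt versions."""
    seen = []
    for text in robots_texts:
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()
            field, sep, value = line.partition(":")
            if sep and field.strip().lower() in ("allow", "disallow"):
                path = value.strip()
                if path and path not in seen:
                    seen.append(path)
    return seen


old_version = "User-agent: *\nDisallow: /old-admin/\nDisallow: /backup/"
new_version = "User-agent: *\nDisallow: /backup/\nAllow: /public/"
print(build_cdx_query("example.com"))
print(merge_paths([old_version, new_version]))
```

Paths that appear only in older snapshots, such as `/old-admin/` in this made-up example, are the entries a live robots.txt would no longer reveal.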
Releases(1.2)
  • 1.2(Jul 7, 2021)

    Added --parsable option

  • 1.0(Jul 5, 2021)

    [~] Robots.txt tester, v1.0
    
    usage: robotstester.py [-h] [-u URL | -f URLSFILE] [-v] [-q] [-k] [-L] [-t THREADS] [-j JSONFILE] [-x PROXY] [-b COOKIES]
    
    This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.
    
    optional arguments:
      -h, --help            show this help message and exit
      -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
      -f URLSFILE, --urlsfile URLSFILE
                            List of robots.txt urls to test
      -v, --verbose         verbosity level (-v for verbose, -vv for debug)
      -q, --quiet           Show no information at all
      -k, --insecure        Allow insecure server connections when using SSL (default: False)
      -L, --location        Follow redirects (default: False)
      -t THREADS, --threads THREADS
                            Number of threads (default: 5)
      -j JSONFILE, --jsonfile JSONFILE
                            Save results to specified JSON file.
      -x PROXY, --proxy PROXY
                            Specify a proxy to use for requests (e.g., http://localhost:8080)
      -b COOKIES, --cookies COOKIES
                            Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
    
Owner
Podalirius
Hacker of everything