HashDB is a community-sourced library of hashing algorithms used in malware.

Related tags

Algorithmshashdb
Overview

overview_hashdb

AWS Deploy Chat Support

HashDB

HashDB is a community-sourced library of hashing algorithms used in malware.

How To Use HashDB

HashDB can be used as a stand alone hashing library, but it also feeds the HashDB Lookup Service run by OALabs. This service allows analysts to reverse hashes and retrieve hashed API names and string values.

Stand Alone Module

HashDB can be cloned and used in your reverse engineering scripts like any standard Python module. Some example code follows.

>>> import hashdb
>>> hashdb.list_algorithms()
['crc32']
>>> hashdb.algorithms.crc32.hash(b'test')
3632233996

HashDB Lookup Service

OALabs run a free HashDB Lookup Service that can be used to query a hash table for any hash listed in the HashDb library. Included in the hash tables are the complete set of Windows APIs as well as a many common strings used in malware. You can even add your own strings!

HashDB IDA Plugin

The HashDB lookup service has an IDA Pro plugin that can be used to automate hash lookups directly from IDA! The client can be downloaded from GitHub here.

How To Add New Hashes

HashDB relies on community support to keep our hash library current! Our goal is to have contributors spend no more than five minutes adding a new hash, from first commit, to PR. To achieve this goal we offer the following streamlined process.

  1. Make sure the hash algorithm doesn’t already exist… we know that seems silly but just double check.

  2. Create a branch with a descriptive name.

  3. Add a new Python file to the /algorithms directory with the name of your hash algorithm. Try to use the official name of the algorithm, or if it is unique, use the name of the malware that it is unique to.

  4. Use the following template to setup your new hash algorithm. All fields are mandatory and case sensitive.

    #!/usr/bin/env python
    
    DESCRIPTION = "your hash description here"
    # Type can be either 'unsigned_int' (32bit) or 'unsigned_long' (64bit)
    TYPE = 'unsigned_int'
    # Test must match the exact has of the string 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
    TEST_1 = hash_of_string_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
    
    
    def hash(data):
        # your hash code here
  5. Double check your Python style, we use Flake8 on Python 3.9. You can try the following lint commands locally from the root of the git repository.

    pip install flake8
    flake8 ./algorithms --count --exit-zero --max-complexity=15 --max-line-length=127 --statistics --show-source
    
  6. Test your code locally using our test suite. Run the folling commands locally from the root of the git repository. Note that you must run pytest as a module rather than directly or it won't pick up our test directory.

    pip install pytest
    python -m pytest
    
  7. Issue a pull request — your new algorithm will be automatically queued for testing and if successful it will be merged.

That’s it! Not only will your new hash be available in the HashDB library but a new hash table will be generated for the HashDB Lookup Service and you can start reversing hashes immediately!

Rules For New Hashes

PRs with changes outside of the /algorithms directory are not part of our automated CI and will be subjected to extra scrutiny.

All hashes must have a valid description in the DESCRIPTION field.

All hashes must have a type of either unsigned_int or unsigned_long in the TYPE field. HashDB currently only accepts unsigned 32bit or 64bit hashes.

All hashes must have the hash of the string ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 in the TEST_1 field.

All hashes must include a function hash(data) that accepts a byte string and returns a hash of the string.

Adding Custom API Hashes

Some hash algorithms hash the module name and API separately and combine the hashes to create a single module+API hash. An example of this is the standard Metasploit ROR13 hash. These algorithms will not work with the standard wordlist and require a custom wordlist that includes both the module name and API. To handle these we allow custom algorithms that will only return a valid hash for some words.

Adding a custom API hash requires the following additional components.

  1. The TEST_1 field must be set to 4294967294 (-1).

  2. The hash algorithm must return the value 4294967294 for all invalid hashes.

  3. An additional TEST_API_DATA_1 field must be added with an example word that is valid for the algorithm.

  4. An additional TEST_API_1 field must be added with the hash of the TEST_API_DATA_1 field.

Standing On The Shoulders of Giants

A big shout out to the FLARE team for their efforts with shellcode_hashes. Many years ago this project set the bar for quick and easy malware hash reversing and it’s still an extremely useful tool. So why duplicate it?

Frankly, it’s all about the wordlist and accessibility. We have seen a dramatic shift towards using hashes for all sorts of strings in malware now, and the old method of hashing all the Windows’ DLL exports just isn’t good enough. We wanted a solution that could continuously process millions of registry keys and values, filenames, and process names. And we wanted that data available via a REST API so that we could use it our automation workflows, not just our static analysis tools. That being said, we wouldn’t exist without shellcode_hashes, so credit where credit is due 🙌


Owner
OALabs
OALabs
Python sample codes for robotics algorithms.

PythonRobotics Python codes for robotics algorithm. Table of Contents What is this? Requirements Documentation How to use Localization Extended Kalman

Atsushi Sakai 17.2k Jan 01, 2023
Better control of your asyncio tasks

quattro: task control for asyncio quattro is an Apache 2 licensed library, written in Python, for task control in asyncio applications. quattro is inf

Tin Tvrtković 37 Dec 28, 2022
Path tracing obj - (taichi course final project) a path tracing renderer that can import and render obj files

Path tracing obj - (taichi course final project) a path tracing renderer that can import and render obj files

5 Sep 10, 2022
Algoritmos de busca:

Algoritmos-de-Buscas Algoritmos de busca: Abaixo está a interface da aplicação: Ao selecionar o tipo de busca e o caminho, então será realizado o cálc

Elielson Barbosa 5 Oct 04, 2021
Resilient Adaptive Parallel sImulator for griD (rapid)

Rapid is an open-source software library that implements a novel “parallel-in-time” (Parareal) algorithm and semi-analytical solutions for co-simulation of integrated transmission and distribution sy

Richard Lincoln 7 Sep 07, 2022
Python Client for Algorithmia Algorithms and Data API

Algorithmia Common Library (python) Python client library for accessing the Algorithmia API For API documentation, see the PythonDocs Algorithm Develo

Algorithmia 138 Oct 26, 2022
A minimal implementation of the IQRM interference flagging algorithm for radio pulsar and transient searches

A minimal implementation of the IQRM interference flagging algorithm for radio pulsar and transient searches. This module only provides the algorithm that infers a channel mask from some spectral sta

Vincent Morello 6 Nov 29, 2022
FLIght SCheduling OPTimization - a simple optimization library for flight scheduling and related problems in the discrete domain

Fliscopt FLIght SCheduling OPTimization 🛫 or fliscopt is a simple optimization library for flight scheduling and related problems in the discrete dom

33 Dec 17, 2022
Algorithms for calibrating power grid distribution system models

Distribution System Model Calibration Algorithms The code in this library was developed by Sandia National Laboratories under funding provided by the

Sandia National Laboratories 2 Oct 31, 2022
FingerPy is a algorithm to measure, analyse and monitor heart-beat using only a video of the user's finger on a mobile cellphone camera.

FingerPy is a algorithm using python, scipy and fft to measure, analyse and monitor heart-beat using only a video of the user's finger on a m

Thiago S. Brasil 37 Oct 21, 2022
Implementation of core NuPIC algorithms in C++

NuPIC Core This repository contains the C++ source code for the Numenta Platform for Intelligent Computing (NuPIC)

Numenta 270 Nov 19, 2022
A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

2 May 22, 2022
A lightweight, pure-Python mobile robot simulator designed for experiments in Artificial Intelligence (AI) and Machine Learning, especially for Jupyter Notebooks

aitk.robots A lightweight Python robot simulator for JupyterLab, Notebooks, and other Python environments. Goals A lightweight mobile robotics simulat

3 Oct 22, 2021
This project consists of a collaborative filtering algorithm to predict movie reviews ratings from a dataset of Netflix ratings.

Collaborative Filtering - Netflix movie reviews Description This project consists of a collaborative filtering algorithm to predict movie reviews rati

Shashank Kumar 1 Dec 21, 2021
Exact algorithm for computing two-sided statistical tolerance intervals under a normal distribution assumption using Python.

norm-tol-int Exact algorithm for computing two-sided statistical tolerance intervals under a normal distribution assumption using Python. Methods The

Jed Ludlow 1 Jan 06, 2022
FPE - Format Preserving Encryption with FF3 in Python

ff3 - Format Preserving Encryption in Python An implementation of the NIST approved FF3 and FF3-1 Format Preserving Encryption (FPE) algorithms in Pyt

Privacy Logistics 42 Dec 16, 2022
Python algorithm to determine the optimal elevation threshold of a GNSS receiver, by using a statistical test known as the Brown-Forsynthe test.

Levene and Brown-Forsynthe: Test for variances Application to Global Navigation Satellite Systems (GNSS) Python algorithm to determine the optimal ele

Nicolas Gachancipa 2 Aug 09, 2022
frePPLe - open source supply chain planning

frePPLe Open source supply chain planning FrePPLe is an easy-to-use and easy-to-implement open source advanced planning and scheduling tool for manufa

frePPLe 385 Jan 06, 2023
A custom prime algorithm, implementation, and performance code & review

Colander A custom prime algorithm, implementation, and performance code & review Pseudocode Algorithm 1. given a number of primes to find, the followi

Finn Lancaster 3 Dec 17, 2021
Parameterising Simulated Annealing for the Travelling Salesman Problem

Parameterising Simulated Annealing for the Travelling Salesman Problem Abstract The Travelling Salesman Problem is a well known NP-Hard problem. Given

Gary Sun 55 Jun 15, 2022