A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Overview

A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Python version Project version Codacy Grade


UsageDownload

Introduction

ByteCog is a python script that aims to help security researchers and others a like to classify malicious software compared to other samples, depending on what the unknown file(s) is/are being tested against. This script can be extended to use a machine learning model to classify malware if you wanted to do so. ByteCog uses multiple methods of analyzing and classifying samples given to it, such as using Shannon Entropy to give a visual aspect for the researchers to look at while analyzing the code and finding possible readable code/text in a sample. ByteCog also uses Hausdorff Distance to calculate a 'raw similarity' value based on the difference in the entropy graphs of both samples, and finally ByteCog uses Jaro-Winkler Distance to calculate the 'true similarity' since the Hausdorff Distance will in most cases return a very high value if the sample is mostly the same entropy wise, so the Jaro-Winkler Distance is used to 'adjust' the simliarity value for this case of a sample.

Requirements

  • A python installation above 3.5+, which you can download from the official python website here.

Installation

Clone this repository to your local machine by following these instructions layed out here

Then proceed to download the dependencies file by running the following line in your console window

pip install -r requirements.txt

Usage

======================================================
|      ____          __         ______               |
|     / __ ) __  __ / /_ ___   / ____/____   ____    |
|    / __  |/ / / // __// _ \ / /    / __ \ / __ \   |
|   / /_/ // /_/ // /_ /  __// /___ / /_/ // /_/ /   |
|  /_____/ \__, / \__/ \___/ \____/ \____/ \__, /    |
|         /____/                          /____/     |
|                                                    |
|                    Version: 0.4                    |
|               Author: IlluminatiFish               |
======================================================

usage: bytecog.py [-h] -k KNOWN -u UNKNOWN -i IDENTIFIER -v VISUAL

Determine whether an unknown provided sample is similar to a known sample

optional arguments:
  -h, --help            show this help message and exit
  -k KNOWN, --known KNOWN
                        The file path to the known sample
  -u UNKNOWN, --unknown UNKNOWN
                        The file path to the unknown sample
  -i IDENTIFIER, --identifier IDENTIFIER
                        The antivirus identifier of the known file
  -v VISUAL, --visual VISUAL
                        If you want to show a visual representation of the file entropy

Features & Use Cases

  • Calculates sample similarity
  • Generates chunked entropy graph
  • Able to possibly detect malicious and benign software samples

Screenshots

Chunked Entropy Graph
chunk_entropy_graph

Output of ByteCog
bytecog_output

ByteCog Log File
bytecog log file

License

ByteCog - A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance Copyright (c) 2021 IlluminatiFish

This program is free software; you can redistribute it and/or modify the code base under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without ANY warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/

Acknowledgements

  • Using a modified version of @venkat-abhi's Shannon Entropy calculator to work with my project script, you can find the original one here.

  • Using the fastest method to get maximum key from a dictionary using this snippet here.

References

Entropy Wiki
Jaro-Winkler Distance Wiki
Hausdorff Distance Wiki
Shannon Calculator
Referenced Article #1
Referenced Paper #1
Referenced Paper #2
Referenced Paper #3

Owner
I am a developer that has a passion for programming, mathematics and cyber security. Currently Developer @South-Hollow
Dependency injection in python with autoconfiguration

The base is a DynamicContainer to autoconfigure services using the decorators @services for regular services and @command_handler for using command pattern.

Sergio Gómez 2 Jan 17, 2022
Malware arcane - Scripts and notes on my malware analysis journey

Malware Arcane Repository of notes and scripts I use when doing malware analysis

This exploit allows to connect to the remote RemoteMouse 3.008 service to virtually press arbitrary keys and execute code on the machine.

RemoteMouse-3.008-Exploit The RemoteMouse application is a program for remotely controlling a computer from a phone or tablet. This exploit allows to

Podalirius 25 Dec 04, 2022
GitLab CE/EE Preauth RCE using ExifTool

CVE-2021-22205 GitLab CE/EE Preauth RCE using ExifTool This project is for learning only, if someone's rights have been violated, please contact me to

3ND 164 Dec 10, 2022
Infoga is a tool gathering email accounts informations (ip,hostname,country,...) from different public source

Infoga - Email OSINT Infoga is a tool gathering email accounts informations (ip,hostname,country,...) from different public source (search engines, pg

m4ll0k (mallok) 1.8k Jan 04, 2023
Exploit grafana Pre-Auth LFI

Grafana-LFI-8.x Exploit grafana Pre-Auth LFI How to use python3

2 Jul 25, 2022
🐝 ℹ️ Honeybee extension for export to IES-VE gem file format

honeybee-ies Honeybee extension for export a HBJSON file to IES-VE GEM file format Installation pip install honeybee-ies QuickStart import pathlib fro

Ladybug Tools 4 Jul 12, 2022
IDA2Obj is a tool to implement SBI (Static Binary Instrumentation).

IDA2Obj IDA2Obj is a tool to implement SBI (Static Binary Instrumentation). The working flow is simple: Dump object files (COFF) directly from one exe

Mickey 94 Dec 13, 2022
Cookiecutter for creating open source Python packages

Cookiecutter for rapidly developing new open source Python packages. Best practices with all the modern bells and whistles included.

Wolt 177 Dec 22, 2022
This script allows you to make a onion host instantly.

Installation It only works in Debian based Linux distros. Clone the repo: git clone https://github.com/0xStevenson/Auto-Tor-Host.git Go to the direct

Steven 4 Feb 22, 2022
Implementation of RITA (Real Intelligence Threat Analytics) in Jupyter Notebook with improved scoring algorithm.

RITA (Real Intelligence Threat Analytics) in Jupyter Notebook RITA is an open source framework for network traffic analysis sponsored by Active Counte

Mehmet E. 157 Nov 24, 2022
Bypass's HCaptcha by overloading their api causing it to throwback a generated uuid. (Released due to exposure)

HCaptcha-Bypass Bypass's HCaptcha by overloading their api causing it to throwback a generated uuid. Not working? If it is not seeming to work for you

Dropout 17 Aug 23, 2021
Providing DevOps and security teams script to identify cloud workloads that may be vulnerable to the Log4j vulnerability(CVE-2021-44228) in their AWS account.

We are providing DevOps and security teams script to identify cloud workloads that may be vulnerable to the Log4j vulnerability(CVE-2021-44228) in their AWS account. The script enables security teams

Mitiga 13 Jan 04, 2022
Hikvision 流媒体管理服务器敏感信息泄漏

Hikvisioninformation Hikvision 流媒体管理服务器敏感信息泄漏 Options optional arguments: -h, --help show this help message and exit -u url, --url url

Henry4E36 13 Nov 09, 2022
A simple python code for hacking profile views

This code for hacking profile views. Not recommended to adding profile views in profile. This code is not illegal code. This code is for beginners.

Fayas Noushad 3 Nov 28, 2021
Detection tool of malware(s) by checksum (useful for forensic)

🐍 malware_checker.py Detection tool of malware(s) by checksum (useful for forensic) 📦 Dependencies installation $ pip3 install -r requirements.txt

Fayred 1 Jan 30, 2022
Workshop Material on VM-based Deobfuscation

Analysis of Virtualization-based Obfuscation This repository contains slides, samples and code of the 4h code deobfuscation workshop at r2con2021. We

Tim Blazytko 133 Dec 18, 2022
The disassembler parses evm bytecode from the command line or from a file.

EVM Bytecode Disassembler The disassembler parses evm bytecode from the command line or from a file. It does not matter whether the bytecode is prefix

alpharush 22 Dec 27, 2022
Discord exploit allowing you to be unbannable.

Discord-Ban-Immunity Discord exploit allowing you to be unbannable. 9/3/2021 Found in late August. Found by Passive and Me. Explanation If a user gets

orlando 9 Nov 23, 2022
Kunyu, more efficient corporate asset collection

Kunyu(坤舆) - More efficient corporate asset collection English | 中文文档 0x00 Introduce Tool introduction Kunyu (kunyu), whose name is taken from , is act

Knownsec, Inc. 772 Jan 05, 2023