HyDiff: Hybrid Differential Software Analysis


This repository provides the tool and the evaluation subjects for the paper "HyDiff: Hybrid Differential Software Analysis", accepted for the technical track at ICSE 2020. A pre-print of the paper is available here.

Authors: Yannic Noller, Corina S. Pasareanu, Marcel Böhme, Youcheng Sun, Hoang Lam Nguyen, and Lars Grunske.

The repository includes:

A pre-built version of HyDiff is also available as a Docker image:

docker pull yannicnoller/hydiff
docker run -it --rm yannicnoller/hydiff

Tool

HyDiff's technical framework is built on top of Badger, DifFuzz, and the Symbolic PathFinder. We provide a complete snapshot of all tools and our extensions.

Requirements

  • Git, Ant, Build-Essentials, Gradle
  • Java JDK = 1.8
  • Python3, Numpy Package
  • recommended: Ubuntu 18.04.1 LTS
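
Before running the build, it can help to check that the listed requirements are present. The following is a minimal sketch, not part of the repository: the function name is hypothetical, and the command names (for example make and gcc standing in for Build-Essentials) are assumptions about how each requirement shows up on the PATH. It does not verify that the installed JDK is version 1.8.

```python
import importlib.util
import shutil

# Assumed command names for the listed requirements; adjust for your system.
REQUIRED_COMMANDS = ["git", "ant", "gradle", "java", "javac", "python3", "make", "gcc"]


def check_requirements(commands=REQUIRED_COMMANDS, modules=("numpy",)):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {cmd: shutil.which(cmd) is not None for cmd in commands}
    for mod in modules:
        # find_spec returns None when a top-level module is not installed.
        status["python module " + mod] = importlib.util.find_spec(mod) is not None
    return status


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{'ok     ' if found else 'MISSING'}  {name}")
```

Any entry reported as MISSING should be installed before building.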

Folder Structure

The folder tool contains two subfolders, fuzzing and symbolicexecution, which represent the two components of HyDiff.

fuzzing

  • afl-differential: The fuzzing component is built on top of DifFuzz and KelinciWCA (the fuzzing part of Badger). Both use AFL as the underlying fuzzing engine. In order to make it easy for the users, we provide our complete modified AFL variant in this folder. Our modifications are based on afl-2.52b.

  • kelinci-differential: Kelinci leverages a server-client architecture to make AFL applicable to Java applications; please refer to the Kelinci poster paper for more details. We modified it to make it usable for general differential analysis. It includes an interface program that connects the Kelinci server to the AFL fuzzer, and the instrumentor project, which instruments the Java bytecode. The instrumentation handles the coverage reporting and the collection of our differential metrics. The Kelinci server handles requests from AFL to execute a mutated input on the application.

symbolicexecution

  • jpf-core: Our symbolic execution is built on top of Symbolic PathFinder (SPF), an extension of Java PathFinder (JPF), so the core implementation of JPF must be included.

  • jpf-symbc-differential: In order to make SPF applicable to differential analysis, we modified it in several locations and added the ability to perform a form of shadow symbolic execution (cf. Complete Shadow Symbolic Execution with Java PathFinder). This folder includes the modified SPF project.

  • badger-differential: HyDiff performs a hybrid analysis by running fuzzing and symbolic execution in parallel. This concept is based on Badger, which provides the technical basis for our implementation. This folder includes the modified Badger project, which enables the differential hybrid analysis, including the differential dynamic symbolic execution.

How to install the tool and run our evaluation

Be aware that the instructions have been tested for Unix systems only.

  1. First, build the tool and the subjects. We provide a script setup.sh that builds everything. Note: the script may overwrite an existing site.properties file, which is required for JPF/SPF.

  2. Test the installation: the best way to test the installation is to execute the evaluation of our example program (cf. Listing 1 in our paper). You can execute the script run_example.sh. As it is, it will run each analysis (just differential fuzzing, just differential symbolic execution, and the hybrid analysis) once. The values presented in our paper in Section 2.2 are averaged over 30 runs. In order to perform 30 runs each, you can easily adapt the script, but for some first test runs you can leave it as it is. The script should produce three folders:

    • experiments/subjects/example/fuzzer-out-1: results for differential fuzzing
    • experiments/subjects/example/symexe-out-1: results for differential symbolic execution
    • experiments/subjects/example/hydiff-out-1: results for HyDiff (hybrid combination)

    It will also produce three CSV files with the summarized statistics for each experiment:

    • experiments/subjects/example/fuzzer-out-results-n=1-t=600-s=30.csv
    • experiments/subjects/example/symexe-out-results-n=1-t=600-s=30.csv
    • experiments/subjects/example/hydiff-out-results-n=1-t=600-s=30-d=0.csv

  3. After finishing the building process and testing the installation, you can use the provided run scripts (experiments/scripts) to replay HyDiff's evaluation or to perform your own differential analysis. HyDiff's evaluation covers three types of differential analysis (regression, side-channel, and DNN), each with its own run script.

In the beginning of each run script you can define the experiment parameters:

  • number_of_runs: N, the number of evaluation runs for each subject (30 for all experiments)
  • time_bound: T, the time bound for the analysis (regression: 600 s, side-channel: 1800 s, DNN: 3600 s)
  • step_size_eval: S, the step size for the evaluation (30 s for all experiments)
  • time_symexe_first: D, the delay with which fuzzing is started after symbolic execution (DNN subjects only)

Each run script first executes differential fuzzing, then differential symbolic execution and then the hybrid analysis. Please adapt our scripts to perform your own analysis.

For each subject, analysis_type, and experiment repetition i, the scripts will produce folders like experiments/subjects/<subject>/<analysis_type>-out-<i> and will summarize the experiments in CSV files like experiments/subjects/<subject>/<analysis_type>-out-results-n=<N>-t=<T>-s=<S>-d=<D>.csv (in the example run above, only the hydiff file carries the -d part).
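
The naming scheme can be expressed as two small helpers, which may be handy when collecting results in your own scripts. This is only a sketch: the function names are hypothetical and not part of the repository, and the pattern is inferred from the example paths listed for run_example.sh.

```python
from pathlib import PurePosixPath

SUBJECTS_DIR = PurePosixPath("experiments/subjects")


def run_folder(subject, analysis_type, i):
    """Output folder of repetition i, e.g. .../example/fuzzer-out-1."""
    return SUBJECTS_DIR / subject / f"{analysis_type}-out-{i}"


def results_csv(subject, analysis_type, n, t, s, d=None):
    """Summary CSV for n runs, time bound t and step size s (both in seconds).

    The delay d is appended to the file name only when it is given.
    """
    name = f"{analysis_type}-out-results-n={n}-t={t}-s={s}"
    if d is not None:
        name += f"-d={d}"
    return SUBJECTS_DIR / subject / f"{name}.csv"


print(run_folder("example", "fuzzer", 1))
print(results_csv("example", "hydiff", n=1, t=600, s=30, d=0))
```

With the example parameters, the two calls reproduce the fuzzer-out-1 folder and the hydiff CSV file name listed for the example program.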

Complete Evaluation Reproduction

In order to reproduce our evaluation completely, you need to run the three mentioned run scripts. They include the generation of all statistics. Be aware that the mere runtime of all analysis parts is more than 53 days because of the high runtimes and the number of repetitions. It might therefore be worthwhile to run it only for specific subjects, to run the analysis on different machines in parallel, to shorten the time bound, or to reduce the number of repetitions. Feel free to adjust the scripts or reuse them for your own purposes.
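
To plan a partial reproduction, the sequential runtime can be estimated from the experiment parameters. The sketch below is an illustration under simplifying assumptions: it counts only the time bound of each run, ignores build and statistics overhead, and takes the number of subjects per analysis type as an input that you supply (the function name and the subject counts are hypothetical).

```python
# Time bounds in seconds per analysis type, as listed for the run scripts.
TIME_BOUND = {"regression": 600, "side-channel": 1800, "dnn": 3600}
TECHNIQUES = 3  # differential fuzzing, differential symbolic execution, hybrid


def runtime_days(subjects_per_type, runs=30):
    """Sequential analysis time in days for the given subject counts."""
    seconds = sum(
        count * runs * TECHNIQUES * TIME_BOUND[kind]
        for kind, count in subjects_per_type.items()
    )
    return seconds / 86400


# One regression subject: 30 runs x 3 techniques x 600 s = 54000 s = 15 h.
print(runtime_days({"regression": 1}))  # 0.625
```

Under these assumptions a single DNN subject already takes 3.75 days with 30 repetitions, which is why reducing the number of runs or distributing subjects across machines has a large effect.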

Statistics

As mentioned earlier, the statistics will be automatically generated by our run scripts, which execute the Python scripts from the scripts folder to aggregate the individual experiment runs. They generate CSV files with the average result values.
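
The general idea of such an aggregation can be sketched as follows. This is not the repository's implementation: the input format (a list of (seconds, value) observations per run), the choice of metric (the highest value seen so far, starting from 0), and all function names are assumptions made for illustration. It samples each run at multiples of the step size S up to the time bound T and averages across runs.

```python
import csv
import io
from statistics import mean


def best_so_far(samples, time_bound, step):
    """Highest value seen up to each step; samples are (seconds, value) pairs."""
    ordered = sorted(samples)
    result, best, idx = [], 0, 0
    for t in range(step, time_bound + 1, step):
        # Consume all observations made up to and including time t.
        while idx < len(ordered) and ordered[idx][0] <= t:
            best = max(best, ordered[idx][1])
            idx += 1
        result.append(best)
    return result


def average_runs(runs, time_bound, step):
    """Average the per-step values over all runs; returns (seconds, mean) rows."""
    per_run = [best_so_far(r, time_bound, step) for r in runs]
    times = range(step, time_bound + 1, step)
    return [(t, mean(values)) for t, values in zip(times, zip(*per_run))]


def to_csv(rows):
    """Render the averaged rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["seconds", "average"])
    writer.writerows(rows)
    return buffer.getvalue()


# Two hypothetical runs with a 90 s time bound and a 30 s step size.
runs = [[(10, 2), (50, 5)], [(40, 4)]]
print(to_csv(average_runs(runs, time_bound=90, step=30)))
```

For the two hypothetical runs, the averages at 30 s, 60 s, and 90 s are 1, 4.5, and 4.5.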

For the regression analysis and the DNN analysis we use the scripts:

For the side-channel analysis we use the scripts:

All csv files for our experiments are included in experiments/results.

Feel free to adapt these evaluation scripts for your own purpose.

Maintainers

  • Yannic Noller (yannic.noller at acm.org)

License

This project is licensed under the MIT License; see the LICENSE file for details.

Releases

  • v1.0.0 (Jan 26, 2020)

    First official release for HyDiff. We added all parts of our tool and all evaluation subjects to support the reproduction of our results. This release is submitted to the ICSE 2020 Artifact Evaluation.
