(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework

Last update: Jan 05, 2023

Overview

Background: Outlier detection (OD) is a key data mining task for identifying abnormal objects among general samples, with numerous high-stakes applications including fraud detection and intrusion detection.

To scale outlier detection (OD) to large-scale, high-dimensional datasets, we propose TOD, a novel system that abstracts OD algorithms into basic tensor operations for efficient GPU acceleration.

A corresponding paper is available. The code is being cleaned up and released. Please watch and star!

One reason to use it:

On average, TOD is 11 times faster than PyOD!

If you need another reason: it can handle much larger datasets, running OD on more than a million samples within an hour!

TOD features:

Unified APIs, detailed documentation, and examples for ease of use (under construction)
Support for more than 10 different OD algorithms, with more being added
Multi-GPU acceleration
Advanced techniques such as provable quantization

Programming Model Interface

Complex OD algorithms can be abstracted into common tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction.png

For instance, ABOD and COPOD can be assembled by the basic tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction_example.png
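As a rough illustration of the idea (not TOD's actual code), a k-nearest-neighbor detector can be written as two array operations: a pairwise distance computation followed by a per-row selection of the k smallest values. The sketch below uses NumPy arrays as a stand-in for GPU tensors; the function name knn_outlier_scores and the toy data are assumptions made for this example. In PyTorch, the same two steps would map to torch.cdist and torch.topk.

```python
import numpy as np


def knn_outlier_scores(X, k):
    """Score each row by the distance to its k-th nearest neighbor."""
    # Operator 1: pairwise Euclidean distances, shape (n, n),
    # using ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    dist = np.sqrt(np.maximum(d2, 0.0))  # clamp tiny negatives from rounding

    # A sample is not its own neighbor, so mask the diagonal.
    np.fill_diagonal(dist, np.inf)

    # Operator 2: select the k smallest distances per row and keep the k-th.
    return np.partition(dist, k - 1, axis=1)[:, k - 1]


# Toy data (hypothetical): four points on a line plus one far-away point.
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [75, 80]], dtype=np.float32)
scores = knn_outlier_scores(X, k=2)
print(scores)           # one score per sample; larger means more outlying
print(scores.argmax())  # index of the most outlying sample
```

Both steps are dense, data-parallel operations, which is the kind of work that benefits from running on a GPU.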

End-to-end Performance Comparison with PyOD

Overall, TOD takes far less run time than PyOD: on average, it is 11 times faster.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/run_time.png

Code is being released. Watch and star for the latest news!

Comments

Error while installing package
I installed PyTorch 1.10 from the official site, and it shows up in my virtual environment. When I run pip install pytod, however, pip cannot find it because the requirement names the "pytorch" package, not the "torch" package.

ERROR: Could not find a version that satisfies the requirement pytorch>=1.7 (from pytod) (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch>=1.7
opened by nuriakiin 1
decision_function() returns None

Thanks for the package. When I call decision_function() on test data with LOF (or KNN), it returns an empty object. Is there a fix for this? The following code reproduces the issue (on GPU):

from pytod.models.lof import LOF
import torch
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [75, 80]], dtype=np.float32)
x = torch.from_numpy(x)

y = np.array([[6, 5], [1, 2], [3, 4], [5, 1], [11, 12]], dtype=np.float32)
y = torch.from_numpy(y)

lof = LOF(n_neighbors=2, device='cuda:0')

lof.fit(x)

print(lof.decision_function(y))

opened by sugatc 0
Support for novelty detection and changing distance metric with local outlier factor

The current implementation of LOF doesn't allow changing the distance metric (to 'cosine', for example) or setting novelty=True, which prevents it from being used for novelty detection. It would be great if support could be added for these.

opened by sugatc 2
can't fit model in colab

When I try to fit any model on a Colab GPU instance, I get the following error. My dataset has 2 columns and 1 million rows:

AttributeError Traceback (most recent call last)
in ()
      4 clf_name = 'KNN'
      5 clf = LOF()
----> 6 clf.fit(X)

3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5485 ):
   5486 return self[name]
-> 5487 return object.__getattribute__(self, name)
   5488
   5489 def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'to'

opened by yairVanti 0
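The traceback above suggests that fit() calls .to(...) on its input, a method that torch tensors have and pandas DataFrames do not. A possible workaround, assuming the models expect a torch.Tensor as in the LOF report earlier in this section, is to convert the DataFrame before fitting. The column names and values below are hypothetical, and the fit call is left commented out because it has not been verified here.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-in for the reporter's two-column dataset.
df = pd.DataFrame({"a": [1.0, 3.0, 5.0, 75.0], "b": [2.0, 4.0, 6.0, 80.0]})

# DataFrame -> float32 NumPy array -> torch tensor.
X = torch.from_numpy(df.to_numpy(dtype=np.float32))

print(type(X), X.shape, X.dtype)

# The tensor now has the .to() method the traceback was looking for,
# so it can be passed to a model (unverified assumption):
# clf.fit(X)
```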
clean up reproducibility scripts

We are cleaning up these scripts so that they are easy to run. In the meantime, the primary results are reproducible with compare_real_data.py (https://github.com/yzhao062/pytod/tree/main/reproducibility)
enhancement

opened by yzhao062 0

Releases(v0.0.2)

v0.0.2(Jun 19, 2022)

v<0.0.1>, <04/12/2021> -- Add LOF.
v<0.0.1>, <04/23/2021> -- Add ABOD.
v<0.0.2>, <06/19/2021> -- Add PCA and HBOS.
v<0.0.2>, <06/19/2021> -- Turn on test suites.

Now we have updated both the paper and the repo to cover more algorithms.

Owner

Yue Zhao

Ph.D. Student @ CMU. Outlier Detection Systems | ML Systems (MLSys) | Anomaly/Outlier Detection | AutoML. Twitter: @yzhao062

GitHub Repository https://www.andrew.cmu.edu/user/yuezhao2/papers/21-preprint-tod.pdf

