DALLE-tools provided useful dataset utilities to improve you workflow with WebDatasets.

Overview

DALLE tools

DALLE-tools is a github repository with useful tools to categorize, annotate or check the sanity of your datasets.

Installation

Just clone this repository to your folder and use one of the following commands in the section underneath.

WebDataset Annotator

python annotator.py

Press to switch to the next page, to change the annotation category or click on the image to add it to the current cateogry and save it in annotations.json. Please upload your annotations.json by creating a push request into community_annotations folder into the folder of the dataset you used (e.g. YFCC100m, or LAION400m etc.), so everyone can use the data for better dataset annotations! If you want to continue to annotate a dataset where someone else already started, just copy the annotations.json from the community_annotations folder and the used dataset into the root directory and run the annotator!

Screenshot

WebDataset aligner

python aligner.py

This tool helps to align the shuffled keys, so the WebDataset module can read your datasets correctly. You just need to specify the keys you want to look for and keep in your new dataset.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT

Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

Spotify 3.2k Jan 04, 2023
Draw interactive NetworkX graphs with Altair

nx_altair Draw NetworkX graphs with Altair nx_altair offers a similar draw API to NetworkX but returns Altair Charts instead. If you'd like to contrib

Zachary Sailer 206 Dec 12, 2022
Moscow DEG 2021 elections plots

Построение графиков на основе публичных данных о ДЭГ в Москве в 2021г. Описание Скрипты в данном репозитории позволяют собственноручно построить графи

9 Jul 15, 2022
A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

Visdom A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Python. Overview Concepts Setup Usage API To

FOSSASIA 9.4k Jan 07, 2023
A customized interface for single cell track visualisation based on pcnaDeep and napari.

pcnaDeep-napari A customized interface for single cell track visualisation based on pcnaDeep and napari. 👀 Under construction You can get test image

ChanLab 2 Nov 07, 2021
ecoglib: visualization and statistics for high density microecog signals

ecoglib: visualization and statistics for high density microecog signals This library contains high-level analysis tools for "topos" and "chronos" asp

1 Nov 17, 2021
A Python toolbox for gaining geometric insights into high-dimensional data

"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview

Contextual Dynamics Laboratory 1.8k Dec 29, 2022
在原神中使用围栏绘图

yuanshen_draw 在原神中使用围栏绘图 文件说明 toLines.py 将一张图片转换为对应的线条集合,视频可以按帧转换。 draw.py 在原神家园里绘制一张线条图。 draw_video.py 在原神家园里绘制视频(自动按帧摆放,截图(win)并回收) cat_to_video.py

14 Oct 08, 2022
Python Data Structures for Humans™.

Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar

Schematics 2.5k Dec 28, 2022
Chem: collection of mostly python code for molecular visualization, QM/MM, FEP, etc

chem: collection of mostly python code for molecular visualization, QM/MM, FEP,

5 Sep 02, 2022
A simple interpreted language for creating basic mathematical graphs.

graphr Introduction graphr is a small language written to create basic mathematical graphs. It is an interpreted language written in python and essent

2 Dec 26, 2021
Some examples with MatPlotLib library in Python

MatPlotLib Example Some examples with MatPlotLib library in Python Point: Run files only in project's directory About me Full name: Matin Ardestani Ag

Matin Ardestani 4 Mar 29, 2022
Plotting data from the landroid and a raspberry pi zero to a influx-db

landroid-pi-influx Plotting data from the landroid and a raspberry pi zero to a influx-db Dependancies Hardware: Landroid WR130E Raspberry Pi Zero Wif

2 Oct 22, 2021
python partial dependence plot toolbox

PDPbox python partial dependence plot toolbox Motivation This repository is inspired by ICEbox. The goal is to visualize the impact of certain feature

Li Jiangchun 723 Jan 07, 2023
Bokeh Plotting Backend for Pandas and GeoPandas

Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of

Patrik Hlobil 822 Jan 07, 2023
Fast scatter density plots for Matplotlib

About Plotting millions of points can be slow. Real slow... 😴 So why not use density maps? ⚡ The mpl-scatter-density mini-package provides functional

Thomas Robitaille 473 Dec 12, 2022
2021 grafana arbitrary file read

2021_grafana_arbitrary_file_read base on pocsuite3 try 40 default plugins of grafana alertlist annolist barchart cloudwatch dashlist elasticsearch gra

ATpiu 5 Nov 09, 2022
script to generate HeN ipfs app exports of GLSL shaders

HeNerator A simple script to generate HeN ipfs app exports from any frag shader created with: GlslViewer GlslEditor The Book of Shaders glslCanvas VS

Patricio Gonzalez Vivo 22 Dec 21, 2022
Python Data. Leaflet.js Maps.

folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

6k Jan 02, 2023
Tools for calculating and visualizing Elo-like ratings of MLB teams using Retosheet data

Overview This project uses historical baseball games data to calculate an Elo-like rating for MLB teams based on regular season match ups. The Elo rat

Lukas Owens 0 Aug 25, 2021