A tool for inferring ancestral alleles, calculating the site frequency spectrum (SFS), and drawing the SFS as a bar plot.

Last update: Dec 16, 2022

superSFS

superSFS infers ancestral alleles, calculates the site frequency spectrum (SFS), and draws the SFS as a bar plot. It is easy to use and runs fast. You need to prepare three inputs: a phased VCF file containing the populations you are interested in together with the outgroup, a file listing the outgroup names, and an annotation file. Enjoy it!

It has four modes:

0: Run the full pipeline, from the original VCF data to the SFS bar plot
1: Only infer the ancestral allele and output a new VCF file that uses the inferred ancestral allele as the reference
2: Only count the frequency of the derived allele at each SNP in each population
3: Only draw the SFS bar plot from the data generated by the SFS calculation
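The counting step in mode 2 and the spectrum drawn in mode 3 can be illustrated with a minimal sketch. This is not the superSFS source code: the function names, the sample names, and the in-memory genotype lists are hypothetical, and the sketch assumes biallelic, phased genotypes written as `0|1`, where `1` already denotes the derived allele.

```python
from collections import defaultdict

def derived_counts(genotypes, samples, sample_to_group):
    """Count derived ('1') alleles per group at a single SNP."""
    counts = defaultdict(int)
    for sample, gt in zip(samples, genotypes):
        group = sample_to_group.get(sample)
        if group is None:  # sample not listed in the annotation: skip it
            continue
        counts[group] += gt.split("|").count("1")
    return dict(counts)

def sfs_from_counts(per_snp_counts, n_haplotypes):
    """Unfolded SFS: entry i is the number of SNPs carrying i derived copies."""
    sfs = [0] * (n_haplotypes + 1)
    for c in per_snp_counts:
        sfs[c] += 1
    return sfs

# Hypothetical toy data: three samples, three SNPs.
samples = ["s1", "s2", "s3"]
sample_to_group = {"s1": "popA", "s2": "popA", "s3": "popB"}
snps = [
    ["0|1", "0|0", "1|1"],
    ["1|1", "0|1", "0|0"],
    ["0|0", "1|0", "0|1"],
]

per_snp = [derived_counts(g, samples, sample_to_group) for g in snps]
popA_counts = [c.get("popA", 0) for c in per_snp]
# popA has two diploid samples, so four haplotypes.
popA_sfs = sfs_from_counts(popA_counts, n_haplotypes=4)
print(popA_counts)  # [1, 3, 1]
print(popA_sfs)     # [0, 2, 0, 1, 0]
```

The bar plot in mode 3 is then a plot of these bin counts against the number of derived copies.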

Example:

Mode 0: python superSFS 0 ogdir threshold vcfdir annodir modir coutdir plotdir group
Mode 1: python superSFS 1 ogdir threshold vcfdir outdir
Mode 2: python superSFS 2 annodir modir coutdir
Mode 3: python superSFS 3 coutdir plotdir group

Explanation of each parameter:

ogdir: path to the outgroup names file
threshold: a number; if the total count of the variant allele in the outgroup is greater than this value, the variant allele is treated as the ancestral allele
vcfdir: path to the VCF data
annodir: path to the annotation file, with sample names in the first column and group names in the second column. This file should have a header in the first row
modir: output path for the generated VCF file that uses the inferred ancestral allele as the reference
coutdir: output path for the derived allele counts of each SNP in each group
plotdir: output path for the SFS bar plot
group: the group that you want to analyze
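The threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the function names are hypothetical, sites are assumed biallelic, and genotypes are assumed to be GT strings such as `0|1`. The comparison is strict ("greater than"), as described for the threshold parameter.

```python
def count_alt_alleles(genotypes):
    """Total number of variant ('1') alleles across a list of GT strings."""
    return sum(gt.replace("/", "|").split("|").count("1") for gt in genotypes)

def polarize(outgroup_gts, ingroup_gts, threshold):
    """Return (alt_is_ancestral, ingroup genotypes recoded so 0 = ancestral)."""
    alt_is_ancestral = count_alt_alleles(outgroup_gts) > threshold
    if not alt_is_ancestral:
        return False, list(ingroup_gts)
    # The variant allele is ancestral: swap 0 and 1 in every genotype.
    swap = str.maketrans("01", "10")
    return True, [gt.translate(swap) for gt in ingroup_gts]

# Hypothetical example: two outgroup samples carry three variant alleles in total.
flipped, recoded = polarize(["1|1", "0|1"], ["0|0", "0|1", "1|1"], threshold=2)
print(flipped)  # True
print(recoded)  # ['1|1', '1|0', '0|0']
```

In a real VCF the REF and ALT columns would also need to be exchanged when a site is flipped; the sketch only recodes the genotypes.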
