A variant caller for the GBA gene using WGS data

Gauchian: WGS-based GBA variant caller

Gauchian is a targeted variant caller for the GBA gene that takes a whole-genome sequencing (WGS) BAM file as input. It uses a novel method to resolve the problems caused by the high sequence similarity between GBA and its pseudogene paralog GBAP1, and it detects variants accurately in the exon 9-11 homology region. These include large deletions or duplications between GBA and GBAP1, and GBAP1-like variants in GBA, including p.A495P, p.L483P, p.D448H, c.1263del, RecNciI, RecTL and c.1263del+RecTL. In addition to these challenging variants, Gauchian also calls known pathogenic or likely pathogenic GBA variants classified in ClinVar. Please refer to our preprint for more details about the method.

Running the program

This Python3 program can be run as follows:

python -m gauchian --manifest MANIFEST_FILE \
                   --genome [19/37/38] \
                   --prefix OUTPUT_FILE_PREFIX \
                   --outDir OUTPUT_DIRECTORY \
                   --threads NUMBER_THREADS

The manifest is a text file in which each line lists the absolute path to one input BAM/CRAM file. For CRAM input, we suggest also providing the path to the reference FASTA file with --reference in the command.
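
The manifest format described above (one absolute path per line) is simple to generate with a short script. The following is a minimal sketch, not part of Gauchian itself: the function name `write_manifest` and the choice to select files by the `.bam` and `.cram` extensions are assumptions made for illustration.

```python
from pathlib import Path


def write_manifest(input_dir, manifest_path, suffixes=(".bam", ".cram")):
    """Write one absolute path per line for each BAM/CRAM file in input_dir.

    Hypothetical helper: files are selected purely by file extension,
    so index files such as .bai or .crai are skipped.
    """
    paths = sorted(
        p.resolve()
        for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    # One absolute path per line, as the manifest description requires.
    Path(manifest_path).write_text("".join(f"{p}\n" for p in paths))
    return [str(p) for p in paths]


# Example call (placeholder paths):
# write_manifest("/data/wgs_bams", "/data/manifest.txt")
```

The resulting file can then be passed to the program with --manifest.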

Interpreting the output

The program produces a .tsv file in the directory specified by --outDir. The fields are explained below:

| Fields in tsv | Explanation |
| --- | --- |
| Sample | Sample name |
| is_biallelic_GBAP1-like_variant_exon9-11 | Whether the sample is called as biallelic for GBAP1-like variants in exon9-11 |
| is_carrier_GBAP1-like_variant_exon9-11 | Whether the sample is called as a carrier for GBAP1-like variants in exon9-11 |
| total_CN | Total copy number of GBA+GBAP1 |
| deletion_breakpoint_in_GBA_gene | Whether the deletion breakpoint is in GBA gene if a deletion exists |
| GBAP1-like_variant_exon9-11 | GBAP1-like variants called in exon9-11, two alleles separated by / |
| other_variants | Other variants called (non-GBAP1-like variants or variants outside of exon9-11) |
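
As an illustration of how the .tsv fields might be consumed downstream, here is a minimal sketch that lists samples flagged as carrier or biallelic. It uses the column names documented above, but it assumes the two "Whether..." columns are written as the text `True`/`False`; check this against your own output file. The function name `flagged_samples` is hypothetical.

```python
import csv

CARRIER = "is_carrier_GBAP1-like_variant_exon9-11"
BIALLELIC = "is_biallelic_GBAP1-like_variant_exon9-11"


def flagged_samples(tsv_path):
    """Return (sample, status) pairs for samples flagged as carrier or biallelic."""
    flagged = []
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Assumption: boolean columns hold the text "True" or "False".
            # Any other value (for example a no-call) is treated as not flagged.
            if row[BIALLELIC].strip().lower() == "true":
                flagged.append((row["Sample"], "biallelic"))
            elif row[CARRIER].strip().lower() == "true":
                flagged.append((row["Sample"], "carrier"))
    return flagged
```

Biallelic status is checked first so that each sample is reported only once.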

A .json file is also produced that contains more information about each sample.

| Fields in json | Explanation |
| --- | --- |
| Coverage_MAD | Median absolute deviation of depth, measure of sample quality |
| Median_depth | Sample median depth |
| deletion_CN | CN of the unique region between GBA and GBAP1. This value plus 2 is the total CN |
| deletion_CN_raw | Raw normalized depth of the unique region between GBA and GBAP1 |
| variant_raw_count | Supporting reads for each variant |
| snp_call | GBA copy number call at GBA/GBAP1 differentiating sites |
| snp_raw | Raw GBA copy number at GBA/GBAP1 differentiating sites |
| haplotypes | Summary of haplotypes assembled across GBA/GBAP1 differentiating sites in Exon9-11 |
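
The relationship stated for deletion_CN (this value plus 2 is the total CN) can be checked with a few lines of Python. This is a sketch under a loudly stated assumption: the layout of the .json file is not described here, so the code assumes a top-level object keyed by sample name, with each value holding the fields listed above. Adjust the traversal if your file is structured differently. The function name `total_cn_by_sample` is hypothetical.

```python
import json


def total_cn_by_sample(json_path):
    """Derive the total GBA+GBAP1 copy number for each sample from deletion_CN."""
    with open(json_path) as handle:
        # Assumption: {"<sample name>": {"deletion_CN": ..., ...}, ...}
        results = json.load(handle)

    totals = {}
    for sample, fields in results.items():
        deletion_cn = fields.get("deletion_CN")
        # A missing or null deletion_CN is propagated as None rather than guessed.
        totals[sample] = None if deletion_cn is None else deletion_cn + 2
    return totals
```

Comparing these derived values with the total_CN column of the .tsv file is one way to confirm that the two outputs are being read consistently.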

Comments
  • UserWarning: multiple_iterators not implemented for CRAM

    When running with a .cram file, I got the following warnings:

    /gauchian/depth_calling/snp_count.py:131: UserWarning: multiple_iterators not implemented for CRAM
      ignore_orphan=False
    /gauchian/depth_calling/haplotype.py:189: UserWarning: multiple_iterators not implemented for CRAM
      min_base_quality=13

    Will these warnings affect the quality of calls?

    opened by LNGDingj

Releases (v1.0.2)
Owner: Illumina (Illumina Open Source Software)