mikoar/GenomeAssembler

GenomeAssembler

Summary

GenomeAssembler is a program for de novo genome assembly. It uses a De Bruijn graph approach to merge overlapping short sequencing reads into longer sequences (contigs).

Overview

The method was inspired by the Velvet algorithm (Zerbino et al.). The first step is to represent the reads as a graph of sequences of length k (k-mers), in which nodes connected by an edge overlap by k - 1 nucleotides.
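The graph construction described above can be sketched in a few lines of Python. This is an illustration only, not the project's actual code: the function name `build_graph` and the returned structures are assumptions, and reverse complements are ignored for brevity.

```python
from collections import defaultdict

def build_graph(reads, k):
    """Return (edges, coverage) for a k-mer graph built from reads.

    Hypothetical helper for illustration; reverse-complement strands
    are not handled.
    """
    edges = defaultdict(set)     # k-mer -> set of successor k-mers
    coverage = defaultdict(int)  # k-mer -> number of occurrences in reads
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            coverage[kmer] += 1
        for left, right in zip(kmers, kmers[1:]):
            # Consecutive k-mers of a read overlap by k - 1 nucleotides.
            edges[left].add(right)
    return edges, coverage

edges, coverage = build_graph(["ACGTAC", "CGTACG"], k=3)
```

Here both reads contribute the k-mers `ACG`, `CGT`, `GTA` and `TAC`, so each node ends up with a coverage of 2.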

The graph is then simplified: linear connected subgraphs (unbranched chains of nodes) are merged into single nodes.
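A minimal sketch of merging unbranched chains, assuming the graph is stored as a mapping from each k-mer to the set of its successors. The name `merge_linear_chains` and the representation are hypothetical, and chains that form a closed cycle with no branch are skipped to keep the example short.

```python
from collections import defaultdict

def merge_linear_chains(edges):
    """Collapse unbranched chains of k-mers into longer sequences."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    preds = defaultdict(set)
    for source, targets in edges.items():
        for target in targets:
            preds[target].add(source)

    def succs(node):
        return edges.get(node, set())

    def extends_chain(node):
        # A node continues a chain if it has exactly one predecessor
        # and that predecessor has exactly one successor.
        if len(preds[node]) != 1:
            return False
        (pred,) = preds[node]
        return len(succs(pred)) == 1

    merged = []
    for start in sorted(nodes):
        if extends_chain(start):
            continue  # interior node, reached from the start of its chain
        sequence, node = start, start
        while len(succs(node)) == 1:
            (nxt,) = succs(node)
            if not extends_chain(nxt):
                break
            sequence += nxt[-1]  # the overlap adds one new nucleotide
            node = nxt
        merged.append(sequence)
    return merged

example = {"ACG": {"CGT"}, "CGT": {"GTA", "GTC"}, "GTA": {"TAC"}}
merged = merge_linear_chains(example)
```

In this example `ACG` and `CGT` merge into `ACGT`, the branch at `CGT` stops the chain, and `GTA` and `TAC` merge into `GTAC`.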

Short chains of nodes disconnected on one end, called "tips", are predominantly the result of sequencing errors and are removed in the next step.
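Tip removal could look roughly like the sketch below. It is a simplification under stated assumptions: `remove_tips` and `max_tip_nodes` are hypothetical names, only tips that dead-end in the forward direction are handled, and no coverage comparison is made between competing branches.

```python
def remove_tips(edges, max_tip_nodes):
    """Return a copy of the graph without short dead-end branches."""
    graph = {node: set(targets) for node, targets in edges.items()}
    for targets in edges.values():
        for target in targets:
            graph.setdefault(target, set())
    preds = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            preds[target].add(source)

    dead_ends = [node for node in graph if not graph[node]]
    for end in dead_ends:
        tip, node = [], end
        # Walk backwards along the unbranched chain leading to the dead end.
        while (len(tip) < max_tip_nodes
               and len(preds[node]) == 1
               and len(graph[node]) <= 1):
            tip.append(node)
            (node,) = preds[node]
        # Remove the chain only if it hangs off a branching node.
        if tip and len(graph[node]) > 1:
            for removed in tip:
                (parent,) = preds[removed]
                graph[parent].discard(removed)
                del graph[removed]
                del preds[removed]
    return graph

example = {
    "ACG": {"CGT"},
    "CGT": {"GTA", "GTC"},  # branch: GTC is a one-node dead end
    "GTA": {"TAC"},
    "TAC": {"ACC"},
    "ACC": {"CCA"},
}
cleaned = remove_tips(example, max_tip_nodes=2)
```

With `max_tip_nodes=2` the single-node branch `GTC` is dropped, while the longer path ending in `CCA` is kept because it exceeds the tip length limit before reaching the branch.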

An essential part of the original algorithm is the removal of redundant paths, i.e. paths that share the same start and end nodes and contain similar sequences. This feature is on the roadmap and is not implemented yet.

Finally, ambiguous connections that could not be resolved in the previous steps are removed based on a coverage cutoff. Sequences corresponding to subgraphs with a minimum length of 300 bp are returned.
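The last two steps can be illustrated as follows. This is a sketch, not the project's implementation: the function names, the `cutoff` value and the per-node `coverage` mapping are assumptions, and only nodes with several successors are treated as ambiguous.

```python
def prune_ambiguous_edges(edges, coverage, cutoff):
    """Drop low-coverage branches at nodes with more than one successor."""
    pruned = {}
    for source, targets in edges.items():
        if len(targets) > 1:
            # Ambiguous node: keep only sufficiently covered successors.
            targets = {t for t in targets if coverage.get(t, 0) >= cutoff}
        pruned[source] = set(targets)
    return pruned

def filter_contigs(contigs, min_length=300):
    """Keep only sequences of at least min_length nucleotides."""
    return [contig for contig in contigs if len(contig) >= min_length]

example_edges = {"CGT": {"GTA", "GTC"}, "GTA": {"TAC"}}
example_coverage = {"GTA": 12, "GTC": 1, "TAC": 11}
pruned = prune_ambiguous_edges(example_edges, example_coverage, cutoff=3)
kept = filter_contigs(["A" * 299, "C" * 300])
```

In the example, the branch to `GTC` is removed because its coverage of 1 is below the cutoff of 3, and only the 300-nucleotide sequence passes the length filter.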

Initially, an error correction approach based on k-mer frequency was considered. However, it turned out to give unsatisfactory results and was disabled.

Getting started

Prerequisites

Building from source requires the .NET Core SDK, version 2.2 or later.

Compile

git clone https://github.com/mikoar/GenomeAssembler
cd GenomeAssembler
dotnet publish Assembly/Assembly.csproj -c Release -o <output-dir>

Usage

assembly [arguments] [options]

Arguments:
  reads                     fasta file with reads
  contigs                   contigs output fasta file

Options:
  --version                 Show version information
  -?|-h|--help              Show help information
  -d|--dot <DOT_FILE_PATH>  output graph to dot file
  -k|--k <K>                Mer length, default is 19

License

This software is distributed under the WTFPL license.
