Fiona: A parallel and automatic strategy for read error correction
Fiona is a tool for the automatic correction of sequencing errors in reads produced by high-throughput sequencing experiments. It uses an efficient implementation of suffix arrays to detect read overlaps with different seed lengths in parallel. Compared with state-of-the-art methods on several real datasets, Fiona showed overall superior correction accuracy and was among the fastest. Additionally, Fiona has unique characteristics that make it a good choice over existing programs:
- No parameters for the user to set. You just need to know the length of the genome!
- Correction of both substitution and indel errors.
- Optimal correction over a range of seed values.
- Multicore parallelization using OpenMP.
- Efficient, memory-saving implementation.
Schulz M.H., Weese D., Holtgrewe M., Dimitrova V., Niu S., Reinert K., & Richard H. (2014) Fiona: a parallel and automatic strategy for read error correction. Bioinformatics 30 (17): i356-i363.
ANISE and BASIL
Motivation: Large insertions of novel sequence are an important type of structural variants. Previous studies used traditional de novo assemblers for assembling non-mapping high-throughput sequencing (HTS) or capillary reads and then tried to anchor them in the reference using paired read information.
Results: We present approaches for detecting insertion breakpoints and targeted assembly of large insertions from HTS paired data: BASIL and ANISE. On near-identity repeats that are hard for assemblers, ANISE employs a repeat resolution step. This results in far better reconstructions than those obtained by ABYSS. On simulated data, we found our insert assembler to be competitive with the de novo assembler ABYSS while yielding already anchored inserted sequence, as opposed to the unanchored contigs produced by ABYSS. On real-world data, we detected novel sequence in a human individual and thoroughly validated the assembled sequence.
Holtgrewe, Manuel, Leon Kuchenbecker, and Knut Reinert. “Methods for the detection and assembly of novel sequence in high-throughput sequencing data.” Bioinformatics (2015) 31 (12): 1904-1912.