RazerS 3

Abstract

Motivation: During the last years NGS sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of indels.

Results: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work we propose the successor RazerS 3 which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of the Myers’ bit-vector algorithm for verification, memory saving measures and support for the SAM output format. This leads to a much improved performance for mapping reads, in particular long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior to them in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a precomputed index.
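
The idea of filtering by counting q-grams can be illustrated with the q-gram lemma: a read of length n that matches a reference window with at most k errors shares at least n + 1 - (k + 1)q q-grams with it. The following Python sketch is only an illustration of that lemma, not the RazerS implementation; the function names and the whole-window comparison are assumptions made for this example.

```python
from collections import Counter


def qgram_threshold(read_len: int, q: int, max_errors: int) -> int:
    """Lower bound on shared q-grams given by the q-gram lemma."""
    return read_len + 1 - (max_errors + 1) * q


def qgram_profile(s: str, q: int) -> Counter:
    """Multiset of all overlapping substrings of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_qgrams(read: str, window: str, q: int) -> int:
    """Number of q-grams common to read and window, counted with multiplicity."""
    common = qgram_profile(read, q) & qgram_profile(window, q)
    return sum(common.values())


def passes_filter(read: str, window: str, q: int, max_errors: int) -> bool:
    """True if the window cannot be ruled out as a match by q-gram counting."""
    return shared_qgrams(read, window, q) >= qgram_threshold(len(read), q, max_errors)


if __name__ == "__main__":
    read = "ACGTACGTAC"
    # One mismatch relative to the read: still a candidate for max_errors=1.
    print(passes_filter(read, "ACGTTCGTAC", q=3, max_errors=1))
    # Unrelated window: rejected without running a full verification.
    print(passes_filter(read, "TTTTTTTTTT", q=3, max_errors=1))
```

If the threshold is zero or negative, the lemma gives no useful bound and every window passes, so q has to be chosen small enough for the read length and error rate at hand.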

Main Features:

  • import of FASTA/FASTQ read and genome files
  • 5 output formats (including SAM)
  • reads can be of arbitrary length
  • supports Hamming and edit distance read mapping with configurable error rates
  • supports paired-end read mapping
  • configurable and predictable sensitivity (runtime/sensitivity tradeoff)
  • key improvements (compared to RazerS):
    • multicore parallelization
    • additional pigeonhole filter optimized for low error-rates with controllable sensitivity
    • banded Myers’ algorithm for verification
    • full sensitivity under the definition given in Rabema
    • SAM output

Availability and Implementation: Source code and binaries are freely available for download at https://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X, and Windows.
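
The pigeonhole filter listed among the key improvements rests on a simple observation: if a read is cut into k + 1 non-overlapping seeds and matches with at most k errors, at least one seed must match exactly. The Python sketch below illustrates that principle for Hamming distance only. It is a simplified example, not the RazerS 3 code: the function names are hypothetical, and the linear `str.find` scan stands in for whatever seed lookup a real mapper would use.

```python
def split_into_seeds(read: str, max_errors: int) -> list[tuple[int, str]]:
    """Cut the read into max_errors + 1 non-overlapping (offset, seed) pieces."""
    if len(read) <= max_errors:
        raise ValueError("read must be longer than the number of allowed errors")
    n_seeds = max_errors + 1
    base, extra = divmod(len(read), n_seeds)
    seeds, start = [], 0
    for i in range(n_seeds):
        length = base + (1 if i < extra else 0)  # spread the remainder evenly
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, genome: str, max_errors: int) -> list[int]:
    """Return all genome positions where the read matches with <= max_errors mismatches."""
    candidates = set()
    for offset, seed in split_into_seeds(read, max_errors):
        pos = genome.find(seed)
        while pos != -1:
            start = pos - offset  # where the full read would begin
            if 0 <= start <= len(genome) - len(read):
                candidates.add(start)
            pos = genome.find(seed, pos + 1)
    # Verification step: keep only candidates within the error budget.
    return sorted(
        s for s in candidates
        if hamming(read, genome[s:s + len(read)]) <= max_errors
    )


if __name__ == "__main__":
    genome = "GGGG" + "ACGTTCGTAC" + "CCCC"
    print(map_read("ACGTACGTAC", genome, max_errors=1))
```

Because every true match contains at least one exact seed hit, this filter loses no matches for the Hamming case shown here; the cost is that shorter seeds (higher error budgets) produce more candidates to verify.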

Please Cite

  • D. Weese, M. Holtgrewe, K. Reinert, “RazerS 3: Faster, fully sensitive read mapping”, Bioinformatics, vol. 28, iss. 20, 2012-08-24.
    @article{fu_mi_publications1159,
     abstract = {Motivation: During the last years NGS sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of indels.
    
    Results: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work we propose the successor RazerS 3 which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of the Myers' bit-vector algorithm for verification, memory saving measures and support for the SAM output format. This leads to a much improved performance for mapping reads, in particular long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior to them in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a precomputed index.
    
    Availability and Implementation: Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X, and Windows.},
     author = {D. Weese and M. Holtgrewe and K. Reinert},
     journal = {Bioinformatics},
     month = {August},
     number = {20},
     pages = {2592--2599},
     publisher = {Oxford University Press},
     title = {RazerS 3: Faster, fully sensitive read mapping},
     url = {http://publications.imp.fu-berlin.de/1159/},
     volume = {28},
     year = {2012}
    }
  • D. Weese, A.-K. Emde, T. Rausch, A. Döring, K. Reinert, “RazerS - Fast Read Mapping with Sensitivity Control”, Genome Research, vol. 19, iss. 9, 2009-07-10.
    @article{fu_mi_publications453,
     abstract = {Second-generation sequencing technologies deliver DNA sequence data at unprecedented high throughput. Common to most biological applications is a mapping of the reads to an almost identical or highly similar reference genome. Due to the large amounts of data, efficient algorithms and implementations are crucial for this task. We present an efficient read mapping tool called RazerS. It allows the user to align sequencing reads of arbitrary length using either the Hamming distance or the edit distance. Our tool can work either lossless or with a user-defined loss rate at higher speeds. Given the loss rate, we present an approach that guarantees not to lose more reads than specified. This enables the user to adapt to the problem at hand and provides a seamless tradeoff between sensitivity and running time.},
     author = {D. Weese and A.-K. Emde and T. Rausch and A. D{\"o}ring and K. Reinert},
     journal = {Genome Research},
     month = {July},
     number = {9},
     pages = {1646--1654},
     title = {RazerS - Fast Read Mapping with Sensitivity Control},
     url = {http://publications.imp.fu-berlin.de/453/},
     volume = {19},
     year = {2009}
    }

Contact

For questions, comments, or suggestions please contact:

David Weese david.weese@fu-berlin.de