RazerS Previous Versions

Abstract

Second-generation sequencing technologies deliver DNA sequence data at unprecedented high throughput. Common to most biological applications is a mapping of the reads to an almost identical or highly similar reference genome. Due to the large amounts of data, efficient algorithms and implementations are crucial for this task. We present an efficient read mapping tool called RazerS. It allows the user to align sequencing reads of arbitrary length using either the Hamming distance or the edit distance. Our tool can work either lossless or with a user-defined loss rate at higher speeds. Given the loss rate, we present an approach that guarantees not to lose more reads than specified. This enables the user to adapt to the problem at hand and provides a seamless tradeoff between sensitivity and running time.

Main Features

  • import of MultiFASTA read and genome files
  • reads can be of arbitrary length
  • supports Hamming and edit distance read mapping with configurable error rates
  • supports paired-end read mapping
  • configurable and predictable sensitivity (runtime/sensitivity tradeoff)
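
The difference between the two distance measures can be sketched in a few lines of Python. This is only an illustration and not RazerS code: the function names are hypothetical, and the rule used to turn an error rate into an error limit (errors divided by read length) is an assumption made for the example.

```python
def hamming_distance(a, b):
    """Number of mismatching positions; both strings must have equal length."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Minimum number of mismatches, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or mismatch
        prev = cur
    return prev[-1]


def within_error_rate(distance, read_length, max_error_rate):
    """Assumed acceptance rule: errors relative to read length (hypothetical)."""
    return distance <= max_error_rate * read_length


read = "ACGTACGT"
shifted = "CGTACGTA"  # same bases, shifted by one position

print(hamming_distance(read, shifted))  # every position differs
print(edit_distance(read, shifted))     # one deletion plus one insertion
```

A single indel shifts all following bases, so the Hamming distance grows with the read length while the edit distance stays small. This is why mapping with indels needs the edit distance.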

RazerS Binaries

Supported platforms are: Windows, Linux, Linux 64, and Mac OS X. Please take a look at the README file for usage instructions.

There is a newer version of RazerS available.

Version History

2013-03-13: v1.1.1

  • releasing with new automatic build system

2010-06-18: v1.1

  • added: memory efficient support for large q-grams (up to 31)
  • added: optimized mapping onto many short contigs, deferred Swift post-processing
  • fixed: minor bug fixes

2009-07-10: v1.0

  • first official release of RazerS
  • added: paired-end mapping
  • added: Eland and GFF output formats
  • added: minor optimizations

2008-10-29

  • dramatically decreased memory consumption
  • added: "--purge-ambiguous" and "--distance-range" options
  • changed "--max-hits" behavior

2008-09-25

  • important: If your version doesn't contain the 'gapped_params' folder, please re-download it
  • added: "--recognition-rate" option to control the sensitivity of RazerS
  • added: automatic configuration of the filter depending on the recognition rate

2008-09-18

  • added: "--hamming-only" option to ignore Indels and consider only mismatches
  • added: "--match-N" option allows 'N' to match with all characters

2008-09-04

  • added: "--max-hits" option to neglect reads with too many hits or non-unique reads
  • added: "--shape" option to change the underlying (un)gapped k-mer shape
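
As an illustration of what a gapped k-mer shape means, the following Python sketch extracts the k-mers selected by a shape. It assumes the shape is written as a string of 1s (positions that are read) and 0s (positions that are skipped). The function name is hypothetical and the code is not taken from RazerS.

```python
def shaped_kmers(sequence, shape):
    """Return (position, k-mer) pairs for every window of the given shape.

    shape is a string such as "1101": a '1' keeps the base at that offset,
    a '0' skips it. An ungapped k-mer is a shape made only of '1's.
    """
    if not shape or set(shape) - {"0", "1"}:
        raise ValueError("shape must be a non-empty string of 0s and 1s")
    if shape[0] != "1" or shape[-1] != "1":
        raise ValueError("shape must start and end with 1")
    kept = [i for i, c in enumerate(shape) if c == "1"]
    span = len(shape)
    return [(pos, "".join(sequence[pos + i] for i in kept))
            for pos in range(len(sequence) - span + 1)]


print(shaped_kmers("ACGTAC", "111"))   # ungapped 3-mers
print(shaped_kmers("ACGTAC", "1101"))  # gapped: third base of each window skipped
```

A mismatch that falls on a skipped position leaves the extracted k-mer unchanged, which is the usual motivation for gapped shapes in filtering.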

2008-08-21

  • fixed: matches at the beginning of the genome were not found on 32-bit machines
  • added: optimized verifications and increased performance

2008-08-20

  • fixed: matches were primarily sorted by their orientation and secondarily by the sort-order

2008-08-18

  • added: "--repeat-length" option to ignore single character repeats in the genome
  • added: "--overabundance-cut" option to remove overabundant read k-mers
  • added: "--position-format" option to specify how match positions are denoted
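
A single character repeat is a run of one base, such as a long stretch of A or N. The sketch below shows one way such runs could be located given a minimum length. It is an assumption about what the option targets, the function name is hypothetical, and it does not reproduce the RazerS implementation.

```python
def single_char_repeats(genome, min_length):
    """Return half-open intervals (begin, end) of single-character runs
    that are at least min_length characters long."""
    repeats = []
    start = 0
    for i in range(1, len(genome) + 1):
        # A run ends at the end of the sequence or when the character changes.
        if i == len(genome) or genome[i] != genome[start]:
            if i - start >= min_length:
                repeats.append((start, i))
            start = i
    return repeats


print(single_char_repeats("ACGTTTTTTACAAAAG", 4))
```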

2008-08-07

  • added: "--sort-order" option to select the sort order of matches
  • fixed: many non-repeat regions were causing a significant performance drop

2008-08-05

  • added: RepeatMasker masks all Ns of the reference genomes automatically
  • fixed: precision bug relating to the percent identity
  • fixed: closely adjacent matches got lost

2008-08-01

  • first pre-release of RazerS using a fixed k=11 and manual recognition/performance parametrization

Contact

For questions, comments, or suggestions feel free to contact David Weese.

References

Weese, D., Emde, A.-K., Rausch, T., Döring, A., & Reinert, K. (2009). RazerS - Fast read mapping with sensitivity control. Genome Research, 19(9), 1646–1654. doi:10.1101/gr.088823.108

Last Update 1. May 2013