Raptor
Abstract
Raptor is a system for approximately searching many queries, such as NGS reads or transcripts, in large collections of nucleotide sequences. Raptor uses winnowing minimizers to define a set of representative k-mers, an extension of the Interleaved Bloom Filter (IBF) as a set membership data structure, and probabilistic thresholding for minimizers. This approach allows compression and partitioning of the IBF, which enables the effective use of secondary memory.
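The following is a minimal sketch of the two main ideas named above (minimizers and an interleaved Bloom filter). It is not Raptor's implementation. Everything in it is an assumption made for illustration: the names `kmer_hash`, `minimizers`, `ToyIBF`, and `count_hits` are hypothetical, the hash function is an arbitrary choice, the window size `w` is counted in k-mers, and reverse complements are ignored. The fixed threshold in the usage lines is only a placeholder for the probabilistic thresholding mentioned above.

```python
import hashlib


def kmer_hash(kmer, seed=0):
    """Deterministic 64-bit hash of a k-mer (blake2b is an illustrative choice)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(kmer.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minimizers(seq, k, w):
    """For every window of w consecutive k-mers, keep the k-mer with the
    smallest hash. Returns the set of selected k-mers (empty if seq is
    too short to hold one full window)."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        selected.add(min(kmers[start:start + w], key=kmer_hash))
    return selected


class ToyIBF:
    """Toy interleaved Bloom filter: one Bloom filter per bin, stored so that
    the bits of all bins for the same hash position sit next to each other."""

    def __init__(self, bins, bits_per_bin, num_hashes=2):
        self.bins = bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        # rows[p] is an integer used as a bit vector with one bit per bin.
        self.rows = [0] * bits_per_bin

    def _positions(self, kmer):
        return [kmer_hash(kmer, seed) % self.bits_per_bin
                for seed in range(self.num_hashes)]

    def insert(self, kmer, bin_id):
        for p in self._positions(kmer):
            self.rows[p] |= 1 << bin_id

    def query(self, kmer):
        """Bit mask of all bins that may contain kmer (false positives are
        possible, false negatives are not)."""
        mask = (1 << self.bins) - 1
        for p in self._positions(kmer):
            mask &= self.rows[p]
        return mask


def count_hits(ibf, query_minimizers):
    """Count, per bin, how many query minimizers the filter reports."""
    counts = [0] * ibf.bins
    for m in query_minimizers:
        mask = ibf.query(m)
        for b in range(ibf.bins):
            counts[b] += (mask >> b) & 1
    return counts


# Usage: index two made-up reference sequences, then query a read.
references = ["ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTTAGC",
              "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAATTGGCC"]
ibf = ToyIBF(bins=len(references), bits_per_bin=1024)
for bin_id, ref in enumerate(references):
    for m in minimizers(ref, k=5, w=4):
        ibf.insert(m, bin_id)

read_minimizers = minimizers(references[0][5:25], k=5, w=4)
counts = count_hits(ibf, read_minimizers)
threshold = 0.8 * len(read_minimizers)  # placeholder, not a probabilistic threshold
candidate_bins = [b for b, c in enumerate(counts) if c >= threshold]
```

Because a single lookup returns one bit per bin, all bins are tested for a minimizer at once; this is the property that makes the interleaved layout attractive for a pre-filter over many sequence collections.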
Links
- Official Website
- Download binaries
- View the source code and README on GitHub
Please Cite
- Enrico Seiler, Svenja Mehringer, Mitra Darvish, Etienne Turc, Knut Reinert, “Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences”, 2020.
@unpublished{fu_mi_publications2519,
  abstract  = {We present Raptor, a tool for approximately searching many queries in large collections of nucleotide sequences. In comparison with similar tools like Mantis and COBS, Raptor is 12-144 times faster and uses up to 30 times less memory. Raptor uses winnowing minimizers to define a set of representative k-mers, an extension of the Interleaved Bloom Filters (IBF) as a set membership data structure, and probabilistic thresholding for minimizers. Our approach allows compression and a partitioning of the IBF to enable the effective use of secondary memory. Competing Interest Statement: The authors have declared no competing interest.},
  author    = {Enrico Seiler and Svenja Mehringer and Mitra Darvish and Etienne Turc and Knut Reinert},
  booktitle = {Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences},
  journal   = {bioRxiv},
  title     = {Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences},
  url       = {http://publications.imp.fu-berlin.de/2519/},
  year      = {2020}
}
Contact
For questions, comments, or suggestions please contact:
