Lambda

Lambda: The Local Aligner for Massive Biological Data

Overview

Lambda is a BLAST-compatible local aligner optimized for NGS and metagenomics that adopts a new approach inspired by the read mapper Masai. In contrast to other tools, Lambda builds an index over both the seeds (of the queries) and the database. The index built on the query sequences is a radix trie, and the index built over the subject sequences is a suffix array that is conceptually used as a suffix trie. Lambda uses the radix trie to search in the suffix array and, in doing so, efficiently searches for multiple queries in parallel.
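To make the suffix array idea concrete, here is a toy sketch in plain shell. It is not Lambda's implementation: the function names suffix_array and find_seed are hypothetical, the lookup scans linearly where a real aligner would use binary search, and the radix trie over the query seeds is not modelled at all.

```shell
#!/bin/sh
# Toy illustration only; not how Lambda stores or searches its index.

# Print every suffix of a text with its 0-based start offset,
# sorted lexicographically (this sorted list is the suffix array).
suffix_array() {
    text=$1
    i=0
    while [ "$i" -lt "${#text}" ]; do
        # cut -c counts characters from 1, hence the +1
        printf '%s\t%s\n' "$(printf '%s\n' "$text" | cut -c "$((i + 1))-")" "$i"
        i=$((i + 1))
    done | LC_ALL=C sort
}

# Report the offsets of all suffixes that start with the given seed.
# All hits are adjacent in the sorted list, which is what makes
# prefix search on a suffix array efficient.
find_seed() {
    suffix_array "$1" | awk -F '\t' -v seed="$2" 'index($1, seed) == 1 { print $2 }'
}

find_seed BANANA ANA
```

Because every occurrence of a seed is a prefix of some suffix, all matches form one contiguous block of the sorted list. Searching many seeds at once then amounts to walking two sorted structures together, which is the intuition behind indexing the queries as well.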

Although Lambda was designed mainly to replace BlastX, it also supports the other programs of the BLAST suite (BlastP, BlastN, TBlastN, TBlastX).

Depending on the BLAST mode, the query sequences are translated and then converted using an alphabet reduction. Lambda supports several alphabet reductions; the default is the commonly used Murphy10.
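As a rough illustration of what an alphabet reduction does to a protein sequence, the sketch below maps the 20 amino acids onto 10 groups with tr. The grouping is the one commonly cited for Murphy10 (LVIM, C, A, G, ST, P, FYW, EDNQ, KR, H) and is stated here as an assumption; the representative letter per group and the function name reduce_murphy10 are arbitrary choices for this example, not Lambda's internal encoding.

```shell
#!/bin/sh
# Assumed Murphy10 grouping, one representative letter per group:
#   LVIM -> L   C -> C   A -> A   G -> G   ST -> S
#   P -> P      FYW -> F EDNQ -> E KR -> K H -> H
reduce_murphy10() {
    printf '%s\n' "$1" | tr 'LVIMCAGSTPFYWEDNQKRH' 'LLLLCAGSSPFFFEEEEKKH'
}

# Two different peptides that become identical after reduction
reduce_murphy10 MKVLAT
reduce_murphy10 MRIVAS
```

Both calls print LKLLAS: peptides that differ only by substitutions within a group look identical in the reduced alphabet, so exact seed matches can still be found between related but non-identical sequences.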

Further, Lambda offers three modes (fast, default and sensitive), in each of which it beats the competitors in speed, sensitivity, or both.

Finally, Lambda is multi-threaded and can run in parallel on modern multi-core architectures. It reads FASTA and FASTQ files and outputs the standard BLAST format.

Downloads

  • Lambda Source Code (v0.1) - First public release. [7.4MiB]
  • Lambda Source Code (v0.2) - This is the version submitted to ECCB. [9.0MiB]

Build instructions

Please make sure that the build requirements are met:

  • CMake
  • C++11 support in the compiler
  • OpenMP support is highly recommended
  • GCC 4.8 or later is the preferred compiler
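A quick pre-flight check of the first requirements could look like the sketch below. The helper names gcc_at_least and check_build_requirements are hypothetical, the check assumes the compiler is invoked as g++, and it does not test for OpenMP or C++11 support. Note that on some systems g++ is an alias for another compiler and may report an unrelated version.

```shell
#!/bin/sh
# Succeeds if version string $1 (e.g. "4.8.2" or "13") is at least
# major $2, minor $3.
gcc_at_least() {
    have=$1 want_major=$2 want_minor=$3
    major=${have%%.*}
    case $have in
        *.*) rest=${have#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # newer GCC prints only the major version
    esac
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Reports missing tools on stderr and returns non-zero if any check fails.
check_build_requirements() {
    status=0
    command -v cmake >/dev/null 2>&1 || { echo "cmake not found" >&2; status=1; }
    if command -v g++ >/dev/null 2>&1; then
        ver=$(g++ -dumpversion)
        gcc_at_least "$ver" 4 8 || { echo "g++ $ver is older than 4.8" >&2; status=1; }
    else
        echo "g++ not found" >&2
        status=1
    fi
    return "$status"
}

check_build_requirements || echo "some build requirements appear to be missing" >&2
```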

Then execute the following commands to extract the source and build the two binaries in seqan-lambda-build/release/bin:

% tar xf seqan-lambda-v0.2.tar.gz
% mkdir -p seqan-lambda-build/release
% cd seqan-lambda-build/release
% cmake ../../seqan-lambda-v0.2 \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_FLAGS:STRING="-std=c++11 -Wno-vla"
% make -j2 lambda lambda_indexer

Please be aware that, due to extensive use of templates and compile-time optimizations, the build might take well over 10 minutes.

Usage

These examples assume that you have the files query.fasta and db.fasta, containing the query and the subject sequences, respectively.

If you have sufficient memory, please set the following environment variable prior to execution:

% export TMPDIR=/dev/shm

Storing query.fasta and db.fasta in TMPDIR and/or changing the working directory to it will speed up the program even further.
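Since /dev/shm is backed by RAM, it can help to confirm that it has room before pointing TMPDIR at it. The following is a minimal sketch: the function name use_shm_tmpdir and the 1 GiB default threshold are made-up placeholders, and how much space Lambda actually needs depends on your data.

```shell
#!/bin/sh
# Set TMPDIR to a directory (default /dev/shm) only if it exists, is
# writable, and has at least the requested number of KiB available.
use_shm_tmpdir() {
    dir=${1:-/dev/shm}
    min_kib=${2:-1048576}   # hypothetical threshold: 1 GiB
    [ -d "$dir" ] && [ -w "$dir" ] || return 1
    # df -P prints a portable format; column 4 is the available space
    free_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "${free_kib:-0}" -ge "$min_kib" ] || return 1
    TMPDIR=$dir
    export TMPDIR
}

use_shm_tmpdir || echo "keeping TMPDIR=${TMPDIR:-/tmp}" >&2
```

Pass a larger second argument if your query and database files are also stored there.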

Default profile

Optionally mask the database:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg

Run the indexer:

% bin/lambda_indexer -i db.fasta [-s db.seg]

Run lambda:

% bin/lambda -q query.fasta -d db.fasta

Sensitive profile

The more sensitive profile described in the publication is available by appending

-so 5

to the lambda call of the default instructions.

Fast profile

The fast profile introduced in the publication requires the following calls:

Optionally mask the database:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg

Run the indexer:

% bin/lambda_indexer -i db.fasta [-s db.seg] -a 0

Run lambda:

% bin/lambda -q query.fasta -d db.fasta -a 0 -sl 8 -ss 26 -sd 0 -so 4
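The three profiles differ only in the options passed to the two binaries, so they can be summarised in a small helper. The sketch below uses only the flags quoted above; the function names indexer_opts, lambda_opts and show_commands are hypothetical, and the script prints the commands rather than running them. The optional masking step (segmasker and the -s option) is left out.

```shell
#!/bin/sh
# Extra lambda_indexer options per profile
indexer_opts() {
    case $1 in
        default|sensitive) ;;   # no extra indexer options
        fast) printf '%s\n' '-a 0' ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Extra lambda options per profile
lambda_opts() {
    case $1 in
        default)   ;;
        sensitive) printf '%s\n' '-so 5' ;;
        fast)      printf '%s\n' '-a 0 -sl 8 -ss 26 -sd 0 -so 4' ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the indexer and lambda calls for a profile
show_commands() {
    profile=$1 query=$2 db=$3
    iopts=$(indexer_opts "$profile") || return 1
    lopts=$(lambda_opts "$profile") || return 1
    echo "bin/lambda_indexer -i $db${iopts:+ $iopts}"
    echo "bin/lambda -q $query -d $db${lopts:+ $lopts}"
}

show_commands fast query.fasta db.fasta
```

Note that the fast profile changes the indexer call as well, so an index built for the default profile has to be rebuilt before switching to fast.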

Contact

For questions, comments, or suggestions feel free to contact Hannes Hauswedell or Knut Reinert.

References

  • Hauswedell H., Singer J., & Reinert K. (2014). Lambda: The Local Aligner for Massive Biological Data. Submitted.

Last update: 7 April 2014
