Lambda: The Local Aligner for Massive Biological Data


Lambda is a BLAST-compatible local aligner optimized for NGS and metagenomics. It adopts a new approach inspired by the read mapper Masai. In contrast to other tools, Lambda builds an index over both the seeds (of the queries) and the database. The index built on the query sequences is a radix tree, and the index built over the subject sequences is a suffix array that is conceptually used as a suffix trie. Lambda uses the radix tree to search in the suffix array and, in doing so, efficiently searches for multiple queries in parallel.

Although Lambda was designed mainly to replace BlastX, it also supports all other programs of the BLAST suite (BlastP, BlastN, TBlastN, TBlastX).

Depending on the BLAST mode, the query sequences are translated and then converted using an alphabet reduction. Lambda supports several alphabet reductions; the default is the commonly used Murphy10.

Further, Lambda offers three modes (fast, default and sensitive) in each of which it beats the competitors in speed or sensitivity or both.

Finally, Lambda is multi-threaded so that it can run in parallel on modern multi-core architectures. It reads FASTA and FASTQ files and writes its output in standard BLAST format.



Lambda Source Code (v0.1) - First public release. [2014/01/15]
Lambda Source Code (v0.2) - This is the version submitted to ECCB. [2014/04/07]
Lambda Source Code (v0.3) - Release for ECCB14 [2014/09/06]

  • Speed increased by ~20% over v0.2
  • Suffix-Array index memory consumption reduced from 16x to 6x input database size (please be aware that you cannot use index files created with prior versions anymore)
  • Experimental support for an FM-index as index (instead of the suffix array) [not widely tested yet]
  • Small bugs in BLAST output formats corrected
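
As a rough illustration of the memory figures above, the following sketch estimates the index size for a given database file. It assumes that "input database size" can be approximated by the size of the FASTA file in bytes; the function name estimate_index_bytes is hypothetical and not part of Lambda.

```shell
# Rough estimate of the suffix-array index size for a database file.
# The default factor 6 is the figure quoted for v0.3; pass 16 to see
# what earlier versions would have needed.
# Assumption: database size is approximated by the FASTA file size in bytes.
estimate_index_bytes() {
    file="$1"
    factor="${2:-6}"
    # wc -c prints the byte count; fail if the file cannot be read
    size=$(wc -c < "$file") || return 1
    echo $(( $size * $factor ))
}

# Usage (db.fasta is an example path):
#   estimate_index_bytes db.fasta       # v0.3 estimate
#   estimate_index_bytes db.fasta 16    # pre-v0.3 estimate
```

The real footprint depends on the sequence content and the chosen options, so the result should only be read as an order-of-magnitude guide.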


Build instructions

Please make sure that the build requirements are met:

  • CMake
  • C++11 support in the compiler
  • OpenMP support is highly recommended
  • Preferably use GCC-4.8 (older versions have insufficient C++11 support, and GCC-4.9 has an issue with the current seqan-develop)
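
A quick way to see whether the required tools are on the PATH is sketched below. The helper check_build_tools is hypothetical and only tests that the named programs exist; it does not verify C++11 or OpenMP support or the compiler version.

```shell
# Hypothetical helper: report which of the given tools are available.
# Returns non-zero if at least one of them is missing.
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (tool names are examples for a typical GCC setup):
#   check_build_tools cmake g++ make tar
```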

Then execute the following commands to extract the source and build the two binaries in seqan-lambda-build/release/bin:

% tar xzf seqan-lambda-v0.3.tar.gz
% mkdir -p seqan-lambda-build/release
% cd seqan-lambda-build/release
% cmake ../../seqan-lambda-v0.3
% make -j2 lambda lambda_indexer

Warnings concerning a lack of C++14 support can be ignored. Please be aware that, due to heavy use of templates and compile-time optimizations, the build might take well over 10 minutes.


These examples assume that you have the files query.fasta and db.fasta, with the query and the subject sequences respectively.

If you have sufficient memory, please set the following environment variable prior to execution:

% export TMPDIR=/dev/shm

Storing query.fasta and db.fasta in TMPDIR and/or changing the working directory to it will speed up the program even further.
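
The staging step can be sketched as follows. The helper stage_inputs is hypothetical; it assumes that TMPDIR has already been exported as shown and that both files fit into it (on /dev/shm they count against available memory).

```shell
# Hypothetical helper: copy the input files into TMPDIR and print that
# directory, so the caller can change into it before running lambda.
# Falls back to /tmp if TMPDIR is not set.
stage_inputs() {
    dir="${TMPDIR:-/tmp}"
    for f in "$@"; do
        cp -- "$f" "$dir"/ || return 1
    done
    printf '%s\n' "$dir"
}

# Usage (file names as in the examples here):
#   cd "$(stage_inputs query.fasta db.fasta)"
```

After changing into the staged directory, the paths to the lambda binaries need to be given in absolute form or adjusted accordingly.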

Default profile

Optionally mask the database:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg

Run the indexer:

% bin/lambda_indexer -i db.fasta [-s db.seg]

Run lambda:

% bin/lambda -q query.fasta -d db.fasta

Sensitive profile

The more sensitive profile described in the publication is available by appending

-so 5

to the lambda call of the default instructions.

Fast profile

The fast profile introduced in the publication requires the following calls:

Optionally mask the database:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg

Run the indexer:

% bin/lambda_indexer -i db.fasta [-s db.seg] -a 0

Run lambda:

% bin/lambda -q query.fasta -d db.fasta -a 0 -sl 8 -ss 26 -sd 0 -so 4
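
The three profiles differ only in the options passed to the two programs, which the following sketch summarizes. The function lambda_calls and its profile keywords are hypothetical; all flags are copied from the instructions above, and the function only prints the calls rather than running them.

```shell
# Hypothetical helper: print the lambda_indexer and lambda calls for a
# profile (default, sensitive or fast). The optional fourth argument is
# a segmasker interval file that is passed to the indexer with -s.
lambda_calls() {
    profile="$1"; query="$2"; db="$3"; seg="${4:-}"
    case "$profile" in
        default)   index_opts="";      search_opts="" ;;
        sensitive) index_opts="";      search_opts=" -so 5" ;;
        fast)      index_opts=" -a 0"; search_opts=" -a 0 -sl 8 -ss 26 -sd 0 -so 4" ;;
        *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac
    mask_opt=""
    if [ -n "$seg" ]; then mask_opt=" -s $seg"; fi
    echo "bin/lambda_indexer -i $db$mask_opt$index_opts"
    echo "bin/lambda -q $query -d $db$search_opts"
}

# Usage:
#   lambda_calls sensitive query.fasta db.fasta
#   lambda_calls fast query.fasta db.fasta db.seg
```

Printing the calls first makes it easy to check the options before starting a long-running index build.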


For questions, comments, or suggestions feel free to contact Hannes Hauswedell or Knut Reinert.


  • Lambda: the local aligner for massive biological data
    Hannes Hauswedell; Jochen Singer; Knut Reinert
    Bioinformatics 2014 30 (17): i349-i355
    doi: 10.1093/bioinformatics/btu439
Last Update 23. September 2014