Enly is a tool that closes roughly 10% of the gaps commonly present in a draft genome. It is best suited to long reads (e.g. 454 or Sanger reads) and is based on the iterative mapping of reads to the extremities of contigs obtained from de novo assembly.
The overall strategy of the tool is shown schematically in Figure 1. Enly takes a multiFASTA file containing all the contigs and tries to increase their length by repeating the following procedure for each contig. First, a number of bases (selectable by the user) is extracted from one extremity of the contig, and this sequence fragment is used as the query for a BLAST search against a database containing all the reads produced by the sequencing run. Since a typical 454 sequencing run yields reads of variable length, several BLAST searches are performed within the same cycle, using a fragment of a different length at each step. We suggest exploring fragment lengths (-b and -m parameters, see below) ranging from 100/200 bp greater than the average read length to 100/200 bp lower than the average read length. The BLAST output is then parsed to identify the reads that can be used to increase the overall length of the contig, that is, reads that align only partially to the end of the contig and project beyond its extremity. The identified reads and the original contig are then assembled together, resulting in a (possibly) extended contig. The same procedure is repeated for the other extremity of the contig. These steps are repeated for a user-specified number of cycles, or until no further elongation of the contigs is possible.
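The read-selection step described above can be illustrated with a minimal Python sketch. This is not Enly's actual code: the names `Hit`, `parse_hit`, `fragment_lengths`, `overhang` and `select_extending_reads`, the tolerance and minimum-overhang thresholds, and the 100 bp step are all assumptions made for illustration. The sketch also assumes that BLAST hits are available as tab-separated lines with the columns qseqid, sseqid, qstart, qend, sstart, send and slen, where the query is the fragment taken from the contig end and the subject is a read.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit of a contig-end fragment (query) against a read (subject)."""
    read_id: str
    q_start: int   # 1-based alignment start on the fragment
    q_end: int     # 1-based alignment end on the fragment
    s_start: int   # 1-based alignment start on the read
    s_end: int     # 1-based alignment end on the read (< s_start on minus strand)
    read_len: int


def parse_hit(line: str) -> Hit:
    """Parse one tab-separated line: qseqid sseqid qstart qend sstart send slen."""
    _, read_id, qs, qe, ss, se, slen = line.rstrip("\n").split("\t")
    return Hit(read_id, int(qs), int(qe), int(ss), int(se), int(slen))


def fragment_lengths(avg_read_len: int, margin: int = 200, step: int = 100) -> list[int]:
    """Fragment lengths from avg + margin down to avg - margin (hypothetical step)."""
    lengths = range(avg_read_len + margin, avg_read_len - margin - 1, -step)
    return [n for n in lengths if n > 0]


def overhang(hit: Hit, frag_len: int, end: str, tolerance: int = 5) -> int:
    """Number of read bases projecting beyond the given contig end (0 if none)."""
    plus = hit.s_start <= hit.s_end
    if end == "right":
        # The alignment must reach the last bases of the fragment.
        if hit.q_end < frag_len - tolerance:
            return 0
        return hit.read_len - hit.s_end if plus else hit.s_end - 1
    if end == "left":
        # The alignment must reach the first bases of the fragment.
        if hit.q_start > 1 + tolerance:
            return 0
        return hit.s_start - 1 if plus else hit.read_len - hit.s_start
    raise ValueError("end must be 'left' or 'right'")


def select_extending_reads(hits, frag_len, end, min_overhang=20):
    """Return (read_id, overhang) pairs for reads that extend the contig end."""
    scored = [(h.read_id, overhang(h, frag_len, end)) for h in hits]
    kept = [pair for pair in scored if pair[1] >= min_overhang]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hits = [
        Hit("r1", 1, 300, 1, 300, 450),     # plus strand, projects past the end
        Hit("r2", 50, 250, 10, 210, 400),   # alignment stops before the fragment end
        Hit("r3", 101, 300, 200, 1, 200),   # minus strand, fully contained
        Hit("r4", 101, 300, 320, 121, 320), # minus strand, projects past the end
    ]
    print(fragment_lengths(400))
    print(select_extending_reads(hits, frag_len=300, end="right"))
```

In this sketch a read is kept only when its alignment reaches the contig extremity and a sufficient number of its bases remain unaligned beyond it; the reads selected in this way would then be passed, together with the contig, to the assembly step.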