Corn snake genome v2

Multiple sequencing technologies and software tools are now available that facilitate the de novo sequencing and assembly of any vertebrate genome. Yet the quality of most available sequenced genomes is substantially poorer than that of the gold standard in the field: the human genome. Here, we present a step-by-step protocol for the successful sequencing and assembly of a high-quality snake genome that can be applied to any other reptilian or avian species. We combine the great sequencing depth and accuracy of short reads with libraries of different insert sizes for extended scaffolding, followed by optical mapping. We show that this procedure improved the corn snake scaffold N50 from 3.7 kbp to 1.4 Mbp, making it one of the best snake genome assemblies currently available. Details on the assembly are provided in the associated publication, the figure below and the accompanying website, where we provide links to the suggested software, examples of input and output files, and running parameters.

Corn snake genome statistics
Version   No. of sequences (>1 kbp)   N50 (kbp)   L50 (seqs)   Total length (Gbp)
1         297,768                     3.7         94,091       1.53
2         114,644                     1,378.4     279          1.94
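
For readers unfamiliar with the N50 and L50 columns, the following is a minimal sketch of how these statistics are commonly computed from a list of sequence lengths. N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length, and L50 is the number of sequences in that set. The function name `n50_l50`, the `min_length` parameter, and the toy lengths are illustrative assumptions; this is not the tool used to produce the table, and whether the reported N50 values were computed only on sequences longer than 1 kbp is also an assumption.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int], min_length: int = 0) -> Tuple[int, int]:
    """Return (N50, L50) for the sequences strictly longer than min_length.

    Hypothetical helper for illustration; not part of any assembly tool.
    """
    # Keep only sequences passing the length filter, longest first.
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences pass the length filter")

    half_total = sum(kept) / 2
    running = 0
    for count, length in enumerate(kept, start=1):
        running += length
        # The first sequence that brings the running sum to at least half
        # of the total length defines N50; its rank is L50.
        if running >= half_total:
            return length, count

    # Unreachable for non-empty input: the full sum always reaches half.
    raise AssertionError("cumulative sum never reached half of total")


if __name__ == "__main__":
    # Toy lengths in bp (made up for illustration, not corn snake data).
    toy_lengths = [100, 80, 60, 40, 20]
    print(n50_l50(toy_lengths))  # (80, 2)
```

In practice the lengths would be read from the assembly FASTA file, and passing `min_length=1000` would mimic the ">1 kbp" filter in the table header.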

Associated Publication
Ullate-Agote A., Chan F.Y., Tzika A.C.
A Step-by-step Guide to Assemble a Reptilian Genome
Methods in Molecular Biology: Avian and Reptilian Developmental Biology (2017)

Assembly outline
(A) Paired-end reads (green) are assembled into contigs (blue) using ‘DISCOVAR de novo’; (B) Scaffolds (purple) are produced with the software ‘BESST’ by combining the long-insert size mate-pair libraries (green/grey) and the DISCOVAR contigs (blue); (C) Mis-assemblies are identified by the software ‘REAPR’ on the basis of the longest mate-pair library; (D) A second round of scaffolding (red) is performed using the ‘SSPACE’ software; (E) Transcriptomic data (green) are combined with the assembly (red) to extend the scaffolds (orange) using the software ‘L_RNA_scaffolder’; (F) A final hybrid assembly (black) is built using BioNano optical maps (yellow). Vertical bars represent nicked/labeled sites.