simulation tools to generate synthetic next-generation sequencing reads
ART is a set of simulation tools to generate synthetic next-generation
sequencing reads. ART simulates sequencing reads by mimicking the real
sequencing process with empirical error models or quality profiles
summarized from large recalibrated sequencing data. ART can also
simulate reads using a user's own read error model or quality profiles. ART
supports simulation of single-end and paired-end/mate-pair reads of three
major commercial next-generation sequencing platforms: Illumina's
Solexa, Roche's 454 and Applied Biosystems' SOLiD. ART can be used to
test or benchmark a variety of methods or tools for next-generation
sequencing data analysis, including read alignment, de novo assembly,
and SNP and structural variation discovery. ART was used as a primary tool
for the simulation study of the 1000 Genomes Project. ART is
implemented in C++ with optimized algorithms and is highly efficient in
read simulation. ART outputs reads in the FASTQ format, and alignments
in the ALN format. ART can also generate alignments in the SAM
alignment or UCSC BED file format. ART can be used together with genome
variant simulators (e.g. VarSim) for evaluating variant calling tools
or methods.