outputs artificial FASTQ files derived from a reference genome
ArtificialFastqGenerator takes a reference genome (in FASTA format) as
input and outputs artificial FASTQ files in the Sanger format. It can
accept Phred base quality scores from existing FASTQ files and use them
to simulate sequencing errors. Because the artificial FASTQs are derived
from the reference genome, that genome provides a gold standard for
calling variants: Single Nucleotide Polymorphisms (SNPs) and insertions
and deletions (indels). This enables evaluation of a Next-Generation
Sequencing (NGS) analysis pipeline that aligns reads to the reference
genome and then calls the variants.
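
The idea of using Phred scores to simulate sequencing errors can be
illustrated with a minimal Python sketch. This is not the tool's own
implementation. It only relies on two standard facts: Sanger-format
FASTQ stores a quality score Q as the ASCII character with code Q + 33,
and Q corresponds to an error probability of 10^(-Q/10). The function
names and the substitution model (a wrong base drawn uniformly from the
three alternatives) are assumptions made for illustration.

```python
import random

SANGER_OFFSET = 33  # Sanger FASTQ encodes Phred score Q as chr(Q + 33)


def phred_to_error_prob(qual_char):
    """Convert one Sanger-encoded quality character to an error probability."""
    q = ord(qual_char) - SANGER_OFFSET
    return 10 ** (-q / 10)


def simulate_read(ref_bases, qual_string, rng):
    """Copy ref_bases into a read, substituting each base with the
    probability implied by the quality character at that position.

    Hypothetical helper: the uniform choice among the other bases is an
    assumed error model, not necessarily what ArtificialFastqGenerator does.
    """
    if len(ref_bases) != len(qual_string):
        raise ValueError("bases and qualities must have the same length")
    read = []
    for base, qual in zip(ref_bases, qual_string):
        if rng.random() < phred_to_error_prob(qual):
            # Introduce a substitution error: pick any base except the true one.
            base = rng.choice([b for b in "ACGT" if b != base])
        read.append(base)
    return "".join(read)


# 'I' is Q40 (error probability 0.0001); '#' is Q2 (about 0.63).
rng = random.Random(0)  # fixed seed so the run is repeatable
print(simulate_read("ACGTACGTAC", "IIIII#####", rng))
```

Because every simulated error is a known departure from the reference,
differences that a pipeline later reports at those positions can be
told apart from genuine variants.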