Simulates next-generation sequencing reads from input FASTA files in parallel using the ART sequencing simulator (Huang et al., Bioinformatics 2012). This function supports both single-end and paired-end read generation.

simulate_read_dynamic_sc(
  fasta_inputs,
  output_prefixes,
  readLen = 100,
  depth = 30,
  artPath = "art_illumina",
  seqSys = "HS25",
  paired = TRUE,
  numCores = 2,
  otherArgs = ""
)

Arguments

fasta_inputs

Character vector of paths to input FASTA files

output_prefixes

Character vector of output prefixes for the generated read files, one per entry in fasta_inputs

readLen

Integer specifying the length of simulated reads in base pairs (default: 100)

depth

Numeric value specifying the desired sequencing depth/coverage (default: 30)

artPath

Character string specifying the path to the ART executable, or its name if it is on the system PATH (default: "art_illumina")

seqSys

Character string specifying the sequencing system to simulate (default: "HS25" for HiSeq 2500)

paired

Logical indicating whether to generate paired-end reads (default: TRUE)

numCores

Integer specifying the number of CPU cores to use for parallel processing (default: 2)

otherArgs

Character string of additional command-line arguments to pass to ART, for example "--noALN --rndSeed 42" (default: "")

Value

None. The function is called for its side effects: it writes sequencing read files at the locations specified by output_prefixes and prints completion messages.
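
To locate the generated reads afterwards, the expected FASTQ paths can be derived from each prefix. The helper below is a hypothetical sketch, not part of this package. It assumes ART's usual naming convention, where single-end runs write <prefix>.fq and paired-end runs write <prefix>1.fq and <prefix>2.fq.

```r
# Hypothetical helper (not part of this package): expected FASTQ paths for one
# output prefix, assuming ART's usual naming of its read files.
expected_fastq <- function(output_prefix, paired = TRUE) {
  if (isTRUE(paired)) {
    paste0(output_prefix, c("1.fq", "2.fq"))
  } else {
    paste0(output_prefix, ".fq")
  }
}

expected_fastq("results/sim1", paired = FALSE)
# "results/sim1.fq"
expected_fastq("results/paired1", paired = TRUE)
# "results/paired11.fq" "results/paired12.fq"
```

Because ART appends the read number directly to the prefix, a prefix ending in a digit (such as "results/paired1") yields names like paired11.fq. Ending the prefix with a separator such as "_" or "_R" keeps the file names easier to read.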

Details

This function provides a parallel interface to the ART sequencing simulator by:

  1. Defining an internal function simulateOne() that constructs and executes the ART command for a single FASTA input

  2. Using mclapply() from the parallel package to run the simulations in parallel on up to numCores cores

  3. Redirecting ART's output to log files
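
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual source. The name simulate_one(), the dry_run switch, and the "<prefix>.log" log naming are hypothetical. The flags -ss, -i, -l, -f and -o are ART's standard options for sequencing system, input, read length, fold coverage and output prefix. With dry_run = TRUE the commands are only built, so the sketch runs without ART installed.

```r
library(parallel)

fasta_inputs    <- c("data/genome1.fa", "data/genome2.fa")
output_prefixes <- c("results/sim1", "results/sim2")

# Hypothetical stand-in for the internal simulateOne(): builds the ART command
# for input i and, unless dry_run is TRUE, executes it through the shell.
simulate_one <- function(i, dry_run = TRUE) {
  log_file <- paste0(output_prefixes[i], ".log")  # assumed log file naming
  cmd <- paste(
    "art_illumina", "-ss", "HS25",
    "-i", shQuote(fasta_inputs[i]),
    "-l", 100, "-f", 30,
    "-o", shQuote(output_prefixes[i]),
    ">", shQuote(log_file), "2>&1"               # send stdout and stderr to the log
  )
  if (!dry_run) system(cmd)
  cmd
}

# mclapply() relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
cmds <- mclapply(seq_along(fasta_inputs), simulate_one, mc.cores = n_cores)
```

Iterating over indices rather than over the paths themselves keeps each input paired with its own output prefix.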

For paired-end reads (paired = TRUE), the function sets a mean fragment length of 200 bp with a standard deviation of 10 bp. For single-end reads (paired = FALSE), these fragment parameters are omitted.
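
On the ART command line, these settings correspond to the -p (paired), -m (mean fragment length) and -s (standard deviation) options. The helper below is a hypothetical sketch of how the layout-specific part of the command might be assembled. The name art_layout_args() and its parameters are illustrative and are not taken from the package source.

```r
# Hypothetical helper: layout-specific ART arguments.
# Paired-end runs add -p plus the fragment size distribution;
# single-end runs add nothing.
art_layout_args <- function(paired = TRUE, mean_frag = 200, sd_frag = 10) {
  if (isTRUE(paired)) {
    paste("-p", "-m", mean_frag, "-s", sd_frag)
  } else {
    ""
  }
}

art_layout_args(paired = TRUE)
# "-p -m 200 -s 10"
art_layout_args(paired = FALSE)
# ""
```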

ART must be installed, and the executable named in artPath must either be on the system PATH or be given by its full path.
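
Checking for the executable before starting a long run gives an early, readable failure. The sketch below uses base R's Sys.which(), which returns an empty string when a program cannot be found. The wrapper name art_available() is hypothetical and not part of this package.

```r
# Hypothetical pre-flight check: TRUE if the ART executable can be located,
# either by name on the PATH or by its full path.
art_available <- function(artPath = "art_illumina") {
  unname(nzchar(Sys.which(artPath)))
}

if (!art_available("art_illumina")) {
  message("art_illumina was not found; install ART or pass its full path via artPath.")
}
```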

Examples

if (FALSE) { # \dontrun{
# Simulate single-end reads for 3 genomes
simulate_read_dynamic_sc(
  fasta_inputs = c("data/genome1.fa", "data/genome2.fa", "data/genome3.fa"),
  output_prefixes = c("results/sim1", "results/sim2", "results/sim3"),
  readLen = 150,
  depth = 30,
  artPath = "/usr/local/bin/art_illumina",
  seqSys = "HS25",
  paired = FALSE,
  numCores = 3
)

# Simulate paired-end reads with custom ART arguments
simulate_read_dynamic_sc(
  fasta_inputs = c("data/genome1.fa", "data/genome2.fa"),
  output_prefixes = c("results/paired1", "results/paired2"),
  paired = TRUE,
  otherArgs = "--noALN --rndSeed 42"
)
} # }