R/read_sim.R
simulate_read_dynamic_sc.Rd
Simulates next-generation sequencing reads from input FASTA files in parallel using the ART sequencing simulator (Huang et al., Bioinformatics 2012). Both single-end and paired-end read generation are supported.
simulate_read_dynamic_sc(
  fasta_inputs,
  output_prefixes,
  readLen = 100,
  depth = 30,
  artPath = "art_illumina",
  seqSys = "HS25",
  paired = TRUE,
  numCores = 2,
  otherArgs = ""
)
fasta_inputs
Character vector of paths to input FASTA files
output_prefixes
Character vector of output prefixes for generated read files
readLen
Integer specifying the length of simulated reads in base pairs (default: 100)
depth
Numeric value specifying the desired sequencing depth/coverage (default: 30)
artPath
Character string specifying the path to the ART executable (default: "art_illumina")
seqSys
Character string specifying the sequencing system to simulate (default: "HS25" for HiSeq 2500)
paired
Logical indicating whether to generate paired-end reads (default: TRUE)
numCores
Integer specifying the number of CPU cores to use for parallel processing (default: 2)
otherArgs
Character string with additional arguments to pass to ART (default: "")
None. The function generates sequencing read files at the locations specified by output_prefixes and prints completion messages.
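Because nothing is returned, a caller who wants to confirm that a run produced output has to look for the files on disk. The sketch below assumes ART's usual naming convention (`<prefix>.fq` for single-end runs, `<prefix>1.fq` and `<prefix>2.fq` for paired-end runs) and assumes this function does not rename them; `expected_fastq()` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of the package): list the FASTQ files that
# ART is expected to write for each output prefix.
expected_fastq <- function(output_prefixes, paired = TRUE) {
  if (paired) {
    # Paired-end runs: <prefix>1.fq and <prefix>2.fq, interleaved per prefix
    as.vector(rbind(paste0(output_prefixes, "1.fq"),
                    paste0(output_prefixes, "2.fq")))
  } else {
    # Single-end runs: <prefix>.fq
    paste0(output_prefixes, ".fq")
  }
}

# Report any expected files that are absent after a run
fq <- expected_fastq(c("results/sim1", "results/sim2"), paired = FALSE)
missing_files <- fq[!file.exists(fq)]
```

An empty `missing_files` vector suggests every simulation wrote its reads; the log files remain the place to look when it is not empty.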
This function provides a parallel interface to the ART sequencing simulator by:
1. Defining an internal function simulateOne() that constructs and executes the ART command for a single FASTA input
2. Using mclapply() to run multiple simulations in parallel
3. Redirecting ART's output to log files
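The dispatch pattern described above can be sketched roughly as follows. This is not the package's implementation: `run_logged()` and `run_all()` are hypothetical names, the use of `system2()` for redirection is an assumption, and only standard output is sent to the log here.

```r
library(parallel)

# Hypothetical helper: run one command and write its standard output to a log file.
run_logged <- function(cmd, args, log_file) {
  system2(cmd, args = args, stdout = log_file)
}

# Hypothetical driver: one job per element of arg_list, run in parallel.
run_all <- function(cmd, arg_list, log_files, numCores = 2) {
  stopifnot(length(arg_list) == length(log_files))
  # mclapply() relies on forking, so only one core can be used on Windows
  cores <- if (.Platform$OS.type == "windows") 1L else numCores
  mclapply(seq_along(arg_list), function(i) {
    run_logged(cmd, arg_list[[i]], log_files[i])
  }, mc.cores = cores)
}
```

Each element of the returned list is the exit status of the corresponding command, which gives a quick way to spot failed jobs before opening the logs.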
For paired-end reads (paired = TRUE), the function sets a mean fragment length of 200 bp with a standard deviation of 10 bp. For single-end reads, these two parameters are omitted.
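To make the paired-end versus single-end difference concrete, here is a minimal sketch of how the argument vector for one input might be assembled. `build_art_args()` is a hypothetical helper; the flags used (`-ss`, `-i`, `-l`, `-f`, `-o`, `-p`, `-m`, `-s`) are standard `art_illumina` options, but the exact command the package builds may differ.

```r
# Hypothetical helper: assemble art_illumina arguments for a single FASTA input.
build_art_args <- function(fasta, prefix, readLen = 100, depth = 30,
                           seqSys = "HS25", paired = TRUE, otherArgs = "") {
  args <- c("-ss", seqSys,   # sequencing system profile
            "-i", fasta,     # input reference FASTA
            "-l", readLen,   # read length in bp
            "-f", depth,     # fold coverage
            "-o", prefix)    # output file prefix
  if (paired) {
    # Paired-end only: mean fragment length 200 bp, standard deviation 10 bp
    args <- c(args, "-p", "-m", 200, "-s", 10)
  }
  if (nzchar(otherArgs)) {
    args <- c(args, otherArgs)  # extra ART options passed through unchanged
  }
  args
}

pe <- build_art_args("data/genome1.fa", "results/paired1")
se <- build_art_args("data/genome1.fa", "results/sim1", paired = FALSE)
```

In this sketch `pe` carries the `-p -m 200 -s 10` block and `se` does not. Paths containing spaces would additionally need `shQuote()` before being handed to a shell.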
The function requires that the ART executable is installed and reachable through artPath, either as a full path or as a command name found on the system PATH.
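A quick pre-flight check can save a batch of failed jobs. The sketch below is one way to test whether the executable can be found before starting a run; `art_available()` is a hypothetical helper and is not provided by the package.

```r
# Hypothetical helper: TRUE if artPath resolves to something runnable.
art_available <- function(artPath = "art_illumina") {
  on_path <- nzchar(unname(Sys.which(artPath)))     # found via PATH lookup
  is_exec <- file.exists(artPath) &&
    unname(file.access(artPath, mode = 1)) == 0     # exists with execute permission
  isTRUE(on_path || is_exec)
}

if (!art_available("art_illumina")) {
  message("ART not found; install it or pass its full path via artPath.")
}
```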
if (FALSE) { # \dontrun{
  # Simulate single-end reads for 3 genomes
  simulate_read_dynamic_sc(
    fasta_inputs = c("data/genome1.fa", "data/genome2.fa", "data/genome3.fa"),
    output_prefixes = c("results/sim1", "results/sim2", "results/sim3"),
    readLen = 150,
    depth = 30,
    artPath = "/usr/local/bin/art_illumina",
    seqSys = "HS25",
    paired = FALSE,
    numCores = 3
  )
  # Simulate paired-end reads with custom ART arguments
  simulate_read_dynamic_sc(
    fasta_inputs = c("data/genome1.fa", "data/genome2.fa"),
    output_prefixes = c("results/paired1", "results/paired2"),
    paired = TRUE,
    otherArgs = "--noALN --rndSeed 42"
  )
} # }