R/read_sim.R
simulate_sc_dynamic_reads_for_batches.Rd
Simulates sequencing reads for multiple single cells in batches by synthesizing cell-specific genomes with mutations and generating sequencing reads using the ART simulator. This function processes cells in batches to manage memory usage.
simulate_sc_dynamic_reads_for_batches(
sampled_cell_idx,
dynamics_ob,
sc_founder_genomes,
all_sampled_sc_mutations,
sc_mut_table_with_nt,
sim_updates,
fa_dir,
output_dir,
art_path,
n_cores = 4,
depth = 30,
readLen = 150,
otherArgs = "",
keep_genome_files = FALSE,
batch_size = 10
)
sampled_cell_idx
Vector of cell indices for which to simulate reads
dynamics_ob
List object returned by simulate_sc_dynamics containing cell lineage information
sc_founder_genomes
List containing genome information for clone founders with elements: all_node_genomes, child_node_founders_df
all_sampled_sc_mutations
List containing mutation information for sampled cells
sc_mut_table_with_nt
Data frame containing mutation details with nucleotide changes
sim_updates
List containing updated simulation information including all_node_segments
fa_dir
Character string specifying the directory to store FASTA files
output_dir
Character string specifying the directory for output read files
art_path
Character string specifying the path to the ART sequencing simulator executable
n_cores
Integer specifying the number of CPU cores to use for parallel processing (default: 4)
depth
Numeric value specifying the desired sequencing depth/coverage (default: 30)
readLen
Integer specifying the length of simulated reads in base pairs (default: 150)
otherArgs
Character string with additional arguments to pass to ART (default: "")
keep_genome_files
Logical indicating whether to retain the temporary FASTA genome files after simulation (default: FALSE)
batch_size
Integer specifying the number of cells to process in each batch (default: 10)
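As a rough guide to how depth and readLen interact, the usual coverage relation is depth = number of reads x read length / genome length. The sketch below applies that relation in base R. The genome_length value is a hypothetical figure chosen for illustration, and how this function passes depth to ART internally is not described in this documentation.

```r
# Hypothetical genome size in base pairs (an assumption for illustration only).
genome_length <- 3e6
depth <- 30      # desired fold coverage
readLen <- 150   # read length in base pairs

# Standard coverage relation: depth = n_reads * readLen / genome_length,
# solved for the number of reads and rounded up.
n_reads <- ceiling(depth * genome_length / readLen)
n_reads
```

Doubling depth or halving readLen doubles the number of reads, which is worth keeping in mind when choosing batch_size and n_cores for large genomes.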
A list where each element contains mutation check results for a cell, with the cell index as the name of each element
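Because the elements are named by cell index, they are best looked up by name rather than by position. The sketch below uses a mock list with placeholder contents; the real structure of each mutation check result is not specified here.

```r
# Mock return value: a list named by cell index (contents are placeholders).
mutation_checks <- list("50" = "check for cell 50", "51" = "check for cell 51")

cell_id <- 50
# Index by name (character), not by position: position 50 does not exist here.
result <- mutation_checks[[as.character(cell_id)]]

# Recover the cell indices that have a result from the element names.
checked_cells <- as.integer(names(mutation_checks))
```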
This function performs the following steps for each batch of cells:
1. For each cell in the batch:
   - Extracts cell lineage information from dynamics_ob
   - Synthesizes a cell-specific genome with mutations using synth_sc_genome()
   - Performs checks on the generated genome using check_genome_mutations()
   - Validates the sanity check results with validate_mutation_check()
   - Combines maternal and paternal haplotypes into a single FASTA file
   - Writes the merged genome to disk
2. Simulates sequencing reads for all cells in the batch in parallel using simulate_read_dynamic_sc()
3. Optionally removes temporary genome files to save disk space
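The haplotype-merging step can be pictured with a minimal base R sketch. It assumes each haplotype is a named character vector of sequences and that records are distinguished by a name suffix; both are hypothetical choices, since the genome representation and FASTA header naming used by synth_sc_genome() are not documented here.

```r
# Toy haplotypes as named character vectors (hypothetical structure).
maternal <- c(chr1 = "ACGTACGT", chr2 = "GGCC")
paternal <- c(chr1 = "ACGAACGT", chr2 = "GGCT")

# Build FASTA records: a ">" header line followed by its sequence line.
# The suffix keeps the two copies of each chromosome distinguishable.
to_fasta <- function(seqs, suffix) {
  headers <- paste0(">", names(seqs), "_", suffix)
  as.vector(rbind(headers, unname(seqs)))  # interleave header, sequence
}

fasta_lines <- c(to_fasta(maternal, "maternal"), to_fasta(paternal, "paternal"))

# Write the merged genome to a temporary file and read it back.
fa_file <- tempfile(fileext = ".fa")
writeLines(fasta_lines, fa_file)
readLines(fa_file)
```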
Processing cells in batches helps manage memory usage when dealing with large numbers of cells. The function relies on external functions like synth_sc_genome(), check_genome_mutations(), validate_mutation_check(), and simulate_read_dynamic_sc().
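The batching idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the per-cell work is replaced by a placeholder, and only the splitting of indices and the naming of results by cell index are shown.

```r
sampled_cell_idx <- c(50, 51, 52, 53, 54)
batch_size <- 2

# Split the indices into consecutive batches of at most batch_size cells.
batches <- split(sampled_cell_idx,
                 ceiling(seq_along(sampled_cell_idx) / batch_size))

results <- list()
for (batch in batches) {
  # Placeholder for the per-batch work (genome synthesis, read simulation,
  # optional cleanup); only one batch is held in memory at a time.
  batch_results <- lapply(batch, function(cell) paste("checked cell", cell))
  names(batch_results) <- as.character(batch)
  results <- c(results, batch_results)
}
```

With five cells and batch_size = 2, this yields three batches of sizes 2, 2, and 1. A smaller batch_size lowers peak memory and disk use at the cost of more sequential rounds.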
if (FALSE) { # \dontrun{
  # Assuming you have already run a simulation and have necessary objects
  mutation_checks <- simulate_sc_dynamic_reads_for_batches(
    sampled_cell_idx = c(50, 51, 52, 53, 54),
    dynamics_ob = sc_dynamics_result,
    sc_founder_genomes = founder_genomes,
    all_sampled_sc_mutations = sampled_mutations,
    sc_mut_table_with_nt = mutations_with_nt,
    sim_updates = simulation_updates,
    fa_dir = "data/genomes/",
    output_dir = "results/reads/",
    art_path = "/usr/local/bin/art_illumina",
    n_cores = 4,
    depth = 30,
    readLen = 150,
    keep_genome_files = FALSE,
    batch_size = 2
  )
  # Check results for a specific cell
  print(mutation_checks[["50"]])
} # }