Simulate Single Nucleotide Variants for Single Cell Data

Simulates nucleotide-level details for single nucleotide variants (SNVs) in single-cell mutation data. Both regular and recurrent mutations are handled, and recurrent mutations at the same site receive nucleotide changes that are consistent with one another.

sim_snv_nt_sc(
  genome_sequence,
  seg_list,
  mutation_table,
  recurrent_mutation_tracker = NA,
  nt_transition_matrix
)

Arguments

genome_sequence: List of DNA sequences representing the reference genome
seg_list: Data frame or list of genomic segment information that maps chromosome coordinates to segment identifiers
mutation_table: Data frame containing mutation information with columns: clone, cell_index, haplotype, chrom, pos, time
recurrent_mutation_tracker: List structure tracking recurrent mutations, organized by genomic location and mutation sets (default is NA for no recurrent mutations)
nt_transition_matrix: Matrix specifying nucleotide transition probabilities for different mutation types
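
As a rough illustration of the two tabular inputs, the sketch below builds a small mutation_table with the documented columns and a 4 x 4 nt_transition_matrix. The column types, the haplotype labels, and the matrix layout (rows as original nucleotide, columns as alternative nucleotide, each row summing to 1) are assumptions made for this example; the layout the function actually expects may differ.

```r
# Hypothetical inputs: column types, labels, and matrix layout are assumptions.
nts <- c("A", "C", "G", "T")

# Assumed layout: rows = original nucleotide, columns = alternative nucleotide.
# The diagonal is zero so that a base never "mutates" to itself.
nt_transition_matrix <- matrix(
  c(0,   0.2, 0.6, 0.2,
    0.2, 0,   0.2, 0.6,
    0.6, 0.2, 0,   0.2,
    0.2, 0.6, 0.2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(nts, nts)
)

# One row per mutation, using the column names listed under Arguments.
mutation_table <- data.frame(
  clone      = c("clone1", "clone1", "clone2"),
  cell_index = c(1L, 2L, 1L),
  haplotype  = c("maternal", "paternal", "maternal"),
  chrom      = c("chr1", "chr1", "chr2"),
  pos        = c(1200L, 5400L, 300L),
  time       = c(0.10, 0.35, 0.80),
  stringsAsFactors = FALSE
)
```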

Value

An extended mutation_table data frame with additional columns:

seg_id: Segment identifier for the mutation
ref_pos: Reference position within the segment
original_nt: Original nucleotide at the mutation site
alternative_nt: Mutated nucleotide
processed: Logical flag indicating whether the mutation has been processed
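
The shape of the returned table can be pictured with the sketch below, which appends the five documented columns to a small input table. The placeholder values and column types (for example, a character seg_id) are assumptions for illustration, not taken from the function's source.

```r
# Minimal input table with the documented columns (values are made up).
mutation_table <- data.frame(
  clone = "clone1", cell_index = 1:2, haplotype = "maternal",
  chrom = "chr1", pos = c(1200L, 5400L), time = c(0.10, 0.35),
  stringsAsFactors = FALSE
)

# Append the five output columns; types here are assumed, not confirmed.
extended <- mutation_table
extended$seg_id         <- NA_character_  # segment identifier
extended$ref_pos        <- NA_integer_    # position within the segment
extended$original_nt    <- NA_character_  # base before the mutation
extended$alternative_nt <- NA_character_  # base after the mutation
extended$processed      <- FALSE          # nothing handled yet

str(extended)
```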

Details

This function simulates the nucleotide-level details of mutations by:

Extending the mutation table with columns for segment ID, reference position, original nucleotide, alternative nucleotide, and processing status
Processing recurrent mutations first, so that nucleotide changes stay consistent within each mutation set (later mutations in a set build on earlier ones)
Processing regular (non-recurrent) mutations

For recurrent mutations, the function ensures that:

The first mutation in a set is processed normally
Subsequent mutations in the set use the alternative nucleotide from the previous mutation as their original nucleotide
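
The chaining rule for a recurrent set can be sketched as follows. Both sample_alt_nt() and chain_recurrent_set() are hypothetical names written for this example, and the uniform transition matrix is an assumed layout (rows as original nucleotide, columns as alternative); the package's own logic may be organized differently.

```r
set.seed(1)
nts <- c("A", "C", "G", "T")

# Assumed matrix layout: equal probability for each of the three other bases.
trans <- matrix(1 / 3, 4, 4, dimnames = list(nts, nts))
diag(trans) <- 0

# Hypothetical stand-in for drawing one alternative nucleotide.
sample_alt_nt <- function(original_nt, trans) {
  sample(colnames(trans), size = 1, prob = trans[original_nt, ])
}

# Hypothetical helper: each mutation in the set starts from the
# alternative nucleotide produced by the previous mutation.
chain_recurrent_set <- function(start_nt, n_mutations, trans) {
  original    <- character(n_mutations)
  alternative <- character(n_mutations)
  current <- start_nt
  for (i in seq_len(n_mutations)) {
    original[i]    <- current
    alternative[i] <- sample_alt_nt(current, trans)
    current        <- alternative[i]
  }
  data.frame(order = seq_len(n_mutations),
             original_nt = original,
             alternative_nt = alternative,
             stringsAsFactors = FALSE)
}

chain <- chain_recurrent_set("A", 3, trans)
print(chain)
```

In the resulting table, the original_nt of row 2 equals the alternative_nt of row 1, and likewise for row 3, which is the property described above.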

The function relies on helper functions:

find_identical_row_index(): To locate specific mutations in the table
simulate_single_nt_change(): To determine nucleotide changes for individual mutations
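
The signatures of these helpers are not documented on this page. To give a sense of what a row lookup of this kind involves, the sketch below defines find_row_index_sketch(), a hypothetical stand-in that returns the indices of rows matching a target record on every listed column. It is not the package's find_identical_row_index().

```r
# Hypothetical stand-in, not the package's find_identical_row_index().
# Returns indices of rows in `tbl` that equal `target` on every column in `cols`.
find_row_index_sketch <- function(tbl, target, cols = names(target)) {
  matches <- rep(TRUE, nrow(tbl))
  for (col in cols) {
    matches <- matches & tbl[[col]] == target[[col]]
  }
  which(matches)  # which() drops NA comparisons
}

tbl <- data.frame(
  clone      = c("clone1", "clone1", "clone2"),
  cell_index = c(1L, 2L, 1L),
  chrom      = c("chr1", "chr1", "chr2"),
  pos        = c(1200L, 5400L, 300L),
  stringsAsFactors = FALSE
)

target <- list(clone = "clone1", cell_index = 2L, chrom = "chr1", pos = 5400L)
idx <- find_row_index_sketch(tbl, target)
```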
