Extracts and organizes mutation information for specified sampled cells, including mutations in their ancestral lineage, and identifies recurrent mutations.

get_mutations_sc(cell_info, mutation_info, sampled_cell_idx = NULL)

Arguments

cell_info

Data frame containing cell lineage information with columns: clone, parent, birth_time, death_time, cell_index

mutation_info

Data frame containing mutation information with columns: clone, cell_index, haplotype, chrom, pos, time

sampled_cell_idx

Vector of cell indices (values of cell_index in cell_info) for which to retrieve mutation information. Defaults to NULL.

Value

A list containing three elements:

  • sampled_sc_mutations: List with one element per sampled cell; each element is a data frame of the mutations carried by that cell and its ancestors

  • sampled_mutation_table: Data frame containing all unique mutations across all sampled cells and their ancestors

  • recurrent_mutation_tracker: List tracking recurrent mutations found across multiple cells or lineages

Details

This function reconstructs the complete mutation profile for each sampled cell by:

  1. Identifying all ancestral cells for each sampled cell using get_sc_ancestors()

  2. Collecting mutations from the sampled cell and all its ancestors

  3. Identifying recurrent mutations (mutations that occur multiple times independently)

  4. Grouping and processing recurrent mutations for tracking

  5. Removing duplicate mutations from the final table
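
Steps 1, 2, and 5 above can be illustrated with a minimal base R sketch. This is not the package's implementation: trace_ancestors() is a hypothetical stand-in for get_sc_ancestors(), and it assumes that each value in the parent column is the cell_index of the parent cell, with NA marking the root.

```r
# Hypothetical helper (an assumption, not the package's get_sc_ancestors()):
# follow the 'parent' column from a cell up to the root.
trace_ancestors <- function(cell_info, idx) {
  ancestors <- integer(0)
  parent <- cell_info$parent[match(idx, cell_info$cell_index)]
  while (!is.na(parent)) {
    ancestors <- c(ancestors, parent)
    parent <- cell_info$parent[match(parent, cell_info$cell_index)]
  }
  ancestors
}

# Toy inputs with the same shape as the Examples section
cell_info <- data.frame(
  parent = c(NA, 1, 1, 2, 3),
  cell_index = 1:5
)
mutation_info <- data.frame(
  cell_index = c(1, 2, 3, 4),
  haplotype = c("hap1", "hap1", "hap2", "hap1"),
  chrom = c("chr1", "chr1", "chr2", "chr1"),
  pos = c(1000, 2000, 1500, 1000),
  time = c(5, 12, 15, 18)
)

# Steps 1-2: for each sampled cell, keep mutations that arose in the
# cell itself or in any of its ancestors
sampled <- c(4, 5)
per_cell <- lapply(sampled, function(idx) {
  lineage <- c(idx, trace_ancestors(cell_info, idx))
  mutation_info[mutation_info$cell_index %in% lineage, ]
})
names(per_cell) <- sampled

# Step 5: pool the per-cell tables and drop rows shared through
# common ancestors
all_mutations <- unique(do.call(rbind, per_cell))
```

In this toy lineage, cells 4 and 5 share cell 1 as an ancestor, so the mutation that arose in cell 1 appears in both per-cell tables but only once in the pooled table.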

The function requires the following helper functions to be defined:

  • get_sc_ancestors(): To identify ancestral cells

  • identify_recurrent_mutations(): To find mutations that appear multiple times

  • group_recurrent_mutations(): To group similar recurrent mutations

  • process_recurrent_mutation(): To track and analyze recurrent mutations
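
As a rough illustration of what finding recurrent mutations can involve, the sketch below flags genomic sites that are hit more than once. It is an assumption about the general idea only, not the logic of identify_recurrent_mutations(); in particular, treating the combination of haplotype, chrom, and pos as the site identity is a choice made for this example.

```r
# Toy mutation table with the same columns as mutation_info
mutation_info <- data.frame(
  clone = c("A", "A", "B", "A"),
  cell_index = c(1, 2, 3, 4),
  haplotype = c("hap1", "hap1", "hap2", "hap1"),
  chrom = c("chr1", "chr1", "chr2", "chr1"),
  pos = c(1000, 2000, 1500, 1000),
  time = c(5, 12, 15, 18)
)

# Assumed site identity: haplotype + chromosome + position
site_key <- paste(mutation_info$haplotype,
                  mutation_info$chrom,
                  mutation_info$pos,
                  sep = ":")

# Count how many mutation events hit each site
hits_per_site <- table(site_key)

# Sites hit more than once are candidates for recurrent mutations
recurrent_sites <- names(hits_per_site)[hits_per_site > 1]
recurrent <- mutation_info[site_key %in% recurrent_sites, ]
```

Here the site hap1:chr1:1000 is mutated in cell 1 at time 5 and again in cell 4 at time 18, so both rows end up in recurrent.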

Examples

# Create sample cell_info and mutation_info data frames
cell_info <- data.frame(
  clone = c("A", "A", "B", "A", "B"),
  parent = c(NA, 1, 1, 2, 3),
  birth_time = c(0, 10, 10, 15, 20),
  death_time = c(10, NA, NA, NA, NA),
  cell_index = 1:5
)

mutation_info <- data.frame(
  clone = c("A", "A", "B", "A"),
  cell_index = c(1, 2, 3, 4),
  haplotype = c("hap1", "hap1", "hap2", "hap1"),
  chrom = c("chr1", "chr1", "chr2", "chr1"),
  pos = c(1000, 2000, 1500, 1000),
  time = c(5, 12, 15, 18)
)

# Get mutations for cells 4 and 5
mutations <- get_mutations_sc(cell_info, mutation_info, c(4, 5))

# View mutation summary
print(mutations$sampled_mutation_table)