Checks that genomic regions marked as lost (CN_change = -1) contain only 'N' nucleotides in the simulated genome sequence. A lost segment that still contains other nucleotides indicates that the deletion event was not applied correctly.

check_loss_segments(clone_genome, clone_segments, verbose = TRUE)

Arguments

clone_genome

A nested list containing genome sequences:

  • First level: haplotypes (maternal/paternal)

  • Second level: chromosomes with nucleotide sequences

clone_segments

A nested list containing segment information:

  • First level: haplotypes (maternal/paternal)

  • Second level: data frame with columns:

    • chrom - Chromosome name

    • ori_start - Original start position

    • ori_end - Original end position

    • CN_change - Copy number change (-1 for losses)

    • seg_id - Segment identifier
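
The input structures described above can be illustrated with a small toy example. This is only a sketch: the haplotype names, chromosome name, sequences, and segment IDs are made up, and plain character strings stand in for whatever sequence class the real genome objects use (for example Biostrings objects).

```r
# Hypothetical toy inputs; all names and values are illustrative only.
# Assumption: sequences are plain character strings here, which may differ
# from the class used by the real simulated genome.
clone_genome <- list(
  maternal = list(chr1 = "ACGTNNNNACGT"),  # positions 5-8 masked with 'N'
  paternal = list(chr1 = "ACGTACGTACGT")   # no lost region
)

clone_segments <- list(
  maternal = data.frame(
    chrom     = c("chr1", "chr1", "chr1"),
    ori_start = c(1L, 5L, 9L),
    ori_end   = c(4L, 8L, 12L),
    CN_change = c(0L, -1L, 0L),            # -1 marks the lost segment
    seg_id    = c("seg1", "seg2", "seg3"),
    stringsAsFactors = FALSE
  ),
  paternal = data.frame(
    chrom     = "chr1",
    ori_start = 1L,
    ori_end   = 12L,
    CN_change = 0L,
    seg_id    = "seg1",
    stringsAsFactors = FALSE
  )
)

# Both lists share the same first-level haplotype names.
str(clone_segments$maternal)
```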

verbose

Logical; defaults to TRUE. Presumably controls whether status messages are printed, such as the notice when a haplotype has no loss segments.

Value

A list where:

  • Names are in the format "haplotype_segment_id"

  • Values are nucleotide frequency counts (from Biostrings::alphabetFrequency)

  • The list is empty if no loss segments are found
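
Since the function returns counts rather than a pass/fail flag, a small helper can turn the result into an explicit check. The sketch below uses a hand-written result of the documented shape; `loss_check`, its element names, and `is_fully_masked()` are hypothetical and not part of the package.

```r
# Hypothetical result, shaped like the documented return value:
# one named vector of nucleotide counts per lost segment.
loss_check <- list(
  maternal_seg2 = c(A = 0L, C = 0L, G = 0L, T = 0L, N = 4L),
  paternal_seg7 = c(A = 1L, C = 0L, G = 0L, T = 0L, N = 3L)
)

# Hypothetical helper: TRUE when every counted letter in the segment is 'N'.
is_fully_masked <- function(freq) {
  sum(freq) > 0 && sum(freq[names(freq) != "N"]) == 0
}

masked <- vapply(loss_check, is_fully_masked, logical(1))

# Segments that still contain non-'N' nucleotides
names(masked)[!masked]
```

Here only `paternal_seg7` is reported, because it still contains one 'A'.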

Details

The function:

  1. Identifies segments with CN_change = -1

  2. For each lost segment:

    • Extracts the sequence

    • Counts nucleotide frequencies

    • Returns the counts, which should show only 'N' nucleotides if the loss was implemented correctly

  3. Prints a message if no loss segments are found in a haplotype
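
The steps above can be sketched in base R as follows. This is not the package's implementation: `check_loss_sketch()` and `count_bases()` are hypothetical names, `substr()` and a manual count stand in for Biostrings subsetting and `Biostrings::alphabetFrequency`, and the sketch assumes that sequences are plain character strings, that ori_start and ori_end index directly into the simulated chromosome, and that `verbose` gates the message.

```r
# Count A/C/G/T/N in a character string (stand-in for alphabetFrequency;
# letters outside this alphabet are ignored in this sketch).
count_bases <- function(seq, alphabet = c("A", "C", "G", "T", "N")) {
  chars <- strsplit(seq, "", fixed = TRUE)[[1]]
  vapply(alphabet, function(b) sum(chars == b), integer(1))
}

check_loss_sketch <- function(clone_genome, clone_segments, verbose = TRUE) {
  result <- list()
  for (hap in names(clone_segments)) {
    segs <- clone_segments[[hap]]
    # Step 1: identify segments with CN_change = -1 (which() drops NAs)
    lost <- segs[which(segs$CN_change == -1), , drop = FALSE]
    if (nrow(lost) == 0) {
      # Step 3: report haplotypes without loss segments (message text is made up)
      if (verbose) message("No loss segments found in haplotype: ", hap)
      next
    }
    # Step 2: extract each lost segment and count its nucleotides
    for (i in seq_len(nrow(lost))) {
      chrom_seq <- clone_genome[[hap]][[as.character(lost$chrom[i])]]
      seg_seq <- substr(chrom_seq, lost$ori_start[i], lost$ori_end[i])
      result[[paste(hap, lost$seg_id[i], sep = "_")]] <- count_bases(seg_seq)
    }
  }
  result
}

# Toy data: one masked segment on the maternal haplotype, none on the paternal.
genome <- list(maternal = list(chr1 = "ACGTNNNNACGT"),
               paternal = list(chr1 = "ACGTACGTACGT"))
segments <- list(
  maternal = data.frame(chrom = "chr1", ori_start = 5L, ori_end = 8L,
                        CN_change = -1L, seg_id = "seg2",
                        stringsAsFactors = FALSE),
  paternal = data.frame(chrom = "chr1", ori_start = 1L, ori_end = 12L,
                        CN_change = 0L, seg_id = "seg1",
                        stringsAsFactors = FALSE)
)

res <- check_loss_sketch(genome, segments)
res
```

With this toy data the result contains a single element, `maternal_seg2`, whose counts are all zero except for four 'N's, and a message is printed for the paternal haplotype.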

Examples

if (FALSE) { # \dontrun{
# Check lost segments in a clone
loss_check <- check_loss_segments(
  clone_genome = synthesized_genome,
  clone_segments = segment_info
)
# Results should show only 'N' nucleotides in lost regions
} # }